base
optimus_dl.modules.data.datasets.strategies.base
BaseStrategy
Bases: ABC
Base class for dataset sampling strategies.
Source code in optimus_dl/modules/data/datasets/strategies/base.py
get_state()
abstractmethod
Get state for checkpointing.
Returns:
| Type | Description |
|---|---|
| `dict[str, Any]` | Dictionary containing the current state. |
initialize(doc_lengths)
next_sample()
abstractmethod
Return the next sample.
Returns:
| Type | Description |
|---|---|
| `list[tuple[int, tuple[int, int]]]` | A list of segments required to construct the sample. Each segment is a tuple: `(doc_id, (start_offset, end_offset))`. |
Raises:
| Type | Description |
|---|---|
| `StopIteration` | When the strategy is exhausted. |
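
As a rough illustration of the segment format, the sketch below assembles one sample from a list of `(doc_id, (start_offset, end_offset))` segments. The helper `assemble_sample`, the `docs` token lists, and the choice to treat `end_offset` as exclusive (like a Python slice) are assumptions made for this example; the reference above does not state the offset convention.

```python
from typing import Sequence

# One segment, in the documented shape: (doc_id, (start_offset, end_offset)).
Segment = tuple[int, tuple[int, int]]


def assemble_sample(
    segments: list[Segment], docs: Sequence[Sequence[int]]
) -> list[int]:
    """Concatenate the token ranges named by `segments` (hypothetical helper)."""
    tokens: list[int] = []
    for doc_id, (start, end) in segments:
        # Assumption: end_offset is exclusive, matching Python slicing.
        tokens.extend(docs[doc_id][start:end])
    return tokens


# Toy corpus: three tokenized documents.
docs = [[10, 11, 12, 13], [20, 21], [30, 31, 32]]

# A sample stitched from the tail of doc 0, all of doc 1, and the head of doc 2.
segments = [(0, (2, 4)), (1, (0, 2)), (2, (0, 1))]
sample = assemble_sample(segments, docs)
```

A single sample can span several documents, which is why `next_sample()` returns a list of segments rather than one range.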
reset(initial_state=None)
abstractmethod
Reset state to initial or checkpointed state.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `initial_state` | `dict[str, Any] \| None` | State dictionary to restore from (optional). | `None` |