checkpoint
optimus_dl.recipe.pretokenize.checkpoint
Manages saving and loading of data preparation checkpoints.
CheckpointManager
Handles loading and saving of checkpoints, with saves performed atomically.
Source code in optimus_dl/recipe/pretokenize/checkpoint.py
clean()
load()
Loads the processing state from disk if a checkpoint exists.
Returns:
| Type | Description |
|---|---|
| CheckpointState \| None | The loaded CheckpointState, or None if no valid checkpoint is found. |
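The "return None when no valid checkpoint is found" contract can be illustrated with a minimal sketch. The source does not show the on-disk format or the real implementation, so the JSON encoding and the helper name `load_state` below are assumptions made for illustration only.

```python
from __future__ import annotations

import json
from pathlib import Path
from typing import Any


def load_state(path: Path) -> dict[str, Any] | None:
    """Hypothetical helper: return the saved state, or None if none is usable."""
    if not path.exists():
        return None  # first run: nothing to resume from
    try:
        with path.open("r", encoding="utf-8") as f:
            data = json.load(f)
    except (OSError, ValueError):
        # Unreadable or corrupt files are treated as "no checkpoint".
        # json.JSONDecodeError and UnicodeDecodeError are both ValueErrors.
        return None
    # Reject payloads that are not a mapping (assumed JSON object format).
    return data if isinstance(data, dict) else None
```

Returning None instead of raising lets the caller fall back to starting from scratch with a single `if state is None:` check.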
save(state)
Saves the current processing state to disk atomically.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| state | CheckpointState | The checkpoint state object to save. | required |
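One common way to make a save atomic is to write to a temporary file in the same directory and then rename it over the target. The sketch below shows that pattern. It is not taken from `checkpoint.py`: the function name `save_state_atomically`, the JSON encoding, and the plain-dict state are all assumptions.

```python
from __future__ import annotations

import json
import os
import tempfile
from pathlib import Path
from typing import Any


def save_state_atomically(state: dict[str, Any], path: Path) -> None:
    """Hypothetical helper: readers see either the old file or the new one."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # Create the temp file next to the target so the rename stays on
    # one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(state, f)
            f.flush()
            os.fsync(f.fileno())  # push bytes to disk before the rename
        os.replace(tmp_name, path)  # swap the finished file into place
    except BaseException:
        # Remove the partial temp file; the previous checkpoint is untouched.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
```

With this pattern, a crash in the middle of a save leaves at worst a stray `.tmp` file, and the previously saved checkpoint remains loadable.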
CheckpointState
dataclass
Represents the state to be saved in a checkpoint.
This provides a clear structure for what is being saved and loaded.
Attributes:
| Name | Type | Description |
|---|---|---|
| rng_state | dict[str, Any] | Random number generator state (from |
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| processor_state | dict[str, Any] | | required |
| sharder_state | dict[str, Any] | | required |
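As a rough illustration of the structure described above, the sketch below defines a stand-in dataclass with the three documented field names and round-trips it through a plain dict. The class name `CheckpointStateSketch`, the field order, the assumption that all three fields are required, and the example payloads are hypothetical; the real class may differ.

```python
from __future__ import annotations

from dataclasses import asdict, dataclass
from typing import Any


@dataclass
class CheckpointStateSketch:
    """Hypothetical stand-in mirroring the three documented fields."""

    processor_state: dict[str, Any]
    sharder_state: dict[str, Any]
    rng_state: dict[str, Any]


state = CheckpointStateSketch(
    processor_state={"files_done": 3},  # illustrative payloads only
    sharder_state={"shard_index": 1},
    rng_state={"seed": 0},
)

# Convert to a plain dict (e.g. for serialization) and rebuild from it.
payload = asdict(state)
restored = CheckpointStateSketch(**payload)
```

Keeping the state in a dataclass rather than a loose dict makes the set of saved fields explicit, which is the "clear structure" the description refers to.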