Closed
Description
To ensure that progress is not completely lost if an error occurs during a long-running data generation process, add a checkpointing system that saves intermediate state during data generation. This allows more granular progress tracking and recovery.
- Add capability to library -- Add data checkpointing capability #222
- Enable checkpointing in the CLI -- Adding checkpoint dir when batching enabled instructlab#1915
- Enable batching in the CLI (checkpointing is coupled with batching) -- Enable SDG batching with vLLM instructlab#1797
- File an issue to consider making use of `save_freq` in future -- checkpointing: consider allowing users to specify save frequency #225
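
For discussion, here is a minimal sketch of what batch-level checkpointing with resume could look like. It is not the library's actual API: `generate_with_checkpoints`, `generate_fn`, `checkpoint_dir`, and the `batch_NNNNN.jsonl` file layout are hypothetical names chosen for illustration, and the `save_freq` parameter is only an assumption about how the save frequency mentioned above might behave.

```python
import json
from pathlib import Path
from typing import Callable, Dict, Iterable, List, Tuple


def _load(path: Path) -> List[Dict]:
    """Read one checkpoint file (JSON Lines) back into a list of rows."""
    with path.open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


def _flush(pending: List[Tuple[Path, List[Dict]]]) -> None:
    """Write every pending batch to disk, then empty the pending list."""
    for path, rows in pending:
        tmp = path.with_suffix(".tmp")
        with tmp.open("w", encoding="utf-8") as fh:
            for row in rows:
                fh.write(json.dumps(row) + "\n")
        # Rename only after the write finished, so a crash mid-write
        # never leaves a truncated file that looks like a valid checkpoint.
        tmp.replace(path)
    pending.clear()


def generate_with_checkpoints(
    batches: Iterable[List],
    generate_fn: Callable[[List], List[Dict]],
    checkpoint_dir: str,
    save_freq: int = 1,  # hypothetical: write checkpoints every N batches
) -> List[Dict]:
    """Run generate_fn over batches, skipping batches already checkpointed."""
    ckpt_dir = Path(checkpoint_dir)
    ckpt_dir.mkdir(parents=True, exist_ok=True)

    results: List[Dict] = []
    pending: List[Tuple[Path, List[Dict]]] = []
    try:
        for index, batch in enumerate(batches):
            path = ckpt_dir / f"batch_{index:05d}.jsonl"
            if path.exists():
                # Recovery path: reuse the saved output instead of regenerating.
                results.extend(_load(path))
                continue
            rows = generate_fn(batch)
            results.extend(rows)
            pending.append((path, rows))
            if len(pending) >= save_freq:
                _flush(pending)
    finally:
        # Persist whatever completed, even if generate_fn raised.
        _flush(pending)
    return results
```

Re-running the same call with the same `checkpoint_dir` after a failure regenerates only the batches that have no checkpoint file yet. Because the sketch saves per batch, it also reflects why checkpointing is coupled with batching: the batch is the unit of progress that can be recovered.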