Add random seed support for reproducible workflow runs #246

@dtch1997

Problem

Currently, when running replications in a parameter sweep (e.g., with suffixes rep-1, rep-2, rep-3), there's no way to control the random seed to ensure reproducibility or to intentionally vary random initialization across replications.

Proposed Solution

Add support for specifying random seeds in workflow configurations, likely in the submit_training config:

submit_training=SubmitTrainingConfig(
    model="meta-llama/Llama-3.2-1B",
    hyperparameters={
        "seed": 42,  # or random_seed
        ...
    },
    ...
)

This would allow:

  1. Exact reproducibility when using the same seed
  2. Controlled variation across replications by sweeping over different seeds
  3. Better experimental control, since seed-driven variance can be separated from hyperparameter effects
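
To make the proposal concrete, here is a minimal sketch of how a training job might consume the "seed" hyperparameter once it receives it. The helper name seed_everything is hypothetical and not part of this project; the sketch assumes the job uses Python's random module and, if they are installed, NumPy and PyTorch.

```python
import random


def seed_everything(seed: int) -> None:
    """Seed common sources of randomness (hypothetical helper, not an existing API here)."""
    random.seed(seed)
    try:
        import numpy as np  # optional dependency
        np.random.seed(seed)
    except ImportError:
        pass
    try:
        import torch  # optional dependency
        torch.manual_seed(seed)
    except ImportError:
        pass


# Assumption: the job receives its hyperparameters as a plain dict.
hyperparameters = {"seed": 42}
seed_everything(hyperparameters["seed"])
first = random.random()

# Re-seeding with the same value replays the same random stream.
seed_everything(hyperparameters["seed"])
second = random.random()
```

Seeding alone may not give bit-identical results on GPUs, because some kernels are nondeterministic, so "exact reproducibility" likely needs framework-level determinism settings as well.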

Example Use Case

param_grid = {
    "submit_training.hyperparameters.learning_rate": [1e-4, 5e-5, 1e-5],
    "submit_training.hyperparameters.seed": [42, 43, 44],  # 3 seeds for replications
}
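
As a rough illustration of what this grid implies, the sketch below expands it into one override dict per run, assuming the sweep takes the Cartesian product of all listed values. The function expand_grid is hypothetical; the sweep implementation in this project may expand grids differently.

```python
from itertools import product


def expand_grid(param_grid: dict) -> list:
    """Return one {dotted_key: value} dict per combination (hypothetical helper)."""
    keys = list(param_grid)
    combos = product(*(param_grid[key] for key in keys))
    return [dict(zip(keys, values)) for values in combos]


param_grid = {
    "submit_training.hyperparameters.learning_rate": [1e-4, 5e-5, 1e-5],
    "submit_training.hyperparameters.seed": [42, 43, 44],
}

runs = expand_grid(param_grid)
for overrides in runs:
    print(overrides)
```

Under that assumption, 3 learning rates and 3 seeds yield 9 runs, with each learning rate replicated under the same three seeds.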

Labels: enhancement (New feature or request)
