Report number of batches in a spectrum dataset #60

bittremieux · 2024-07-24T07:39:52Z

Useful for timing estimates in the progress bar.

codecov · 2024-07-24T07:42:12Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 97.50%. Comparing base (486221c) to head (e188d80).

Additional details and impacted files

@@           Coverage Diff           @@
##             main      #60   +/-   ##
=======================================
  Coverage   97.49%   97.50%           
=======================================
  Files          24       24           
  Lines         957      960    +3     
=======================================
+ Hits          933      936    +3     
  Misses         24       24


bittremieux · 2024-07-24T08:27:26Z

Hmm, using this a bit more, it actually gives the following warning:

WARNING: UserWarning: Your IterableDataset has __len__ defined. In combination with multi-process data loading (when num_workers > 1), __len__ could be inaccurate if each worker is not configured independently to avoid having duplicate data.

So maybe not an ideal tweak in the end. Is the SpectrumDataset compatible with getting accurate timing estimates from the PyTorch Lightning progress bar (for which the number of batches is needed)?
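To make the warning concrete, here is a minimal sketch of the batch arithmetic, not code from this PR. It is plain Python and does not call PyTorch. The function names are hypothetical, and the round-robin sharding scheme is an assumption chosen for illustration. It relies on the fact that, for an iterable-style dataset, each DataLoader worker assembles its own batches, so the true number of batches depends on how the items are split across workers.

```python
import math


def naive_num_batches(n_items: int, batch_size: int) -> int:
    """Batch count derived from the dataset length alone (drop_last=False)."""
    return math.ceil(n_items / batch_size)


def sharded_num_batches(n_items: int, batch_size: int, num_workers: int) -> int:
    """Batch count when each worker reads a disjoint round-robin shard.

    Assumption for illustration: worker w reads items w, w + k, w + 2k, ...
    for k workers. Each worker batches its own shard, so every worker may
    emit one partial batch at the end.
    """
    workers = max(num_workers, 1)  # num_workers=0 means loading in the main process
    total = 0
    for worker_id in range(workers):
        n_worker = len(range(worker_id, n_items, workers))
        total += math.ceil(n_worker / batch_size)
    return total


def unsharded_num_batches(n_items: int, batch_size: int, num_workers: int) -> int:
    """Batch count when workers are not configured independently.

    Every worker then iterates over the full dataset, so the data (and the
    number of batches) is duplicated once per worker.
    """
    return max(num_workers, 1) * naive_num_batches(n_items, batch_size)


if __name__ == "__main__":
    n_items, batch_size, num_workers = 10, 4, 2
    print("from __len__:  ", naive_num_batches(n_items, batch_size))
    print("sharded:       ", sharded_num_batches(n_items, batch_size, num_workers))
    print("not sharded:   ", unsharded_num_batches(n_items, batch_size, num_workers))
```

With 10 items, a batch size of 4, and 2 workers, the length-based estimate is 3 batches, while sharded workers produce 4 and unsharded workers produce 6. A progress bar that trusts the length-based estimate would therefore be off in both multi-worker cases.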

bittremieux · 2024-07-24T08:37:29Z

Addition: because SpectrumDataset is an IterableDataset, it also doesn't support shuffling, which we might want during training.
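One common workaround for streamed data is an approximate shuffle through a fixed-size buffer. The sketch below illustrates that general technique in plain Python. It is not part of SpectrumDataset, and the `buffered_shuffle` name and its parameters are hypothetical. Randomness is only local: an item can move at most as far as the buffer allows, so a small buffer gives a weak shuffle.

```python
import random
from typing import Iterable, Iterator, Optional, TypeVar

T = TypeVar("T")


def buffered_shuffle(
    stream: Iterable[T], buffer_size: int, seed: Optional[int] = None
) -> Iterator[T]:
    """Yield the items of `stream` in an approximately shuffled order.

    The buffer is filled first. After that, each incoming item replaces a
    randomly chosen buffered item, which is yielded. When the stream ends,
    the remaining buffered items are shuffled and yielded.
    """
    if buffer_size < 1:
        raise ValueError("buffer_size must be at least 1")
    rng = random.Random(seed)
    buffer = []
    for item in stream:
        if len(buffer) < buffer_size:
            buffer.append(item)
            continue
        idx = rng.randrange(buffer_size)
        yield buffer[idx]
        buffer[idx] = item
    rng.shuffle(buffer)
    yield from buffer


if __name__ == "__main__":
    # Every item appears exactly once; only the order changes.
    print(list(buffered_shuffle(range(10), buffer_size=4, seed=0)))
```

The trade-off is memory against shuffle quality: a buffer as large as the dataset gives a full shuffle, while a buffer of size 1 leaves the order unchanged.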

Report number of batches in the dataset

e188d80

Useful for timing estimates in the progress bar.

bittremieux requested a review from wfondrie July 24, 2024 07:39

bittremieux mentioned this pull request Jul 24, 2024

Best practices for spectrum data loading #62

Open
