-
Notifications
You must be signed in to change notification settings - Fork 286
feat: add audio dithering support #794
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR introduces audio dithering support to Rodio with four dithering algorithms, while also implementing significant infrastructure improvements across decoders and sources.
Key Changes:
- Audio dithering feature: Adds TPDF, RPDF, GPDF, and HighPass dithering algorithms for quantization noise management
- BitDepth type addition: New type for tracking audio bit depth throughout the audio pipeline
- Decoder improvements: Enhanced seeking, error handling, and configuration for WAV and Vorbis decoders
- Source trait expansion: Added
bits_per_sample()method and improved seeking error handling
Reviewed Changes
Copilot reviewed 89 out of 91 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| src/source/dither.rs | New dithering implementation with four algorithms (TPDF, RPDF, GPDF, HighPass) |
| src/common.rs | Adds BitDepth type and updates ChannelCount/SampleRate handling |
| src/decoder/wav.rs | Major WAV decoder improvements with better seeking and configuration |
| src/decoder/vorbis.rs | Enhanced Vorbis decoder with granule-based seeking and duration scanning |
| src/source/mod.rs | Expanded Source trait with bits_per_sample() method and improved SeekError variants |
| tests/ | Updated test files to use Path-based decoder APIs instead of File objects |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
src/decoder/vorbis.rs
Outdated
| // Try to find a granule from this position (limited packet scan during binary search) | ||
| match find_granule_from_position(data, mid, Some(50)) { |
Copilot
AI
Sep 12, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] The magic number 50 for packet limit should be defined as a named constant to improve code readability and maintainability.
src/decoder/wav.rs
Outdated
| }; | ||
| // len is number of samples, not bytes, so use samples_to_duration | ||
| // Note: hound's len() returns total samples across all channels | ||
| let samples_per_channel = len / (channels as u64); |
Copilot
AI
Sep 12, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Division by channels as u64 could cause integer truncation issues. The conversion from u16 to u64 for channels should be explicit and the division should handle potential remainders properly.
| let samples_per_channel = len / (channels as u64); | |
| let channels_u64 = channels as u64; | |
| if len % channels_u64 != 0 { | |
| eprintln!( | |
| "Warning: total number of samples ({}) is not evenly divisible by number of channels ({}). Truncating to {} samples per channel.", | |
| len, channels, len / channels_u64 | |
| ); | |
| } | |
| let samples_per_channel = len / channels_u64; |
😅 😅 For a minute or two I thought you introduced another ~6k PR. Btw lets add a new rule/convention to keep PR's under 1500 lines. I'll hold myself to that for rewriting the entire audio pipeline. |
cd1b454 to
01714ce
Compare
8268179 to
14e63ca
Compare
|
Rebased on master. |
|
|
||
| /// Dither algorithm selection - chooses the probability density function (PDF). | ||
| #[derive(Debug, Default, Clone, Copy, PartialEq, Eq)] | ||
| pub enum Algorithm { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've spend some time trying to learn about these. I find it quite hard to find info on these. Would love to know (out of intrest, this is not blocking the PR in any way) how you got to these docs. Is it something you've learned over the years?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you will love reading Wannamaker's PhD seminal dissertation on this. I believe this is a shorter IEEE extract of it, but it's been years since I read all these. Yes, I actually dove into it like four years ago reading these, as well as source code of GStreamer, PortAudio and others.
Beyond high-passed dither there are more elaborate noise-shaping techniques that I may propose in a later PR. Emphasis on may.
src/source/dither.rs
Outdated
| // LSB amplitude for signed audio: 1.0 / (2^(bits-1)) | ||
| // This represents the amplitude of one quantization level | ||
| let lsb_amplitude = if target_bits.get() >= Sample::MANTISSA_DIGITS { | ||
| // For bit depths at or beyond the floating point precision limit, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the idea to not do the LSB amplitude calc. as a performance improvement? Or do we need to skip it because we get artifacting otherwise?
My gut tells me that the extra branch from the if will be more expensive then the calculation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This branch is in the constructor, not the hot path.
The point is that some sources may report a bit depth of 32 that users would want to dither to for requantization (remember, this would be great to use with the Source::bits_per_sample() that #786 introduced). Because our audio pipeline in the end will be constrained to 24 bits, we then still want to select the LSB and dither before truncating.
src/source/dither.rs
Outdated
|
|
||
| /// Dithering interface delegating to the supported dithering algorithms. | ||
| #[derive(Clone)] | ||
| pub enum Dither<I, R = SmallRng> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about we merge this and the DitherImpl struct? That would get rid of the match in channels, sample_rate etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How would that look like while still supporting the various PDFs and so, different noise generators? I want to prevent branching in the iterator. This delegation is faster, particularly when the algorithm is selected at compile time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This delegation is faster, particularly when the algorithm is selected at compile time.
Thats surprising since there is a match in the iterator. That will branch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This delegation is faster
Do we have a benchmark to back this up?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This delegation is faster, particularly when the algorithm is selected at compile time.
Thats surprising since there is a match in the iterator. That will branch.
match creates jump tables which are O(1). if branches are O(n).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
match creates jump tables which are O(1). if branches are O(n).
On a modern optimizing compiler this is no longer true. But its a moot point as the number of comparisons is not at all what can kill performance with branches.
Branches prevent in-lining and a ton of other essential compiler optimizations. Further more if the CPU's branch predictor makes an incorrect prediction the entire pipeline needs to be flushed.
For a for loop (basically a loop { if i > num break else do things} }this is not an issue since the branch predictor can figure out that if statement will usually be true. That means only once we go into the else we pay the price.
But as always compilers and CPU's have become far to complex for anyone to reason about performance in detail. So optimize for readability and benchmark after.
yara-blue
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can make source/dither smaller by merging the Dither and DitherImpl structs. Let me know what you think.
No comments on the rest, looks perfect.
Thanks for the review - please check out my question about that. I'm not sure. |
I'll give a try and then we can compare 👍 |
|
I pushed a messy idea/prototype (we got a builder again). This gets rid of the branching and should end up a lot shorter. It does make it impossible to change dithering method on the fly but I dont think you would want to do that right? Feel free to revert if we decide not to go this route. |
|
I do want the ergonomics of I don't immediately have a strong feeling either way. What do you say looking at that source-and-effect chain? |
New idea I'll have to work it out. I'll send another very ugly commit your way. |
|
Curious what you think, I quite like it. No branching on algorithm shorter code and no builder 🎉 |
|
Your proposal is clever, however cannot select a PDF at runtime. There sure would be reasons why someone would want that, like in a dialog or on the command line to configure an application’s behavior. I benchmarked the branching and monomorphization approaches with |
Also replace f32 with Sample where appropriate for consistency
…elvet density to NonZero
- Replace trait-based DitherAlgorithm with runtime enum Algorithm - Implement Dither as a single struct supporting all algorithms - Add NoiseGenerator enum for dynamic noise generation - Expose new dither() function using Algorithm and BitDepth - Update docs and tests for new API
e157735 to
8293813
Compare
The modern optimizing compiler, a true miracle.
👍 |
src/source/mod.rs
Outdated
| /// and when converting between bit depths. Apply at the target output bit depth. | ||
| #[cfg(feature = "dither")] | ||
| #[cfg_attr(docsrs, doc(cfg(feature = "dither")))] | ||
| pub fn dither<I>(input: I, target_bits: BitDepth, algorithm: DitherAlgorithm) -> Dither<I> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
replace this with a default implementation on source? So you can do sine().dither().amplify().pausable() etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
otherwise good to merge btw
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Huh that must have fallen off somewhere! Good catch, will re-add it and merge when done. Thanks.
I forgot to say "0.01ns" per sample, but yes. Besides the compiler there's speculative branch prediction in CPUs as well of course. You're right it's nigh impossible reason about it, and then it differs per architecture too. Though lately I'm making more effort to try and feed it code that at least should be easy to digest in the hot path. |
Builds on our noise generators to add four dithering algorithms:
Gated by new
ditherfeature flag, that's enabled by default.API
Parent branch
This is branched off of #786 which introduces
Source::bits_per_sample(), because you'll typically want to dither down to:I'll mark this PR for review when that PR is merged.