
Conversation

@roderickvd
Member

@roderickvd roderickvd commented Sep 12, 2025

Builds on our noise generators to add four dithering algorithms:

  • TPDF (Triangular PDF) - Optimal decorrelation (default)
  • RPDF (Rectangular PDF) - Lower noise floor, worse decorrelation
  • GPDF (Gaussian PDF) - Higher noise floor, analog characteristics
  • HighPass - Blue noise to reduce low-frequency artifacts

Gated by a new dither feature flag, which is enabled by default.

API

use rodio::source::{dither, DitherAlgorithm, SineWave};
use rodio::BitDepth;

let source = SineWave::new(440.0);
let dithered = dither(source, BitDepth::new(16).unwrap(), DitherAlgorithm::TPDF);

Parent branch

This is branched off of #786, which introduces Source::bits_per_sample(), because you'll typically want to dither down to:

let output_bits = config.bits_per_sample();  // in cpal master: https://github.com/RustAudio/cpal/pull/1008
let target_bits = source_bits.map_or(output_bits, |src_bits| src_bits.min(output_bits));
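
That target then feeds straight into the new source, e.g. (a sketch, assuming both values end up as BitDepths):

let dithered = dither(source, target_bits, DitherAlgorithm::TPDF);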

I'll mark this PR for review when that PR is merged.

@roderickvd roderickvd changed the base branch from feat/comprehensive-seeking-and-bit-depth to master September 12, 2025 21:06
Contributor

Copilot AI left a comment

Pull Request Overview

This PR introduces audio dithering support to Rodio with four dithering algorithms, while also implementing significant infrastructure improvements across decoders and sources.

Key Changes:

  • Audio dithering feature: Adds TPDF, RPDF, GPDF, and HighPass dithering algorithms for quantization noise management
  • BitDepth type addition: New type for tracking audio bit depth throughout the audio pipeline
  • Decoder improvements: Enhanced seeking, error handling, and configuration for WAV and Vorbis decoders
  • Source trait expansion: Added bits_per_sample() method and improved seeking error handling

Reviewed Changes

Copilot reviewed 89 out of 91 changed files in this pull request and generated 4 comments.

Summary per file:

  • src/source/dither.rs: New dithering implementation with four algorithms (TPDF, RPDF, GPDF, HighPass)
  • src/common.rs: Adds BitDepth type and updates ChannelCount/SampleRate handling
  • src/decoder/wav.rs: Major WAV decoder improvements with better seeking and configuration
  • src/decoder/vorbis.rs: Enhanced Vorbis decoder with granule-based seeking and duration scanning
  • src/source/mod.rs: Expanded Source trait with bits_per_sample() method and improved SeekError variants
  • tests/: Updated test files to use Path-based decoder APIs instead of File objects


Comment on lines 865 to 856
// Try to find a granule from this position (limited packet scan during binary search)
match find_granule_from_position(data, mid, Some(50)) {

Copilot AI Sep 12, 2025

[nitpick] The magic number 50 for packet limit should be defined as a named constant to improve code readability and maintainability.
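
For example (illustrative; the constant's name, type and doc comment are assumptions):

/// Maximum number of packets to scan per probe during the binary search.
const PACKET_SCAN_LIMIT: usize = 50;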

};
// len is number of samples, not bytes, so use samples_to_duration
// Note: hound's len() returns total samples across all channels
let samples_per_channel = len / (channels as u64);

Copilot AI Sep 12, 2025

Division by channels as u64 could cause integer truncation issues. The conversion from u16 to u64 for channels should be explicit and the division should handle potential remainders properly.

Suggested change (replacing the quoted line):

let channels_u64 = channels as u64;
if len % channels_u64 != 0 {
    eprintln!(
        "Warning: total number of samples ({}) is not evenly divisible by number of channels ({}). Truncating to {} samples per channel.",
        len, channels, len / channels_u64
    );
}
let samples_per_channel = len / channels_u64;

@yara-blue
Member

Parent branch

This is branched off of #786, which introduces Source::bits_per_sample(), because you'll typically want to dither down to:

😅 😅 For a minute or two I thought you introduced another ~6k PR.

Btw, let's add a new rule/convention to keep PRs under 1500 lines. I'll hold myself to that for rewriting the entire audio pipeline.

@roderickvd roderickvd force-pushed the feat/dither branch 2 times, most recently from cd1b454 to 01714ce Compare September 17, 2025 20:55
@roderickvd roderickvd marked this pull request as ready for review September 17, 2025 21:50
@yara-blue yara-blue assigned yara-blue and unassigned yara-blue Sep 20, 2025
@roderickvd roderickvd force-pushed the feat/dither branch 2 times, most recently from 8268179 to 14e63ca Compare September 26, 2025 21:03
@roderickvd
Member Author

Rebased on master.


/// Dither algorithm selection - chooses the probability density function (PDF).
#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)]
pub enum Algorithm {
Member

I've spent some time trying to learn about these. I find it quite hard to find info on them. Would love to know (out of interest, this is not blocking the PR in any way) how you arrived at these docs. Is it something you've learned over the years?

Member Author

I think you will love reading Wannamaker's seminal PhD dissertation on this. I believe this is a shorter IEEE extract of it, but it's been years since I read all of these. Yes, I actually dove into it like four years ago, reading these as well as the source code of GStreamer, PortAudio and others.

Beyond high-passed dither there are more elaborate noise-shaping techniques that I may propose in a later PR. Emphasis on may.

// LSB amplitude for signed audio: 1.0 / (2^(bits-1))
// This represents the amplitude of one quantization level
let lsb_amplitude = if target_bits.get() >= Sample::MANTISSA_DIGITS {
// For bit depths at or beyond the floating point precision limit,
Member

Is the idea to not do the LSB amplitude calc. as a performance improvement? Or do we need to skip it because we get artifacting otherwise?

My gut tells me that the extra branch from the if will be more expensive than the calculation.

Member Author

This branch is in the constructor, not the hot path.

The point is that some sources may report a bit depth of 32 that users would want to dither to for requantization (remember, this would be great to use with the Source::bits_per_sample() that #786 introduced). Because our audio pipeline will in the end be constrained to 24 bits, we still want to select the LSB and dither before truncating.
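
The clamping idea in isolation looks like this (an illustrative sketch assuming f32 samples, not the exact PR code):

// One quantization step (LSB) for signed audio at `bits` bits.
// Past the f32 mantissa width (24 bits) there is no smaller representable
// step, so clamp before computing it.
fn lsb_amplitude(bits: u32) -> f32 {
    let effective_bits = bits.min(f32::MANTISSA_DIGITS);
    1.0 / (1u64 << (effective_bits - 1)) as f32
}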


/// Dithering interface delegating to the supported dithering algorithms.
#[derive(Clone)]
pub enum Dither<I, R = SmallRng>
Member

How about we merge this and the DitherImpl struct? That would get rid of the match in channels, sample_rate etc.

Member Author

What would that look like while still supporting the various PDFs and thus different noise generators? I want to prevent branching in the iterator. This delegation is faster, particularly when the algorithm is selected at compile time.

Member

This delegation is faster, particularly when the algorithm is selected at compile time.

That's surprising, since there is a match in the iterator. That will branch.

Member

This delegation is faster

Do we have a benchmark to back this up?

Member Author

This delegation is faster, particularly when the algorithm is selected at compile time.

That's surprising, since there is a match in the iterator. That will branch.

match creates jump tables which are O(1). if branches are O(n).

Member

match creates jump tables which are O(1). if branches are O(n).

On a modern optimizing compiler this is no longer true. But it's a moot point, as the number of comparisons is not at all what can kill performance with branches.

Branches prevent inlining and a ton of other essential compiler optimizations. Furthermore, if the CPU's branch predictor makes an incorrect prediction, the entire pipeline needs to be flushed.

For a for loop (basically loop { if i > num { break } else { /* do things */ } }) this is not an issue, since the branch predictor can figure out that the if statement will usually be true. That means we only pay the price once we go into the else.

But as always, compilers and CPUs have become far too complex for anyone to reason about performance in detail. So optimize for readability and benchmark after.

Member

@yara-blue yara-blue left a comment

I think we can make source/dither smaller by merging the Dither and DitherImpl structs. Let me know what you think.

No comments on the rest, looks perfect.

@roderickvd
Member Author

I think we can make source/dither smaller by merging the Dither and DitherImpl structs. Let me know what you think.

Thanks for the review - please check out my question about that. I'm not sure.

@yara-blue
Member

Thanks for the review - please check out my question about that. I'm not sure.

I'll give it a try and then we can compare 👍

@yara-blue
Member

yara-blue commented Sep 27, 2025

I pushed a messy idea/prototype (we got a builder again). This gets rid of the branching and should end up a lot shorter. It does make it impossible to change the dithering method on the fly, but I don't think you would want to do that, right? Feel free to revert if we decide not to go this route.

@roderickvd
Member Author

I do want the ergonomics of sound.dither().amplify(). Sure, we could have Source::dither() select DitherBuilder::tpdf() by default. But I think builders make most sense when they are "point of origin" like with the decoder or microphone. I do agree that this proposal shaves off a lot of code and should be even faster.

I don't immediately have a strong feeling either way. What do you say looking at that source-and-effect chain?

@yara-blue
Member

I don't immediately have a strong feeling either way. What do you say looking at that source-and-effect chain?

New idea, I'll have to work it out. I'll send another very ugly commit your way.

@yara-blue
Member

Curious what you think, I quite like it. No branching on algorithm, shorter code, and no builder 🎉

@roderickvd
Member Author

roderickvd commented Sep 27, 2025

Your proposal is clever; however, it cannot select a PDF at runtime. There surely would be reasons why someone would want that, like in a dialog or on the command line to configure an application's behavior.

I benchmarked the branching and monomorphization approaches with divan and the performance differences are negligible: 0.01ns delta on my Apple M3. I’m gonna propose a middle ground: remove the DitherInner enum and replace it with a NoiseGenerator enum that dispatches. It does save a lot of code.
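
Roughly this shape (an illustrative sketch with a stand-in PRNG and only two of the PDFs, not the final code):

#[derive(Clone)]
enum NoiseGenerator {
    Rpdf { state: u32 },
    Tpdf { state: u32 },
}

impl NoiseGenerator {
    /// Tiny xorshift stand-in for SmallRng, mapped to [0.0, 1.0).
    fn uniform(state: &mut u32) -> f32 {
        *state ^= *state << 13;
        *state ^= *state >> 17;
        *state ^= *state << 5;
        (*state >> 8) as f32 / (1u32 << 24) as f32
    }

    /// One noise sample in LSB units; a single match per sample is the only
    /// dispatch the iterator pays for.
    fn next_noise(&mut self) -> f32 {
        match self {
            // Rectangular PDF: one uniform sample in [-0.5, 0.5).
            NoiseGenerator::Rpdf { state } => Self::uniform(state) - 0.5,
            // Triangular PDF: sum of two uniforms, spanning [-1.0, 1.0).
            NoiseGenerator::Tpdf { state } => {
                (Self::uniform(state) - 0.5) + (Self::uniform(state) - 0.5)
            }
        }
    }
}

The single Dither struct then scales the noise by the LSB amplitude and adds it to each sample, whichever variant is active.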

dvdsk and others added 4 commits September 27, 2025 22:08
- Replace trait-based DitherAlgorithm with runtime enum Algorithm
- Implement Dither as a single struct supporting all algorithms
- Add NoiseGenerator enum for dynamic noise generation
- Expose new dither() function using Algorithm and BitDepth
- Update docs and tests for new API
@yara-blue
Member

I benchmarked the branching and monomorphization approaches with divan and the performance differences are negligible: 0.01ns delta on my Apple M3.

The modern optimizing compiler, a true miracle.

Your proposal is clever; however, it cannot select a PDF at runtime. There surely would be reasons why someone would want that, like in a dialog or on the command line to configure an application's behavior.

👍

/// and when converting between bit depths. Apply at the target output bit depth.
#[cfg(feature = "dither")]
#[cfg_attr(docsrs, doc(cfg(feature = "dither")))]
pub fn dither<I>(input: I, target_bits: BitDepth, algorithm: DitherAlgorithm) -> Dither<I>
Member

Replace this with a default implementation on Source? So you can do sine().dither().amplify().pausable() etc.
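
Something along these lines, perhaps (a rough, untested sketch; the cfg gating mirrors the free function and the explicit path is an assumption):

#[cfg(feature = "dither")]
#[cfg_attr(docsrs, doc(cfg(feature = "dither")))]
fn dither(self, target_bits: BitDepth, algorithm: DitherAlgorithm) -> Dither<Self>
where
    Self: Sized,
{
    // Delegate to the free function; the explicit path avoids any ambiguity
    // with this method's name.
    crate::source::dither(self, target_bits, algorithm)
}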

Member

otherwise good to merge btw

Member Author

Huh that must have fallen off somewhere! Good catch, will re-add it and merge when done. Thanks.

@roderickvd roderickvd merged commit d89514d into master Sep 27, 2025
9 checks passed
@roderickvd roderickvd deleted the feat/dither branch September 27, 2025 22:16
@roderickvd
Member Author

I benchmarked the branching and monomorphization approaches with divan and the performance differences are negligible: 0.01ns delta on my Apple M3.

The modern optimizing compiler, a true miracle.

I forgot to say "0.01ns" per sample, but yes.

Besides the compiler there's speculative branch prediction in CPUs as well, of course. You're right that it's nigh impossible to reason about, and then it differs per architecture too. Though lately I'm making more of an effort to try and feed it code that at least should be easy to digest in the hot path.

