Module audio

Audio processing and analysis.

The MilkDrop preset surface (per-frame and per-pixel equations, beat-detect thresholds, custom waves’ volume modulation) reads three split-band volumes: bass, mid, and treb. These must reflect frequency content, not just overall loudness: kick drums should push bass and cymbals should push treb. The original placeholder in this file computed RMS over three time-domain chunks of the input window, so all three bands tracked overall loudness and presets couldn’t distinguish low- from high-frequency energy. The analyzer now runs an FFT and integrates the magnitude spectrum over the MD2-standard frequency ranges (FFTAnalyzer::get_bass / _mid / _treble).
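The difference between a time-domain and a frequency-domain band split can be sketched as follows. This is a minimal, dependency-free illustration and not the crate's FFTAnalyzer: it uses a naive DFT instead of an FFT, the helper names (dft_magnitude, band_average, bin_sine) are hypothetical, and the 4000-16000 Hz treble range is an assumed placeholder, since the MD2-standard band edges are not listed on this page.

```rust
use std::f64::consts::PI;

/// Magnitude of DFT bin `k` for a real-valued window, divided by the window length.
fn dft_magnitude(samples: &[f64], k: usize) -> f64 {
    let n = samples.len();
    let (mut re, mut im) = (0.0_f64, 0.0_f64);
    for (i, &s) in samples.iter().enumerate() {
        // Reduce k*i modulo n so the phase stays small and accurate.
        let phase = 2.0 * PI * ((k * i) % n) as f64 / n as f64;
        re += s * phase.cos();
        im -= s * phase.sin();
    }
    (re * re + im * im).sqrt() / n as f64
}

/// Mean magnitude over the bins whose center frequency lies in [lo_hz, hi_hz].
fn band_average(samples: &[f64], sample_rate: f64, lo_hz: f64, hi_hz: f64) -> f64 {
    let n = samples.len();
    if n == 0 {
        return 0.0;
    }
    let bin_width = sample_rate / n as f64;
    let lo_bin = (lo_hz / bin_width).ceil() as usize;
    // Never look past the Nyquist bin.
    let hi_bin = ((hi_hz / bin_width).floor() as usize).min(n / 2);
    if hi_bin < lo_bin {
        return 0.0;
    }
    let sum: f64 = (lo_bin..=hi_bin).map(|k| dft_magnitude(samples, k)).sum();
    sum / (hi_bin - lo_bin + 1) as f64
}

/// Unit-amplitude sine centered exactly on DFT bin `bin` (so there is no leakage).
fn bin_sine(bin: usize, n: usize) -> Vec<f64> {
    (0..n)
        .map(|i| (2.0 * PI * ((bin * i) % n) as f64 / n as f64).sin())
        .collect()
}
```

With a 1024-sample window at 44.1 kHz, `band_average(&bin_sine(3, 1024), 44_100.0, 20.0, 250.0)` is about 0.1 while the same signal's assumed treble band is essentially zero; a sine on bin 200 gives the opposite picture. Both sines have a time-domain RMS of about 0.707, which is why an RMS-per-chunk split cannot tell them apart.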

Structs§

AudioAnalyzer
Audio analyzer for extracting per-band volumes.

Constants§

AGC_AVG_ALPHA 🔒
Smoothing factor for the AGC’s per-band running average. Closer to 1.0 = longer history. An effective history of ~50-100 frames at 60 FPS feels right: slow enough that a single kick doesn’t dump the gain, fast enough that a track change resettles within a couple of seconds.
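To make the frames-versus-alpha trade-off concrete, here is a small sketch assuming the running average is a standard exponential moving average, whose effective history is roughly 1 / (1 - alpha) frames. The function names are hypothetical and the actual value of AGC_AVG_ALPHA is not shown on this page.

```rust
/// Smoothing factor whose effective history is roughly `frames` frames,
/// using the usual approximation history = 1 / (1 - alpha).
fn alpha_for_frames(frames: f64) -> f64 {
    1.0 - 1.0 / frames
}

/// One exponential-moving-average update.
fn ema_step(avg: f64, sample: f64, alpha: f64) -> f64 {
    alpha * avg + (1.0 - alpha) * sample
}

/// Running average after `frames` updates with a constant `level`, starting from `start`.
fn settle(start: f64, level: f64, alpha: f64, frames: usize) -> f64 {
    (0..frames).fold(start, |avg, _| ema_step(avg, level, alpha))
}
```

Under that assumption, 50-100 frames corresponds to an alpha of 0.98-0.99. With alpha = 0.98, a single loud frame moves the average by only 2% of the jump, while 120 frames (two seconds at 60 FPS) cover more than 90% of a step change in level.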
AGC_GAIN_MAX 🔒
AGC_GAIN_MIN 🔒
Clamp the AGC gain to a sane range. Both bounds matter: a track with absolute silence on a band shouldn’t let the gain run to infinity (the floor at 0.5 caps the boost on near-silent bands), and a band that’s pure-tone-driven shouldn’t crush below 1/10 of the raw signal.
AGC_TARGET_LEVEL 🔒
Target band level the AGC steers each band toward. MD2 presets are authored against the convention that bass/mid/treb average ~1.0 on typical content; bass > 1.5 then unambiguously means “louder than usual” (kick / snare / hat) rather than just “loud track”. With AGC off, quiet tracks float around 0.2-0.4 and never trigger beat-reactive presets; loud tracks pin at 8.0 (the clamp) and beat detection saturates.
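One way the AGC constants could fit together is sketched below. This is an illustration under stated assumptions, not the crate's implementation: the BandAgc type and its methods are hypothetical, the target of 1.0 and the lower gain bound of 0.1 are read off the descriptions above, and the alpha of 0.98 and upper gain bound of 10.0 are placeholder values.

```rust
const TARGET_LEVEL: f32 = 1.0; // from the "average ~1.0" convention
const AVG_ALPHA: f32 = 0.98; // assumed placeholder
const GAIN_MIN: f32 = 0.1; // from "1/10 of the raw signal"
const GAIN_MAX: f32 = 10.0; // assumed placeholder

/// Per-band automatic gain control driven by a running average of the raw level.
struct BandAgc {
    avg: f32,
}

impl BandAgc {
    fn new() -> Self {
        // Start at the target so the initial gain is 1.0.
        BandAgc { avg: TARGET_LEVEL }
    }

    /// Update the running average with `raw` and return the gain-corrected level.
    fn process(&mut self, raw: f32) -> f32 {
        self.avg = AVG_ALPHA * self.avg + (1.0 - AVG_ALPHA) * raw;
        // Guard the division, then keep the gain inside its bounds.
        let gain = (TARGET_LEVEL / self.avg.max(f32::EPSILON)).clamp(GAIN_MIN, GAIN_MAX);
        raw * gain
    }
}

/// Output after feeding `frames` frames of a constant raw `level`.
fn run_constant(level: f32, frames: usize) -> f32 {
    let mut agc = BandAgc::new();
    let mut out = 0.0;
    for _ in 0..frames {
        out = agc.process(level);
    }
    out
}

/// Output of a single `kick` frame after the AGC has settled on `level`.
fn settle_then_kick(level: f32, kick: f32) -> f32 {
    let mut agc = BandAgc::new();
    for _ in 0..1000 {
        agc.process(level);
    }
    agc.process(kick)
}
```

In this sketch a quiet track sitting at a raw 0.3 settles to an output of about 1.0, and a kick at twice that level then reads about 1.96, comfortably above the 1.5 threshold. A band held at a raw 20.0 hits the lower gain bound and comes out at 2.0 instead of being crushed further.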
FFT_BAND_GAIN 🔒
Maps the FFT-band averages to roughly the [0, 2-ish] range MD2 presets expect (bass > 1.5 is the conventional “loud kick” threshold for BeatDetectionMode::HardCut1). A unit-amplitude sine in the bass range produces an averaged magnitude of ~0.5 / num_bass_bins (the sine’s single-bin peak of 0.5 is diluted across the bins being averaged); with the default 1024-point FFT @ 44.1 kHz that’s ~0.08, so a gain of 12.5 maps a pure bass sine to ~1.0. Real-world music with a bass kick lands in the ~0.5-2.0 range with a gain of 50.0, which lines up reasonably with MD2’s expected dynamics. Tunable per-user later via the GUI if needed.
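The gain arithmetic above can be checked with a few lines. The helper names are hypothetical, and the bin count of six is an assumption chosen because it roughly reproduces the quoted ~0.08; the exact number depends on how the band edges map onto bins, which this page does not spell out.

```rust
/// Averaged magnitude of a unit-amplitude, bin-centered sine over `num_bins` bins:
/// one bin holds the 0.5 peak and the others hold roughly zero.
fn sine_band_average(num_bins: usize) -> f64 {
    0.5 / num_bins as f64
}

/// Gain needed to map a raw band average onto `target`.
fn gain_to_reach(target: f64, band_average: f64) -> f64 {
    target / band_average
}
```

`sine_band_average(6)` is about 0.083 and `gain_to_reach(1.0, 0.08)` is 12.5. Reading the 50.0 in the text as a gain value, the ~0.5-2.0 output range for real music corresponds to raw band averages of about 0.01-0.04, well below the pure-sine figure.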
FFT_SIZE 🔒
FFT window length, a power of 2. 1024 gives ~43 Hz bin width @ 44.1 kHz, which is enough granularity for the bass band (smaller windows under-resolve 20-250 Hz; larger windows raise latency without buying much for the 3-band split MD2 uses).
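The resolution-versus-latency trade-off follows directly from the window length. The helpers below are a hypothetical sketch for comparing candidate sizes; `bins_up_to` counts non-DC bins by a simple floor rule, which may differ from how the analyzer maps band edges to bins.

```rust
/// Frequency spacing between adjacent FFT bins.
fn bin_width_hz(sample_rate: f64, fft_size: usize) -> f64 {
    sample_rate / fft_size as f64
}

/// Duration of audio covered by one FFT window, in milliseconds.
fn window_ms(sample_rate: f64, fft_size: usize) -> f64 {
    1000.0 * fft_size as f64 / sample_rate
}

/// Number of non-DC bins whose center frequency is at or below `freq_hz`.
fn bins_up_to(freq_hz: f64, sample_rate: f64, fft_size: usize) -> usize {
    (freq_hz / bin_width_hz(sample_rate, fft_size)).floor() as usize
}
```

At 44.1 kHz, a 256-point window spans about 5.8 ms but leaves a single ~172 Hz bin below 250 Hz; 1024 points span about 23.2 ms with five ~43 Hz bins; 4096 points give 23 bins of ~10.8 Hz but need about 92.9 ms of audio per window.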