The rule for deciding how much file space to spend on each part of the sound so the limited space goes where ears notice most.
Figure: bits flow from the fixed budget into the audible bands above the red masking curve; masked bands get starved.
What it is
The rule a codec uses to hand more data bits to the sound parts your ears notice and starve the parts they don't.
Key facts
Bit = 1 binary digit; 8 bits = 1 byte; bitrate measured in kbps; 1 kbps = 1000 bits/sec
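PCM bitrate (bits/sec) = sample rate (Hz) x bit depth x channels; CD = 44,100 x 16 x 2 = 1,411,200 bits/sec = approx 1,411 kbps
MP3 is near-transparent for most listeners at approx 256-320 kbps (320 kbps = highest standard MP3 bitrate); AAC typically reaches similar quality at a lower bitrate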
Every +1 bit of depth adds 6.02 dB dynamic range: 16-bit = approx 96 dB, 24-bit = approx 144 dB
Quantisation SNR (dB) = 6.02 x N + 1.76 for a full-scale sine wave, where N = bits per sample; more bits = lower noise
Masking threshold decides allocation: bits go where signal sits ABOVE the curve (audible), not below
Bit reservoir: codec banks spare bits from easy frames to fund hard frames (cymbals, transients)
MP3 works in frames of 1,152 samples; bits are allocated inside each frame's fixed budget
CBR (constant bitrate) = fixed budget per frame; VBR (variable bitrate) = budget shifts to busy passages; ABR (average bitrate) = budget varies around an average target
Speed of sound = 343 m/s at 20 C; +6 dB = double pressure, +3 dB = double power, +10 dB = approx twice as loud
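The arithmetic in the key facts above can be checked with a few lines of Python. This is a minimal sketch: the function names are made up for illustration, and the SNR formula assumes an ideal quantiser fed a full-scale sine wave.

```python
import math

def pcm_bitrate_kbps(sample_rate_hz, bit_depth, channels):
    """Uncompressed PCM bitrate in kbps (1 kbps = 1000 bits/sec)."""
    return sample_rate_hz * bit_depth * channels / 1000

def dynamic_range_db(bits):
    """Rule of thumb: each bit of depth adds about 6.02 dB."""
    return 6.02 * bits

def quantisation_snr_db(bits):
    """Theoretical SNR of an ideal N-bit quantiser (full-scale sine assumed)."""
    return 6.02 * bits + 1.76

def db_from_pressure_ratio(ratio):
    """Pressure (amplitude) ratios use 20 * log10."""
    return 20 * math.log10(ratio)

def db_from_power_ratio(ratio):
    """Power ratios use 10 * log10."""
    return 10 * math.log10(ratio)

print(pcm_bitrate_kbps(44_100, 16, 2))        # 1411.2 (CD audio)
print(round(dynamic_range_db(16), 2))         # 96.32
print(round(quantisation_snr_db(16), 2))      # 98.08
print(round(db_from_pressure_ratio(2), 2))    # 6.02 (double pressure)
print(round(db_from_power_ratio(2), 2))       # 3.01 (double power)
```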
How it works
The codec splits the sound into many frequency bands (sub-bands / MDCT coefficients).
A psychoacoustic model calculates the masking threshold: how loud a sound must be to be heard over its neighbours.
Each band's signal is compared to that masking curve to get its signal-to-mask ratio (SMR).
Bands with high SMR (clearly audible) get more bits; masked bands get few or zero bits.
Bits are doled out iteratively until the frame's budget is spent, fixing the worst-sounding errors first.
Leftover bits from easy frames are stored in the bit reservoir and lent to demanding frames.
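The steps above can be sketched as a toy greedy loop. This is an illustration, not how any particular encoder is implemented: it assumes each extra bit lowers a band's quantisation noise by about 6.02 dB, treats one "bit" as one unit of budget per band, and uses made-up SMR values. The name allocate_bits is hypothetical.

```python
DB_PER_BIT = 6.02  # assumed noise reduction per extra bit

def allocate_bits(smr_db, budget_bits, max_bits_per_band=16):
    """Hand out bits one at a time to the band whose noise is most audible.

    smr_db: signal-to-mask ratio per band in dB (<= 0 means fully masked).
    Returns (bits per band, unspent bits).
    """
    bits = [0] * len(smr_db)
    remaining = budget_bits
    while remaining > 0:
        # Noise-to-mask ratio: how far the noise still pokes above the mask.
        nmr = [smr - DB_PER_BIT * b for smr, b in zip(smr_db, bits)]
        audible = [i for i, n in enumerate(nmr)
                   if n > 0 and bits[i] < max_bits_per_band]
        if not audible:
            break  # all remaining noise is masked; stop spending
        worst = max(audible, key=lambda i: nmr[i])
        bits[worst] += 1
        remaining -= 1
    return bits, remaining

smr = [24.0, 12.0, -5.0, 3.0]   # hypothetical bands; band 2 is masked
print(allocate_bits(smr, 6))    # ([4, 2, 0, 0], 0): budget ran out
print(allocate_bits(smr, 10))   # ([4, 2, 0, 1], 3): 3 bits left over
```

With the tight budget the weakest audible band gets nothing; with the generous one, the masked band still gets zero and the three unspent bits are what a reservoir would bank.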
Real examples
A loud kick drum masks a quiet hi-hat tick right after it, so the codec spends near-zero bits on that buried tick.
A solo flute passage is 'easy': it banks bits in the reservoir, ready for the cymbal crash two frames later.
Same 4-min song: 320 kbps MP3 approx 9.6 MB, 128 kbps MP3 approx 3.8 MB, raw WAV approx 42 MB.
Streaming a podcast: voice lives 100 Hz-8 kHz, so bits skip the empty 12-20 kHz top end entirely.
Harsh 'underwater' swirl on cymbals at 96 kbps = the codec ran out of bits for high-frequency detail.
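The file sizes in the 4-minute song example follow directly from bitrate x duration. A minimal sketch, assuming 1 MB = 1,000,000 bytes and ignoring headers and tags (file_size_mb is an illustrative name):

```python
def file_size_mb(bitrate_kbps, duration_s):
    """Approximate file size in MB: bits/sec x seconds / 8 bits per byte."""
    total_bits = bitrate_kbps * 1000 * duration_s
    return total_bits / 8 / 1_000_000

four_minutes = 4 * 60  # seconds

print(round(file_size_mb(320, four_minutes), 2))     # 9.6   (320 kbps MP3)
print(round(file_size_mb(128, four_minutes), 2))     # 3.84  (128 kbps MP3)
print(round(file_size_mb(1411.2, four_minutes), 2))  # 42.34 (CD-quality WAV)
```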
How it helps in live sound
Record/print stems as 24-bit WAV (approx 144 dB theoretical range) for headroom; only bounce to lossy at the very end.
For walk-in music and playback tracks use 320 kbps MP3 or 256 kbps+ AAC; never run a show off 128 kbps files.
Feed Bluetooth/streaming speakers a wired or high-bitrate source; cheap BT codecs re-compress and smear cymbals.
If a backing track sounds swirly or thin up top, it's low-bitrate artefacts, not your PA. Re-source the file.
Use WAV/FLAC (lossless, no bit-starving) for show-critical tracks; keep MP3 only for casual fill music.
Dense full-band tracks suffer most from low bitrate; sparse spoken-word survives it far better.
Everyday analogy
It's a tight household budget: you spend big on rent (loud, audible frequencies) and almost nothing on snacks (quiet, masked frequencies your ears miss anyway).
Watch out
Myth: 'higher bitrate = louder/clearer always.' Truth: above transparency (approx 256-320 kbps for MP3) extra bits do nothing audible; bits lower quantisation noise until it sits under the masking threshold, and they never change volume.
Fun fact
MP3's secret weapon is the bit reservoir: it effectively time-shifts spare bits forward from quiet moments to fund the next loud transient, so a hard frame can borrow the leftovers of the frames before it (never from frames still to come).
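The reservoir idea can be sketched in a few lines. This is a simplified model with made-up numbers, not the actual MP3 bitstream mechanism: every frame gets the same nominal budget, unused bits are banked up to an assumed cap, and a later frame may spend its own budget plus whatever is banked. The name run_reservoir is hypothetical.

```python
def run_reservoir(demands, frame_budget, reservoir_cap):
    """Simulate frames drawing on a bit reservoir.

    demands: bits each frame would like to spend.
    Returns (bits actually spent per frame, bits left in the reservoir).
    """
    reservoir = 0
    spent = []
    for demand in demands:
        available = frame_budget + reservoir   # own budget + banked bits
        used = min(demand, available)
        spent.append(used)
        # Bank the leftovers, but never more than the cap (excess is lost
        # in this toy model).
        reservoir = min(available - used, reservoir_cap)
    return spent, reservoir

# Two easy frames (flute), one hard frame (cymbal crash), one normal frame.
demands = [600, 700, 1500, 1000]
print(run_reservoir(demands, frame_budget=1000, reservoir_cap=800))
# ([600, 700, 1500, 1000], 200)
```

The third frame spends 1,500 bits against a 1,000-bit budget because the two easy frames before it banked 700. Bits only ever move forward in time here.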
Key takeaways
Bits are a fixed budget; allocation spends them where ears notice most.
The masking threshold decides who gets bits and who gets starved.
More bits per band = smaller quantisation steps = less audible noise.
Above transparency, more bitrate buys you nothing your ears can hear.
Lossless (WAV/FLAC) skips bit-starving entirely; keep it for show-critical audio.