A built-in map of when a louder sound makes a nearby quieter sound impossible to hear, so the coder knows what it can safely delete.
A loud masker lifts the hearing threshold (blue curve); anything beneath it (amber) is inaudible and deleted for free, while audible peaks (green) keep their bits.
What it is
A coder's predictive map of when a loud sound makes a nearby softer sound inaudible, so the hidden sound can be deleted for free.
Key facts
Human hearing spans 20 Hz to 20,000 Hz; sensitivity peaks around 2-5 kHz
Masking THRESHOLD = the raised loudness level a sound must beat to still be heard once a louder masker is present
Frequency (simultaneous) masking: a loud tone hides quieter tones close in pitch, and the effect spreads further toward higher frequencies than toward lower ones
Critical bands = ear's built-in channels; about 24 Bark bands across the audible range, masking strongest within one band
Temporal masking: PRE-masking approx 2-5 ms before the loud sound; POST-masking 50-200 ms after it
A 60 dB masker at 1 kHz can lift a nearby tone's threshold by 20-40 dB, so a neighbouring tone quieter than that raised threshold vanishes
SMR = Signal-to-Mask Ratio: how far a signal sits ABOVE its masking threshold; bits are spent only on positive SMR
CD PCM = 1411 kbps (44.1 kHz, 16-bit, 2 ch); MP3 at 128 kbps is approx 11:1 smaller
MP3, AAC, Ogg Vorbis, Opus, ATRAC (MiniDisc) and Dolby AC-3 all use psychoacoustic masking
Quantisation noise is hidden UNDER the masking curve, so at an adequate bitrate you do not hear the lost bits
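A few of the figures above can be checked numerically. The sketch below is illustrative only: the function names are made up for this card, the Bark conversion uses the Zwicker and Terhardt approximation (one of several published formulas), and the 30 dB / 40 dB levels in the SMR line are assumed example values.

```python
import math


def hz_to_bark(frequency_hz):
    """Approximate Bark value of a frequency (Zwicker and Terhardt formula)."""
    return (13.0 * math.atan(0.00076 * frequency_hz)
            + 3.5 * math.atan((frequency_hz / 7500.0) ** 2))


def pcm_bitrate_kbps(sample_rate_hz, bit_depth, channels):
    """Uncompressed PCM bitrate in kilobits per second."""
    return sample_rate_hz * bit_depth * channels / 1000.0


def smr_db(signal_level_db, masking_threshold_db):
    """Signal-to-Mask Ratio: positive means audible, negative means masked."""
    return signal_level_db - masking_threshold_db


# CD audio: 44.1 kHz, 16-bit, 2 channels, compared with a 128 kbps MP3.
cd_kbps = pcm_bitrate_kbps(44_100, 16, 2)
ratio = cd_kbps / 128.0
print(f"CD PCM: {cd_kbps:.1f} kbps, ratio to 128 kbps: {ratio:.1f}:1")

# The top of the audible range lands near 24-25 Bark.
print(f"20 kHz is about {hz_to_bark(20_000):.1f} Bark")

# Assumed example: a 30 dB tone whose threshold has been raised to 40 dB.
print(f"SMR: {smr_db(30.0, 40.0):.0f} dB (negative, so the tone is masked)")
```

Tones at 1.0 kHz and 1.1 kHz come out less than one Bark apart with this formula, which is why a loud tone at one so easily masks a quiet tone at the other.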
How it works
The codec splits the audio into many narrow frequency bands (critical bands / sub-bands).
It measures the loudest energy (the masker) in each band frame by frame.
It calculates the masking threshold curve: the level below which sound is inaudible.
Anything sitting under that curve is flagged as masked and gets few or zero bits.
It allocates the saved bits to audible parts and pushes quantisation noise under the curve.
Result: a smaller file with no audible difference at adequate bitrates, because the deleted sound was already inaudible.
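The threshold, flagging and bit-allocation steps above can be sketched as a toy model. This is not how any real codec is implemented: it assumes the band levels have already been measured, that each band is one Bark wide, and that masking falls off in straight lines. The slopes, the 14 dB offset and the flat threshold in quiet are illustrative assumptions, not values from a standard.

```python
import math

# Assumed, illustrative constants (not taken from any codec standard).
LOWER_SLOPE_DB_PER_BARK = 27.0  # masking falls off steeply below the masker
UPPER_SLOPE_DB_PER_BARK = 10.0  # and more gently above it (upward spread)
MASKER_OFFSET_DB = 14.0         # threshold starts this far under the masker level
QUIET_THRESHOLD_DB = 0.0        # flat stand-in for the threshold in quiet
DB_PER_BIT = 6.02               # each extra bit lowers quantisation noise by about 6 dB


def masking_threshold(band_levels_db):
    """Per-band threshold: the strongest masking reaching it from any other band."""
    thresholds = []
    for target in range(len(band_levels_db)):
        best = QUIET_THRESHOLD_DB
        for masker, level in enumerate(band_levels_db):
            if masker == target:
                continue  # this toy model ignores masking within the same band
            distance = target - masker  # in Bark, because bands are one Bark wide
            slope = UPPER_SLOPE_DB_PER_BARK if distance > 0 else LOWER_SLOPE_DB_PER_BARK
            best = max(best, level - MASKER_OFFSET_DB - slope * abs(distance))
        thresholds.append(best)
    return thresholds


def allocate_bits(band_levels_db):
    """Give each band just enough bits to push quantisation noise under its threshold."""
    bits = []
    for level, threshold in zip(band_levels_db, masking_threshold(band_levels_db)):
        smr = level - threshold
        bits.append(math.ceil(smr / DB_PER_BIT) if smr > 0 else 0)
    return bits


levels = [20.0, 80.0, 40.0, 30.0, 10.0, 5.0, 50.0]  # hypothetical band levels in dB
for band, (level, bits) in enumerate(zip(levels, allocate_bits(levels))):
    status = "masked" if bits == 0 else "audible"
    print(f"band {band}: {level:5.1f} dB -> {bits:2d} bits ({status})")
```

With these assumed numbers the 80 dB band swallows its neighbours, which receive zero bits, while the 50 dB band five Bark higher is far enough away to stay audible and keep a few bits.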
Real examples
Loud kick drum buries a quiet acoustic guitar in the same low-mid range, so the guitar detail is lost anyway.
Cymbal crash masks soft hi-hat ticks just before and after it (temporal masking).
A 1 kHz tone at 80 dB makes a 1.1 kHz tone at 40 dB completely inaudible.
MP3 deletes the masked tail of a snare reverb that the crash already hid.
Loud crowd noise at a gig hides a singer's quiet breath sounds in the mics.
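The cymbal and hi-hat example depends on timing, which can be sketched as a simple window check. This is a rough illustration only: the function name is hypothetical, the window lengths are the upper figures quoted in Key facts, the loud event is treated as an instantaneous transient, and real masking also depends on how loud each sound is.

```python
PRE_MASKING_MS = 5.0     # assumed upper end of the pre-masking window
POST_MASKING_MS = 200.0  # assumed upper end of the post-masking window


def temporal_masking_window(loud_onset_ms, quiet_onset_ms):
    """Classify where a quiet event falls relative to a loud transient."""
    gap = quiet_onset_ms - loud_onset_ms
    if gap == 0:
        return "simultaneous"
    if -PRE_MASKING_MS <= gap < 0:
        return "pre-masking"
    if 0 < gap <= POST_MASKING_MS:
        return "post-masking"
    return "outside"


# Hypothetical hi-hat ticks around a cymbal crash at t = 1000 ms.
for tick_ms in (980.0, 997.0, 1080.0, 1400.0):
    print(f"tick at {tick_ms:.0f} ms: {temporal_masking_window(1000.0, tick_ms)}")
```

A tick 3 ms before the crash or 80 ms after it falls inside a masking window; ticks 20 ms before or 400 ms after fall outside and would need their own bits.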
How it helps in live sound
Do not waste EQ/fader effort chasing quiet detail a louder source already masks (e.g. ride under a guitar wall).
Carve overlapping instruments with EQ so they sit in DIFFERENT critical bands and stop masking each other.
High-pass sources that do not need low end to clear the low-mid mud around 80-250 Hz, where kick and bass mask everything else.
Prefer lossless WAV/FLAC for show playback; reserve 256-320 kbps MP3/AAC for streaming only.
Gate or duck quiet sources that vanish under loud ones rather than fighting to make them heard.
Temporal masking hides reverb tails for roughly 50-200 ms after a loud transient, so trimming them inside that window is usually inaudible.
Everyday analogy
Just as a truck roaring past drowns out a whisper beside you, the model maps exactly which quiet sounds the loud ones swallow so it can bin them for free.
Watch out
Myth: lossy codecs just chop off high frequencies. Truth: they delete masked sound across all frequencies, guided by the masking model, and keep audible highs.
Fun fact
A loud sound can mask a quieter one that arrives just before it: pre-masking hides sounds up to about 5 ms BEFORE the loud event, because the auditory system processes louder sounds faster than quieter ones.
Key takeaways
Loud sounds hide nearby quieter sounds in pitch AND in time.
Masking raises the threshold of hearing; anything under it is deleted for free.
Critical bands (approx 24 Bark) are the ear's channels where masking bites hardest.
Post-masking lasts 50-200 ms; pre-masking only approx 2-5 ms.
Codecs hide quantisation noise under the masking curve so loss is inaudible.
On a gig, do not chase detail nobody can hear under a louder source.