8. Psychoacoustics (Perception Layer) · Concept 5 of 18
Frequency Masking
When a loud sound hides a quieter sound that sits at a nearby pitch.
A loud synth's masking curve (amber) swamps the quiet vocal (blue) just above it in pitch. Both excite the same patch of the cochlea, so the vocal stays inaudible until you EQ and pan them apart.
What it is
A loud sound makes a quieter sound at a nearby pitch impossible to hear, because both land on the same spot in your inner ear.
Key facts
Masking = a loud 'masker' raises the hearing threshold for a quieter 'maskee' at a similar frequency, so the quiet one vanishes.
The cochlea is tonotopic: ~3,500 inner hair cells along a 35 mm spiral, high pitch at the base, low pitch at the apex. Same patch = same fight.
Hearing splits 20 Hz to 20 kHz into ~24 overlapping critical bands (Bark scale). Two sounds inside ONE band mask each other hardest. Band width ~100 Hz below 500 Hz, then ~15 to 20 percent of centre frequency higher up.
Masking is ASYMMETRIC: bass masks treble far more than the reverse. 'Upward spread of masking' grows with masker level (the curve smears UP in frequency).
A 60 dB SPL masker can lift the threshold of a tone an octave above by 20 to 40 dB; a tone an octave below is barely touched.
Temporal masking too: forward masking lasts ~100 to 200 ms after the masker stops; backward masking ~5 to 20 ms before it starts.
Near-identical tones (only a few Hz apart, well inside one critical band) don't just mask, they BEAT: beat rate (Hz) = |f1 - f2|. 440 Hz vs 443 Hz = 3 beats per second.
dB SPL = 20 x log10(P / P0), where P is measured sound pressure and P0 = 20 micropascals (0 dB hearing threshold at 1 kHz).
Loudness: +10 dB sounds roughly TWICE as loud; +6 dB doubles sound PRESSURE; +3 dB doubles POWER (-3 dB = half power, the half-power point).
Speed of sound ~343 m/s at 20 C. Wavelength (m) = 343 / frequency (Hz): 100 Hz = 3.43 m, 1 kHz = 0.34 m, 10 kHz = 34 mm.
MP3/AAC codecs exploit masking, deleting sound below the masking threshold to shrink files.
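The formulas in the key facts can be checked with a few lines of Python. This is a minimal sketch, not a measurement tool: the function names are made up for illustration, and the Bark and critical-bandwidth functions use the Zwicker and Terhardt curve-fit approximations, which are an assumption added here rather than something the card specifies.

```python
import math

P0 = 20e-6              # reference pressure in pascals (20 micropascals)
SPEED_OF_SOUND = 343.0  # m/s at roughly 20 C

def db_spl(pressure_pa):
    """Sound pressure level: 20 x log10(P / P0)."""
    return 20 * math.log10(pressure_pa / P0)

def beat_rate_hz(f1, f2):
    """Beat rate between two nearly identical tones: |f1 - f2|."""
    return abs(f1 - f2)

def wavelength_m(freq_hz):
    """Wavelength in metres for a given frequency."""
    return SPEED_OF_SOUND / freq_hz

def hz_to_bark(freq_hz):
    """Approximate position on the Bark scale (assumed curve fit)."""
    return 13 * math.atan(0.00076 * freq_hz) + 3.5 * math.atan((freq_hz / 7500) ** 2)

def critical_bandwidth_hz(freq_hz):
    """Approximate critical bandwidth in Hz (assumed curve fit)."""
    return 25 + 75 * (1 + 1.4 * (freq_hz / 1000) ** 2) ** 0.69

print(beat_rate_hz(440, 443))              # 440 Hz vs 443 Hz
print(round(wavelength_m(100), 2))         # metres at 100 Hz
print(round(db_spl(2.0) - db_spl(1.0), 2)) # dB gained by doubling pressure
print(round(critical_bandwidth_hz(100)))   # bandwidth near 100 Hz
print(round(hz_to_bark(20000), 1))         # Bark position of 20 kHz
```

With these approximations the bandwidth comes out near 100 Hz at low frequencies and 20 kHz lands at roughly 24 to 25 Bark, which lines up with the "~24 bands" figure.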
How it works
Pick the two clashing sources (e.g. lead vocal vs synth pad) and find the frequency range they share.
Decide who wins that range: the most important source keeps it, the other gives way.
EQ a complementary cut: scoop the loser 3 to 6 dB where the winner needs to live; optionally boost the winner there.
Pan the two sources apart so they no longer share the exact same patch of attention.
High-pass anything that doesn't need low end (vocals ~80 to 120 Hz) to free low-end headroom.
A/B in mono and at low volume: if a part disappears, masking still wins, so repeat.
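The first three steps above (find the shared range, pick a winner, cut the loser) can be sketched as a small planning helper. Everything here is hypothetical: the `Source` class, the `priority` field, the frequency ranges given to the pad, and the choice of the geometric centre as the EQ frequency are illustrative assumptions, not a prescribed method.

```python
import math
from dataclasses import dataclass

@dataclass
class Source:
    name: str
    low_hz: float
    high_hz: float
    priority: int  # higher number = more important in the mix (assumed convention)

def shared_range(a, b):
    """Return the (low, high) band both sources occupy, or None if they don't overlap."""
    low = max(a.low_hz, b.low_hz)
    high = min(a.high_hz, b.high_hz)
    return (low, high) if low < high else None

def plan_cut(a, b, cut_db=-3.0):
    """Suggest a complementary cut on the less important source."""
    band = shared_range(a, b)
    if band is None:
        return None  # no shared range, so nothing to carve
    winner, loser = (a, b) if a.priority >= b.priority else (b, a)
    centre = math.sqrt(band[0] * band[1])  # geometric centre of the shared band
    return {
        "winner": winner.name,
        "cut_on": loser.name,
        "band_hz": band,
        "centre_hz": round(centre),
        "cut_db": cut_db,
    }

# Illustrative ranges only: real sources need measuring with an analyser.
vocal = Source("lead vocal", 1000, 3000, priority=2)
pad = Source("synth pad", 200, 4000, priority=1)
plan = plan_cut(vocal, pad)
print(plan)
```

The result is only a starting point for the EQ. The remaining steps (panning, high-passing, and the mono A/B check) still have to be done by ear.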
Real examples
Kick drum (~60 Hz) and bass guitar (~80 to 250 Hz) smother each other until you EQ a hole in the bass for the kick.
Lead vocal vs a thick synth pad both at 1 to 3 kHz: the pad swallows vocal clarity until you carve it.
Acoustic guitar and piano both filling 200 Hz to 2 kHz midrange turn to mud in a busy mix.
Crash cymbal masking a hi-hat or shaker because both occupy 8 to 16 kHz.
Crowd noise (broadband) masking quiet speech, so you push the vocal +6 dB or more to punch through.
How it helps in live sound
Complementary EQ: cut the kick channel at ~250 Hz and the bass channel at ~60 to 80 Hz so each owns its band.
High-pass everything you can: vocals 80 to 120 Hz, guitars 100 Hz, cymbals 300 Hz, to clear masking clutter in the lows.
Pan competing mids apart (rhythm guitars hard L/R) so a centred vocal isn't masked.
Watch your RTA / spectrogram: stacked energy bumps in one band mean masking is happening live, so carve one source out of that band.
Tame sub-bass first: upward spread means too much low end will mask vocals and cymbals.
If two channels beat (chorus-y wobble), their pitches are only a few Hz apart: retune the source, because EQ won't fix it.
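The RTA tip above amounts to looking for bands where two channels both carry significant energy. A rough sketch of that check follows, assuming you have per-band levels for each channel as a dictionary of band centre (Hz) to level (dB). The function name, the floor value, and the example readings are all invented for illustration.

```python
def stacked_bands(levels_a, levels_b, floor_db=-24.0):
    """Return band centres (Hz) where BOTH channels sit above floor_db."""
    shared = levels_a.keys() & levels_b.keys()
    return sorted(
        band for band in shared
        if levels_a[band] > floor_db and levels_b[band] > floor_db
    )

# Made-up octave-band readings in dB relative to full scale.
kick = {63: -6, 125: -20, 250: -18, 4000: -40}
bass = {63: -12, 125: -8, 250: -10, 4000: -45}

print(stacked_bands(kick, bass))  # bands worth carving on one of the two channels
```

A flagged band is a candidate for masking, not proof of it, so the final call still comes from listening.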
Everyday analogy
A bright torch shone next to a dim torch makes the dim one invisible, even though it's still on, because your eye is overwhelmed by the bright one right beside it.
Watch out
Myth: 'just turn the masked instrument up'. Truth: louder often makes things worse, because the boosted part now masks whatever sits above it (upward spread of masking). You must CARVE its frequency space with EQ or pan it away, not just add gain.
Fun fact
Every MP3, AAC and streaming track is built on masking: the codec measures the masking threshold and throws away the sounds hiding underneath it, which is how a song shrinks 10x with no obvious loss.
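To make the codec idea concrete, here is a toy sketch of "throw away what sits under the masking threshold". It is not how MP3 or AAC actually compute their thresholds: the offset and the two slopes are illustrative numbers chosen only to show the asymmetric shape (steep below the masker, shallow above it), and the Bark conversion is an assumed curve fit.

```python
import math

def hz_to_bark(freq_hz):
    """Approximate Bark-scale position (assumed curve fit)."""
    return 13 * math.atan(0.00076 * freq_hz) + 3.5 * math.atan((freq_hz / 7500) ** 2)

def toy_threshold_db(masker_hz, masker_db, probe_hz,
                     offset_db=10.0, slope_below=27.0, slope_above=10.0):
    """Toy masked threshold at probe_hz; slopes are in dB per Bark (illustrative)."""
    distance = hz_to_bark(probe_hz) - hz_to_bark(masker_hz)
    slope = slope_above if distance >= 0 else slope_below
    return masker_db - offset_db - slope * abs(distance)

def keep_audible(components, masker):
    """Keep only (freq_hz, level_db) components above the toy threshold."""
    masker_hz, masker_db = masker
    return [
        (freq, level) for freq, level in components
        if level > toy_threshold_db(masker_hz, masker_db, freq)
    ]

masker = (1000, 80)  # loud 1 kHz tone at 80 dB
quiet_parts = [(2000, 20), (500, 20), (1200, 40)]
print(keep_audible(quiet_parts, masker))
```

In this toy model the quiet tone an octave above the masker is discarded while the equally quiet tone an octave below survives, which mirrors the upward spread of masking described earlier.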
Key takeaways
Two sounds at nearby pitches fight; the louder one hides the quieter one.
They clash because they land on the same patch of the cochlea (same critical band, ~24 bands total).
Bass masks treble far more than the reverse: masking spreads UPWARD in frequency.
Fix it with complementary EQ (cut the loser where the winner lives) and panning, not just volume.
Near-identical pitches (a few Hz apart) beat instead of masking: beat rate = the frequency difference in Hz.
MP3/AAC compression is masking turned into a tool: inaudible masked sound is deleted to save space.