8. Psychoacoustics (Perception Layer) · Concept 8 of 18
Auditory Stream Segregation
Your brain splitting a sound stream into separate ongoing lines you can follow one at a time.
Your brain sorts one mixed sound into separate streams by pitch range - panning and EQ help each part stay its own clear line.
What it is
Your brain auto-sorting one tangled sound into separate ongoing "lines" you can follow one at a time.
Key facts
Term from Albert Bregman, whose 'Auditory Scene Analysis' (1990) is the founding text
Two modes: SEQUENTIAL streaming (notes into lines over time) vs SIMULTANEOUS grouping (frequencies fused into one source)
Pitch proximity is cue #1: notes close in pitch = ONE stream; far apart + faster tempo = SPLIT into two
Onset synchrony: partials starting within ~30 ms of each other fuse into ONE sound; stagger them and they split
Harmonicity: tones whose partials are whole multiples of one fundamental (f, 2f, 3f...) fuse; mistune one ~1-3% and it pops out
Spatial cues (panning, ITD/ILD) help grouping but are WEAK on their own - pitch and timbre usually win
ITD (Interaural Time Difference) max ~0.6 ms, dominates below ~1.5 kHz; ILD (level diff) dominates above ~1.5 kHz
Critical-band masking: sources within the same critical band (~1/3 octave) excite the same region of the cochlea and mask each other
+6 dB = double pressure (×2 amplitude); +10 dB = ~twice as loud; -3 dB = half power; -6 dB = half pressure
'Cocktail party effect' (Cherry, 1953) is streaming in action; stream segregation BUILDS UP over ~4-10 s of listening and resets after ~1-2 s of silence
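The decibel and ITD figures in the list above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are made up for this example, the loudness function is just the "+10 dB sounds about twice as loud" rule of thumb, and the ITD estimate uses the Woodworth spherical-head approximation with an assumed head radius (8.75 cm) and speed of sound (343 m/s).

```python
import math

def db_to_pressure_ratio(db):
    """Pressure (amplitude) ratio for a level change in dB: 10^(dB/20)."""
    return 10 ** (db / 20)

def db_to_power_ratio(db):
    """Power ratio for a level change in dB: 10^(dB/10)."""
    return 10 ** (db / 10)

def approx_loudness_ratio(db):
    """Rule of thumb only: perceived loudness roughly doubles per +10 dB."""
    return 2 ** (db / 10)

def woodworth_itd(azimuth_deg, head_radius_m=0.0875, speed_of_sound=343.0):
    """Woodworth spherical-head ITD estimate in seconds.

    head_radius_m and speed_of_sound are assumed typical values,
    not measurements.
    """
    theta = math.radians(azimuth_deg)
    return (head_radius_m / speed_of_sound) * (theta + math.sin(theta))

print(round(db_to_pressure_ratio(6), 2))    # about 2.0 (double pressure)
print(round(db_to_pressure_ratio(-6), 2))   # about 0.5 (half pressure)
print(round(db_to_power_ratio(-3), 2))      # about 0.5 (half power)
print(round(approx_loudness_ratio(10), 2))  # 2.0 (rule of thumb)
print(round(woodworth_itd(90) * 1000, 2))   # ms, source hard to one side
```

With these assumed values the hard-left/hard-right ITD lands a little above 0.6 ms, in line with the figure quoted in the list.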
How it works
Ear splits the incoming wave into frequency bands (cochlea = biological spectrum analyser).
Brain groups bands that share pitch range, start together, and move together in time.
Sounds close in pitch and rhythm get welded into ONE perceptual stream.
Sounds far apart in pitch/timbre split into SEPARATE parallel streams.
Attention spotlights ONE stream while the others fade to background.
Conflicting cues (pitch says fuse, space says split) get weighed - strongest cue wins.
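The "start together" and "shared fundamental" rules above can be sketched as a toy grouping test. This is not a model of the auditory system, only a way to see the two cues working together. The `Partial` class, the function names, the 30 ms onset window and the 2% mistuning tolerance are all assumptions chosen for illustration (the tolerance sits inside the ~1-3% range mentioned earlier).

```python
from dataclasses import dataclass

@dataclass
class Partial:
    freq_hz: float    # frequency of the partial
    onset_ms: float   # when it starts, relative to the source's onset

def is_harmonic_of(freq_hz, f0_hz, tolerance=0.02):
    """True if freq_hz lies within `tolerance` (as a fraction) of a
    whole multiple of f0_hz. The tolerance is an assumed value."""
    n = max(1, round(freq_hz / f0_hz))
    return abs(freq_hz - n * f0_hz) / (n * f0_hz) <= tolerance

def fuses(partial, f0_hz, onset_window_ms=30.0, tolerance=0.02):
    """Toy rule: a partial fuses with the source only if it starts
    inside the onset window AND sits on the harmonic series."""
    on_time = abs(partial.onset_ms) <= onset_window_ms
    return on_time and is_harmonic_of(partial.freq_hz, f0_hz, tolerance)

f0 = 200.0
partials = [
    Partial(200, 0), Partial(400, 5), Partial(600, 10),
    Partial(824, 0),    # 3% sharp of the 4th harmonic (800 Hz)
    Partial(1000, 60),  # in tune, but starts 60 ms late
]

fused = [p.freq_hz for p in partials if fuses(p, f0)]
popped_out = [p.freq_hz for p in partials if not fuses(p, f0)]
print(fused)       # [200, 400, 600]
print(popped_out)  # [824, 1000]
```

A real listener weighs these cues continuously rather than applying hard thresholds; the hard cut-offs here only make the idea easy to see.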
Real examples
Following the bass line under a full band while everything else carries on.
Picking out one conversation across a loud room (cocktail party effect).
Hearing a violin and flute on the same melody as TWO instruments, not one blended sound.
A fast piano run that 'splits' into two melodies when high and low notes pull apart in pitch.
Tracking the hi-hat pattern separately from kick and snare in a busy drum loop.
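The fast piano run example can be imitated with a toy greedy model: each note joins the stream whose last pitch is nearest, unless the jump is too large for the current tempo, in which case it starts a new stream. Everything here is a simplifying assumption for illustration, including the function name and the `semitones_per_100ms` parameter that ties the allowed pitch jump to the gap between notes. It is not a fitted perceptual model.

```python
def split_into_streams(midi_notes, ioi_ms, semitones_per_100ms=4.0):
    """Greedy toy model of sequential streaming.

    midi_notes: pitches as MIDI note numbers, in playing order.
    ioi_ms: time between successive note onsets (smaller = faster tempo).
    semitones_per_100ms: assumed scaling of the largest pitch jump that
        still holds a line together; faster playing allows smaller jumps.
    """
    max_jump = semitones_per_100ms * ioi_ms / 100.0
    streams = []
    for note in midi_notes:
        # Streams whose most recent note is close enough in pitch.
        candidates = [s for s in streams if abs(s[-1] - note) <= max_jump]
        if candidates:
            nearest = min(candidates, key=lambda s: abs(s[-1] - note))
            nearest.append(note)
        else:
            streams.append([note])
    return streams

run = [60, 72, 62, 74, 64, 76]   # low and high notes alternating

print(split_into_streams(run, ioi_ms=400))  # slow: one stream
print(split_into_streams(run, ioi_ms=100))  # fast: two streams
```

Played slowly, the octave leaps stay inside the allowed jump and the run is heard as one line; played fast, the same notes split into a low melody (60, 62, 64) and a high melody (72, 74, 76).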
How it helps in live sound
PAN parts apart: centre kick/bass/lead vocal, spread guitars/keys/BVs 30-100% L/R to help each hold its own stream (panning assists; it does not do the job alone).
EQ-carve pitch ranges: high-pass guitars ~120 Hz, dip ~300-500 Hz where they fight the bass.
Stagger transients - tighten/gate so two parts don't start in the same ~30 ms window and fuse into mud.
Give each instrument a unique timbre zone (boost its signature band) so the brain can tag it.
Reverb/delay pushes a part to its OWN background stream - keep the lead vocal dry and up front.
Watch the same 1/3-octave band: two sources there mask each other - shift one in pitch (re-voice or EQ) first; panning alone rarely breaks the clash.
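The 1/3-octave warning above can be turned into a quick check on a channel list. The sketch below flags any two sources whose dominant frequencies sit less than a third of an octave apart. The function name, the source names and the frequencies are hypothetical, and reducing each source to a single "dominant frequency" is a heavy simplification of a real spectrum.

```python
import math
from itertools import combinations

def find_band_clashes(dominant_hz, max_octaves=1 / 3):
    """Return pairs of sources whose dominant frequencies are closer
    than `max_octaves` apart (default: one third of an octave)."""
    clashes = []
    for (name_a, f_a), (name_b, f_b) in combinations(dominant_hz.items(), 2):
        distance_octaves = abs(math.log2(f_a / f_b))
        if distance_octaves < max_octaves:
            clashes.append((name_a, name_b))
    return clashes

# Hypothetical dominant frequencies for a small band.
mix = {"bass": 100, "kick": 90, "guitar": 400, "keys": 450, "vocal": 1200}

print(find_band_clashes(mix))  # [('bass', 'kick'), ('guitar', 'keys')]
```

Each flagged pair is a candidate for EQ carving or re-voicing; pairs that are not flagged may still overlap in their harmonics, so treat the result as a starting point, not a verdict.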
Everyday analogy
Like untangling several coloured threads from one knot - your ear grabs each colour (pitch range) and follows it across the whole song.
Watch out
Myth: panning alone separates everything. Reality: pitch and timbre cues beat space - two parts in the same frequency band still fuse and mask even hard-panned apart, so EQ-separate them too.
Fun fact
Stream segregation 'builds up' - the longer you listen the more clearly two streams split, but just 1-2 seconds of silence resets your brain back to hearing them as one.
Key takeaways
Your brain splits one sound wave into separate followable lines - automatically.
Pitch proximity + shared rhythm = fuse into ONE stream; far apart = SPLIT into two.
Onset timing (~30 ms) and harmonicity decide if frequencies fuse into one source.
Pitch and timbre beat spatial cues - panning helps but EQ separation matters more.
In a mix, use pan + EQ + timing + reverb to give every part its OWN clear stream.