8. Psychoacoustics (Perception Layer) · Concept 12 of 18
Temporal Theory of Pitch
The idea that pitch is decided by how fast the sound's vibration repeats over time.
Auditory nerve fibres fire in step with the wave cycles, and the brain reads the spike timing (period = 1/f) as pitch. This timing code dominates below ~1 kHz, where place theory struggles.
What it is
Pitch worked out by how fast a sound's wave repeats over time, read from the timing of nerve firings.
Key facts
Pitch = perceived highness/lowness; for periodic sound it tracks the repetition rate of the waveform in Hz (cycles per second).
Human hearing range: roughly 20 Hz to 20,000 Hz; the top end falls with age (often ~15-16 kHz by middle age).
Temporal theory: auditory nerve fibres PHASE-LOCK, firing at a consistent point on each wave cycle, so spike timing encodes the period.
Phase-locking is reliable below ~1,000-1,500 Hz, degrades up to ~4,000-5,000 Hz, and is essentially gone above ~5 kHz.
A single nerve fibre sustains only ~200-500 spikes/sec (its ~1 ms refractory period caps it near 1,000/sec at the very best), too slow to fire once per cycle at high pitch.
Volley principle (Wever and Bray, 1930s; developed further in Wever, 1949): groups of fibres take turns firing on different cycles; combined volleys reconstruct rates of a few kHz.
Period T (seconds) = 1 / frequency (Hz). 100 Hz -> 10 ms between spikes; 1,000 Hz -> 1 ms.
Missing fundamental: brain hears the low pitch from harmonic spacing even with no fundamental, e.g. 200+300+400 Hz heard as 100 Hz.
Place theory (Helmholtz/Bekesy) maps frequency to a spot on the basilar membrane: high freqs at the base, low freqs at the apex.
Duplex theory: TEMPORAL coding dominates below ~1-2 kHz, PLACE coding above ~4-5 kHz; in between, both contribute.
Doubling frequency raises pitch by one octave (A4 440 Hz -> A5 880 Hz).
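The arithmetic behind these facts can be sketched in a few lines of Python. The function names are illustrative, not from any library, and the greatest-common-divisor step is only a shortcut for exact whole-number harmonics; it is not a model of what the auditory system does.

```python
from functools import reduce
from math import gcd


def period_ms(frequency_hz):
    """Period in milliseconds: T = 1 / f, scaled from seconds to ms."""
    return 1000.0 / frequency_hz


def implied_fundamental_hz(harmonics_hz):
    """Common spacing of whole-number harmonic frequencies (their GCD)."""
    return reduce(gcd, harmonics_hz)


def octave_up(frequency_hz, octaves=1):
    """Each octave up doubles the frequency."""
    return frequency_hz * 2 ** octaves


print(period_ms(100), period_ms(1000))          # spike spacing for 100 Hz and 1 kHz
print(implied_fundamental_hz([200, 300, 400]))  # pitch implied by the harmonic spacing
print(octave_up(440))                           # A4 -> A5
```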
How it works
A periodic sound enters the ear and vibrates the basilar membrane.
Hair cells convert that vibration into electrical spikes in auditory nerve fibres.
Each fibre phase-locks, firing at the same phase of every wave cycle (or every few cycles).
The gap between spikes equals the wave's period (1 / frequency), or a whole-number multiple of it when a fibre skips cycles.
Many fibres fire in volleys, taking turns so the combined timing represents fast rates.
The brain measures that inter-spike timing pattern and reads it out as the pitch.
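The steps above can be sketched as a toy simulation. It assumes perfectly idealised fibres (no jitter, no missed spikes) that each fire on every n-th cycle, staggered so that together they cover every cycle; all names and the 2 kHz / 5-fibre values are illustrative choices, not measured data.

```python
def fibre_spike_times(frequency_hz, n_fibres, fibre_index, duration_s=0.02):
    """Spike times (s) for one idealised fibre firing on every n_fibres-th cycle."""
    period_s = 1.0 / frequency_hz
    n_cycles = int(duration_s * frequency_hz)
    return [cycle * period_s for cycle in range(fibre_index, n_cycles, n_fibres)]


def pooled_volley(frequency_hz, n_fibres, duration_s=0.02):
    """Merge the staggered spike trains of all fibres into one sorted train."""
    spikes = []
    for index in range(n_fibres):
        spikes.extend(fibre_spike_times(frequency_hz, n_fibres, index, duration_s))
    return sorted(spikes)


def rate_from_intervals(spike_times):
    """Read a rate (Hz) out of the mean gap between successive spikes."""
    gaps = [later - earlier for earlier, later in zip(spike_times, spike_times[1:])]
    return 1.0 / (sum(gaps) / len(gaps))


tone_hz = 2000.0  # too fast for one fibre to follow cycle by cycle
fibres = 5        # each fibre then fires at 2000 / 5 = 400 spikes/s

single_rate = rate_from_intervals(fibre_spike_times(tone_hz, fibres, 0))
pooled_rate = rate_from_intervals(pooled_volley(tone_hz, fibres))
print(round(single_rate), round(pooled_rate))
```

One fibre on its own reports only its own firing rate; the pooled train recovers the tone's repetition rate, which is the point of the volley principle.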
Real examples
A 100 Hz bass note: nerves fire every 10 ms, an easy job for temporal coding to lock onto.
Telephone/cheap speaker with no real bass still lets you hear a low voice pitch via the missing-fundamental effect.
A male voice fundamental ~100-120 Hz sits squarely in the phase-locking zone, so pitch is rock solid.
A 5 kHz cymbal shimmer is too fast to phase-lock, so the ear leans on place coding instead.
Tuning a bass guitar by ear works because its open strings (~41-98 Hz) and their lower harmonics all fall inside the phase-locking range, where temporal coding gives a clear pitch.
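As a rough way to sort examples like these, here is a minimal sketch that labels a frequency by the coding regime described in this card. The two boundary constants are assumptions taken from the approximate figures above (~1.5 kHz and ~5 kHz); real hearing has no sharp cut-offs.

```python
TEMPORAL_LIMIT_HZ = 1500.0  # assumed upper edge of reliable phase-locking
PLACE_LIMIT_HZ = 5000.0     # assumed point where phase-locking is essentially gone


def dominant_pitch_code(frequency_hz):
    """Label a frequency with the coding regime this card attributes to it."""
    if frequency_hz < TEMPORAL_LIMIT_HZ:
        return "temporal"
    if frequency_hz < PLACE_LIMIT_HZ:
        return "mixed"
    return "place"


for label, hz in [("bass note", 100), ("male voice", 110), ("cymbal shimmer", 5000)]:
    print(label, hz, dominant_pitch_code(hz))
```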
How it helps in live sound
Low end (sub/kick 30-120 Hz) is pitched by timing cues, so tight phase alignment between subs and tops through the crossover matters more than raw level for clean pitch.
Use delay/alignment (measure with Smaart or another phase trace) so subs and tops arrive in phase through the crossover region; otherwise cancellation there thins or smears the bass note.
Beware comb filtering: two arrivals a few ms apart cancel at predictable frequencies (5 ms apart puts the first null at 100 Hz) and muddy low-frequency pitch; check arrival times, not just EQ.
Muddy bass guitar is often a fundamental masked by room buildup; cut 200-400 Hz rather than boosting the (often missing) fundamental.
Small PA with no sub still conveys bass-line pitch via missing fundamental, so don't over-boost lows hunting for the note.
Content above ~5 kHz is heard as tone colour (timbre), not pitch timing, so brightness/air EQ won't fix a wrong-sounding bass note.
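To put numbers on the arrival-time advice, here is a minimal sketch assuming two equal-level copies of the same signal and a single fixed delay between them (real rooms and crossovers are messier). The function names and the 450 Hz ceiling are illustrative.

```python
def comb_null_frequencies(delay_ms, max_hz=450.0):
    """Cancellation frequencies for two equal copies delay_ms apart.

    Nulls fall where the delay equals an odd number of half-periods:
    f = (2k + 1) / (2 * delay).
    """
    nulls = []
    k = 0
    while True:
        null_hz = (2 * k + 1) * 1000.0 / (2.0 * delay_ms)
        if null_hz > max_hz:
            return nulls
        nulls.append(null_hz)
        k += 1


def phase_offset_degrees(delay_ms, frequency_hz):
    """Phase lag that a given delay produces at one frequency, wrapped to 0-360."""
    return (360.0 * frequency_hz * delay_ms / 1000.0) % 360.0


print(comb_null_frequencies(5.0))       # nulls below 450 Hz for a 5 ms offset
print(phase_offset_degrees(5.0, 100.0))  # how far out of phase 100 Hz arrives
```

A 5 ms offset, roughly 1.7 m of extra path, puts a null right on a 100 Hz fundamental, which is why delay is the fix rather than EQ.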
Everyday analogy
It's like clapping in time with a drummer: the speed of your claps mirrors the beat, and from that rhythm of taps you instantly know how fast the music is going.
Watch out
Myth: pitch is purely about which spot on the cochlea lights up (place theory). Correction: for low frequencies the brain mainly reads the TIMING of nerve firings (temporal coding), which is why place theory alone fails below ~1 kHz.
Fun fact
You can hear a bass note that physically isn't there: play only the 200, 300 and 400 Hz harmonics and your brain manufactures a 100 Hz pitch from their spacing, the 'missing fundamental'.
Key takeaways
Temporal theory: pitch = repetition rate of the wave, encoded by spike timing.
Phase-locking works below ~1 kHz, fades by ~5 kHz; place theory takes over up high.
Volley principle lets teams of nerves fire in turns to track rates above one fibre's limit.
Period T = 1 / frequency; 100 Hz = 10 ms between spikes.
Missing fundamental shows the brain derives pitch from the timing/spacing of the harmonics, not just from the lowest tone present.
In live sound, low-end pitch clarity depends on phase/time alignment, not just EQ.