5. Information Theory (The Deep Root) · Concept 2 of 6
Entropy
It is a measure of how surprising or unpredictable a sound or message is, which tells you how much real information is packed in it.
A tone that just repeats carries ~0 bits of new information; a chaotic live mix surprises you on every sample, so its entropy is high and it needs more data.
What it is
Entropy measures how unpredictable a signal is — more surprise means more real information, so more bits needed to store or send it.
Key facts
Shannon entropy formula: H = -Σ p(x) log₂ p(x), where H = average bits per symbol, p(x) = probability of each possible value x, Σ = sum over all values, log₂ = log base 2.
Unit = bits per symbol (log base 2). 1 bit = the info in one fair coin flip (50/50).
Claude Shannon introduced it in his 1948 paper 'A Mathematical Theory of Communication', the founding paper of information theory.
Maximum entropy = every outcome equally likely: a fair coin gives 1.0 bit per flip; a steady tone gives ≈ 0 bits (fully predictable).
Source coding theorem: you CANNOT losslessly compress below H bits per symbol on average — entropy is the hard floor.
CD audio = 16 bits × 44,100 samples/sec × 2 channels = 1,411,200 bits/sec (1.41 Mbit/s) raw PCM.
MP3 320 kbit/s and AAC 256 kbit/s are LOSSY: they discard detail your ear can't easily hear, which is how they get below the lossless floor; FLAC is lossless and typically shrinks music by roughly 40-60%.
Pure sine tone → almost no new information per sample once you know the previous samples (entropy rate near 0); white noise = maximum entropy (every sample a surprise, ~0 redundancy, barely compresses).
Redundancy = 1 − (H / H_max), where H_max is the entropy if every value were equally likely. High redundancy = compressible.
Nyquist–Shannon: sample rate must be more than 2× the highest frequency (44.1 kHz captures up to just under 22.05 kHz).
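The entropy and redundancy formulas above can be tried out in a few lines of Python. This is a minimal sketch, not a reference implementation: the function names `shannon_entropy` and `redundancy` and the 90/10 biased coin are illustrative assumptions, not part of the original notes.

```python
import math

def shannon_entropy(probs):
    """Average bits per symbol for a list of probabilities that sum to 1."""
    # Outcomes with p = 0 never occur, so they add nothing to the sum.
    return sum(p * math.log2(1 / p) for p in probs if p > 0)

def redundancy(probs):
    """1 - H / H_max, with H_max = log2(number of possible values).

    Assumes at least two possible values (otherwise H_max would be 0).
    """
    h_max = math.log2(len(probs))
    return 1 - shannon_entropy(probs) / h_max

fair_coin = [0.5, 0.5]
biased_coin = [0.9, 0.1]   # hypothetical example: heads 90% of the time

print(shannon_entropy(fair_coin))              # 1.0 bit per flip
print(round(shannon_entropy(biased_coin), 3))  # 0.469 bits per flip
print(round(redundancy(biased_coin), 3))       # 0.531

# CD audio raw bit rate: bits per sample x samples per second x channels
cd_bits_per_second = 16 * 44_100 * 2
print(cd_bits_per_second)                      # 1411200
```

The biased coin is more predictable than the fair one, so it carries fewer bits per flip and has more redundancy for a compressor to remove.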
How it works
List every possible outcome of the signal and its probability p(x).
Surprise of one outcome = log₂(1/p) bits — rare events carry more bits.
Weight each surprise by how often it happens (multiply by p).
Add them all up: that average is the entropy H, in bits per symbol.
High H = noisy/complex/unpredictable; low H = repetitive/predictable.
Compression aims at H; you can't go under it without losing data.
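The steps above can be followed one at a time on a short message. The sketch below is only an illustration: the helper name `entropy_steps` and the message "AAAAAAAB" are made up for this example, and the probabilities are simply estimated by counting how often each symbol appears.

```python
import math
from collections import Counter

def entropy_steps(symbols):
    """Return (table, H); each table row is (symbol, p, surprise_bits, weighted)."""
    counts = Counter(symbols)
    total = len(symbols)
    table = []
    for symbol, count in sorted(counts.items()):
        p = count / total             # step 1: probability of this outcome
        surprise = math.log2(1 / p)   # step 2: surprise of this outcome, in bits
        weighted = p * surprise       # step 3: weight by how often it happens
        table.append((symbol, p, surprise, weighted))
    h = sum(row[3] for row in table)  # step 4: add up the weighted surprises
    return table, h

table, h = entropy_steps("AAAAAAAB")
for symbol, p, surprise, weighted in table:
    print(f"{symbol}: p={p:.3f}  surprise={surprise:.3f} bits  weighted={weighted:.3f}")
print(f"H = {h:.3f} bits per symbol")
```

The rare "B" carries 3 bits of surprise but only shows up one time in eight, so the average works out to about 0.544 bits per symbol. Note that counting symbols like this ignores their order: a sine wave's sample values look varied in a histogram even though each sample is predictable from the ones before it, which is why real codecs also model how samples follow one another.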
Real examples
Steady 1 kHz test tone: predictable, ~0 bits of new information per sample, compresses to almost nothing.
Solo acoustic guitar: moderate entropy, FLAC squeezes it well.
Full live band + crowd + reverb: high entropy, needs high bitrate or detail is lost.
White-noise hiss / rain: near-max entropy, barely compressible losslessly, so the file stays close to its raw size.
A muted mic (silence): basically zero entropy, codecs store it in a few bytes.
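A rough way to see these examples in action is to build one second of silence, a 1 kHz tone, and white noise, then run each through a lossless compressor. The sketch below makes several assumptions: it uses Python's general-purpose `zlib` as a stand-in for an audio codec such as FLAC (real codecs model audio far better), a 48 kHz mono 16-bit format, and helper names (`to_pcm16`, `compressed_fraction`) invented for this example.

```python
import math
import random
import struct
import zlib

SAMPLES_PER_CYCLE = 48   # a 1 kHz tone at 48 kHz repeats every 48 samples
NUM_SAMPLES = 48_000     # one second of mono audio at 48 kHz

def to_pcm16(samples):
    """Pack floats in [-1, 1] as 16-bit little-endian PCM bytes."""
    ints = [int(round(s * 32767)) for s in samples]
    return struct.pack(f"<{len(ints)}h", *ints)

def compressed_fraction(pcm_bytes):
    """Compressed size divided by raw size, using zlib at its highest level."""
    return len(zlib.compress(pcm_bytes, 9)) / len(pcm_bytes)

rng = random.Random(0)   # fixed seed so the run is repeatable

silence = [0.0] * NUM_SAMPLES
one_cycle = [math.sin(2 * math.pi * n / SAMPLES_PER_CYCLE)
             for n in range(SAMPLES_PER_CYCLE)]
tone = one_cycle * (NUM_SAMPLES // SAMPLES_PER_CYCLE)   # same cycle, 1000 times
noise = [rng.uniform(-1.0, 1.0) for _ in range(NUM_SAMPLES)]

for name, samples in [("silence", silence), ("1 kHz tone", tone), ("white noise", noise)]:
    fraction = compressed_fraction(to_pcm16(samples))
    print(f"{name:12s} compresses to {fraction:.1%} of raw size")
```

Silence and the repeating tone should shrink to a tiny fraction of their raw size, while the noise should stay at roughly 100%, because there is no pattern for the compressor to exploit.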
How it helps in live sound
Record live multitrack as 24-bit/48 kHz WAV (uncompressed PCM): a dense, high-entropy gig loses detail in a lossy format, so keep the masters lossless.
For streaming a busy show, push bitrate up: AAC 256-320 kbit/s, not 128, or transients and cymbals smear.
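Dante/AES67 networks carry uncompressed PCM (24 bits × 48,000 samples/s ≈ 1.15 Mbit/s of audio payload per channel, before packet overhead); size your switch and bandwidth for it.
Gate or mute dead channels: silence is ~0 entropy, so compressed formats store it in almost no space. Uncompressed WAV still writes silence at the full rate, so disarm unused tracks if you need to save recorder space.
If a stream sounds smeared or warbly, the encoder bitrate is below what the mix needs: raise the bitrate or simplify the mix. If it buffers or drops out, that is a network throughput problem, and a higher bitrate will make it worse.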
Noise (hiss, hum, crowd) is high-entropy and eats bitrate for nothing — clean the source so bits go to music.
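The bandwidth and storage figures behind these tips are plain multiplication, sketched below. The 32-channel count is a hypothetical example, the function name `pcm_bits_per_second` is invented for this sketch, and the numbers cover audio payload only: real network streams add packet overhead and WAV files add a small header.

```python
def pcm_bits_per_second(bit_depth, sample_rate_hz, channels=1):
    """Raw PCM payload rate; excludes network packet or file-header overhead."""
    return bit_depth * sample_rate_hz * channels

per_channel = pcm_bits_per_second(24, 48_000)        # 1,152,000 bit/s
network_32ch = pcm_bits_per_second(24, 48_000, 32)   # 36,864,000 bit/s

# Disk space for one hour of one 24-bit/48 kHz track, audio data only
bytes_per_track_hour = per_channel // 8 * 3600       # 518,400,000 bytes

print(f"one channel:    {per_channel / 1e6:.3f} Mbit/s")
print(f"32 channels:    {network_32ch / 1e6:.3f} Mbit/s")
print(f"one track-hour: {bytes_per_track_hour / 1e6:.1f} MB")
```

At these rates a 32-channel show needs about 37 Mbit/s of payload on the network and roughly half a gigabyte of disk per track per hour, before any overhead.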
Everyday analogy
It is like packing a suitcase: a tone is the same sock 100 times so it crushes flat, but a live band is 100 different shaped items so it needs a much bigger bag.
Watch out
Myth: 'a louder or fuller-sounding signal has more entropy.' Wrong — entropy is about UNPREDICTABILITY, not loudness; a deafening steady tone has almost zero entropy.
Fun fact
White noise, the most chaotic sound, has the HIGHEST entropy of all, so it is mathematically the hardest audio to compress; that is why a losslessly compressed hiss file can be bigger than a song of the same length.
Key takeaways
Entropy = average surprise = bits of real information per symbol.
Formula: H = -Σ p(x) log₂ p(x), measured in bits.
Predictable tone ≈ 0 bits; chaotic noise = maximum bits.
You can't losslessly compress below H — it's the hard floor.
Complex live mixes need higher bitrate or detail gets binned.
Shannon, 1948 — the deep root of all digital audio.