The Pigeonhole Principle
Master the pigeonhole principle for existence proofs, worst-case guarantees, and surprising mathematical results.
Introduction
The Pigeonhole Principle is deceptively simple: if you have more pigeons than pigeonholes, at least one hole must contain multiple pigeons. Yet this principle leads to profound existence proofs in mathematics, computer science, and beyond.
Learning Objectives:
- Master the basic and generalized Pigeonhole Principle
- Apply it to existence proofs and impossibility results
- Understand the Birthday Paradox through pigeonhole reasoning
- Solve real-world problems using pigeonhole arguments
The Basic Pigeonhole Principle
If objects are placed into containers (where ), then at least one container must contain more than one object.
Mathematical statement: If , then .
Proof: By contradiction. If every container had at most 1 object, we'd have at most objects total, contradicting .
Example 1: Socks in a Drawer
You have 10 pairs of socks (20 socks total) in 5 different colors (4 socks of each color). In a dark room, how many socks must you grab to guarantee a matching pair?
Solution:
- Pigeons: socks grabbed
- Holes: colors (5)
- To guarantee a match, we need at least one color to have 2 socks
If we grab 6 socks, by the Pigeonhole Principle, at least one color must have socks.
Answer: 6 socks guarantees a matching pair.
import random
from collections import Counter
def simulate_sock_draw(n_colors=5, socks_per_color=4, draws=1000):
"""Simulate drawing socks until a pair is found"""
results = []
for _ in range(draws):
# Create drawer with socks
drawer = []
for color in range(n_colors):
drawer.extend([color] * socks_per_color)
# Draw socks one by one until a pair
random.shuffle(drawer)
drawn = []
for i, sock in enumerate(drawer, 1):
drawn.append(sock)
if Counter(drawn).most_common(1)[0][1] >= 2:
results.append(i)
break
return results
# Run simulation
results = simulate_sock_draw()
print(f"Simulating sock drawing ({len(results)} trials)")
print(f"\nSocks drawn until first pair:")
print(f" Minimum: {min(results)}")
print(f" Maximum: {max(results)}")
print(f" Average: {sum(results)/len(results):.2f}")
# Count distribution
counter = Counter(results)
print(f"\nDistribution:")
for socks, count in sorted(counter.items()):
pct = count / len(results) * 100
print(f" {socks} socks: {count:3d} times ({pct:5.1f}%)")
print(f"\nPigeonhole guarantee: 6 socks (5 colors + 1)")
print(f"Trials reaching 6: {sum(1 for r in results if r == 6):3d} ({sum(1 for r in results if r == 6)/len(results)*100:.1f}%)")Key Insight: The Pigeonhole Principle gives a worst-case guarantee. You might get lucky (matching pair in 2 draws), but you're guaranteed a match in 6. This is crucial for algorithm analysis and worst-case bounds.
The Generalized Pigeonhole Principle
If objects are placed into containers, then at least one container contains at least objects.
Proof: Suppose each container has at most objects. Then total objects would be at most:
This contradicts having objects.
Example 2: Class Birthdays
In a class of 400 students, prove that at least two students share the same birthday (ignoring leap years).
Solution:
- Pigeons: 400 students
- Holes: 365 possible birthdays
- By generalized pigeonhole:
At least one day has students born on it.
import math
def min_objects_for_guarantee(n_containers, min_per_container):
"""Calculate minimum objects to guarantee min_per_container in at least one container"""
# We need n > (min_per_container - 1) * n_containers
return (min_per_container - 1) * n_containers + 1
def max_objects_per_container(n_objects, n_containers):
"""Calculate guaranteed maximum objects in fullest container"""
return math.ceil(n_objects / n_containers)
# Birthday problem
students = 400
days = 365
guarantee = max_objects_per_container(students, days)
print(f"With {students} students and {days} possible birthdays:")
print(f" At least one day has β₯ {guarantee} students")
print(f" Calculation: β{students}/{days}β = β{students/days:.3f}β = {guarantee}")
# Other examples
print(f"\nMore examples:")
examples = [
(100, 12, "100 people, 12 months"),
(53, 52, "53 cards, 52 types"),
(27, 26, "27 students, 26 letters"),
(1000, 365, "1000 people, 365 days")
]
for objects, containers, desc in examples:
result = max_objects_per_container(objects, containers)
print(f" {desc}: at least {result} in one container")Birthday Paradox Preview
The Birthday Paradox asks a different question: In a room of people, what's the probability that at least two share a birthday?
Surprising result: With just 23 people, the probability exceeds 50%!
This is not the same as the pigeonhole principle (which guarantees a match at 366 people), but pigeonhole reasoning helps us understand why collisions happen so quickly.
import math
def birthday_probability(n_people, n_days=365):
"""Calculate probability that at least 2 people share a birthday"""
if n_people > n_days:
return 1.0 # Pigeonhole guarantee
# Probability all different = (365/365) * (364/365) * ... * ((365-n+1)/365)
prob_all_different = 1.0
for i in range(n_people):
prob_all_different *= (n_days - i) / n_days
# Probability at least one match = 1 - P(all different)
return 1 - prob_all_different
# Find when probability exceeds 50%
print("Birthday Paradox: Probability of shared birthday\n")
print(f"{'People':>7} {'Probability':>12} {'Visual':>20}")
print("-" * 45)
for n in [5, 10, 20, 23, 30, 40, 50, 60, 100, 366]:
prob = birthday_probability(n)
bar = "β" * int(prob * 20)
print(f"{n:>7} {prob:>11.1%} {bar:<20}")
print(f"\nKey insight: Only 23 people needed for >50% chance!")
print(f"But pigeonhole guarantees match at 366 people (worst case)")Common Confusion: The pigeonhole principle gives guarantees (worst case), while probability gives expected behavior (average case). With 23 people, you're likely to see a match, but it's not guaranteed. At 366, it's guaranteed.
Applications and Existence Proofs
Example 3: Substring Existence
In any string of 11 binary digits (0s and 1s), prove that some substring of length 4 must repeat.
Solution:
- Total 4-digit substrings: positions 1-4, 2-5, 3-6, ..., 8-11 = 8 substrings
- Possible 4-digit patterns: patterns
Wait, we have fewer substrings (8) than patterns (16), so pigeonhole doesn't immediately apply!
Better approach: Consider a string of length 11. There are possible strings. But if all 8 substrings were distinct, we'd only use 8 of 16 possible patterns. However, the argument needs refinement.
Correct approach: Actually, in a string of length , some -substring must repeat. For : . So in a string of length 20, some 4-substring repeats.
Let me recalculate: In length 11, we have 8 substrings and 16 patterns, so no guarantee. Let's use a cleaner example.
Example 4: Sequence Monotonicity
In any sequence of distinct numbers, there exists either:
- An increasing subsequence of length , OR
- A decreasing subsequence of length
This is the ErdΕsβSzekeres theorem, provable by pigeonhole reasoning!
def longest_increasing_subsequence(seq):
"""Find longest increasing subsequence length (dynamic programming)"""
if not seq:
return 0
n = len(seq)
dp = [1] * n
for i in range(1, n):
for j in range(i):
if seq[j] < seq[i]:
dp[i] = max(dp[i], dp[j] + 1)
return max(dp)
def longest_decreasing_subsequence(seq):
"""Find longest decreasing subsequence length"""
return longest_increasing_subsequence([-x for x in seq])
# Test ErdΕsβSzekeres theorem
import random
n = 5 # We expect increasing or decreasing subsequence of length 6
seq_length = n**2 + 1 # 26 numbers
print(f"ErdΕsβSzekeres theorem: n = {n}, sequence length = {seq_length}")
print(f"Guarantee: increasing OR decreasing subsequence of length β₯ {n+1}\n")
# Generate random sequence
random.seed(42)
seq = random.sample(range(1, 200), seq_length)
lis = longest_increasing_subsequence(seq)
lds = longest_decreasing_subsequence(seq)
print(f"Random sequence (first 15): {seq[:15]}...")
print(f"\nLongest increasing subsequence: {lis}")
print(f"Longest decreasing subsequence: {lds}")
print(f"\nMax of both: {max(lis, lds)} (guaranteed β₯ {n+1})")
print(f"Theorem satisfied: {max(lis, lds) >= n+1}")Key Takeaways
- Basic Pigeonhole: objects in holes () β at least one hole has objects
- Generalized: At least one hole has objects
- Existence proofs: Powerful for showing something must exist without finding it
- Worst-case reasoning: Provides guarantees, not probabilities
- Birthday paradox: Pigeonhole guarantees at 366, but probability high at 23
Next Lesson: Advanced counting techniques, symmetry arguments, and problem-solving strategies!
Problem | Pigeons | Holes | Guarantee |
|---|---|---|---|
| Socks | 6 socks | 5 colors | 2 matching |
| Birthdays | 400 people | 365 days | 2 same day |
| Cards | 53 cards | 52 ranks | 2 same rank |
| Hair count | 8 billion people | 200k hairs | 40k+ same count |