Reading Statistical Claims

Develop critical thinking skills to evaluate statistical claims: spotting manipulation, understanding study design, relative vs absolute risk, and asking the right questions.

22 min read
Intermediate

Statistical Literacy as Self-Defense

Every day you're bombarded with statistical claims:

  • "Studies show..."
  • "Experts say..."
  • "X is linked to Y..."
  • "9 out of 10 doctors recommend..."

Most people accept these claims uncritically. Statistical literacy is your defense against manipulation, misinformation, and your own biases.

This lesson teaches you to read between the lines and ask the right questions.

The Anatomy of a Statistical Claim

Every statistical claim has hidden structure. Learn to spot:

1. The claim itself
What exactly are they saying?

2. The source
Who conducted the study? Who funded it?

3. The sample
How many? How selected? Representative?

4. The design
Observational or experimental? Controls? Randomization?

5. The effect size
How big is the difference? Meaningful?

6. The uncertainty
Confidence intervals? p-values? Margin of error?

7. What's NOT said
What are they hiding? Alternative explanations?

Red Flags in Headlines

Suspicious Phrases
| Phrase | What It Might Mean | Questions to Ask |
| --- | --- | --- |
| "Studies show..." | Cherry-picked or low-quality studies | Which studies? Sample size? Replicated? |
| "X is linked to Y" | Correlation, not causation | Observational data? Confounders? |
| "Increases risk by 50%" | Relative risk hiding small absolute risk | What's the baseline risk? |
| "Up to X..." | Best-case scenario, not typical | What's the average? Range? |
| "Statistically significant" | p < 0.05, might be meaningless | Effect size? Practical significance? |
| "More research needed" | Results inconclusive or small effect | Why inconclusive? Underpowered? |

Relative vs Absolute Risk

This is the #1 way media misleads with statistics.

The Classic Manipulation

Headline: "New drug reduces heart attack risk by 50%!"

Reality:

  • Control group: 2 in 1,000 had heart attacks (0.2%)
  • Treatment group: 1 in 1,000 had heart attacks (0.1%)

Relative risk reduction: 50% (1 is 50% less than 2) ✓
Absolute risk reduction: 0.1 percentage points (0.1% difference)

Number needed to treat (NNT): 1,000 people must take the drug to prevent 1 heart attack.

Cost: at $500/person/year, $500 × 1,000 people = $500,000 per year to prevent 1 heart attack

Suddenly that "50% reduction" doesn't sound as impressive!

Always convert relative risk to absolute risk. Relative risk exaggerates small effects. Absolute risk shows real-world impact.
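The conversion is simple arithmetic. Here is a quick sketch using the trial numbers above (2 in 1,000 vs 1 in 1,000):

```python
# Convert a headline's relative risk reduction into absolute terms,
# using the hypothetical drug trial above.

control_risk = 2 / 1000      # 0.2% of the control group had heart attacks
treatment_risk = 1 / 1000    # 0.1% of the treatment group had heart attacks

relative_reduction = (control_risk - treatment_risk) / control_risk
absolute_reduction = control_risk - treatment_risk
nnt = 1 / absolute_reduction  # number needed to treat to prevent 1 event

print(f"Relative risk reduction: {relative_reduction:.0%}")   # 50%
print(f"Absolute risk reduction: {absolute_reduction:.1%}")   # 0.1%
print(f"Number needed to treat:  {nnt:.0f}")                  # 1000
```

The same three lines of arithmetic expose any "X% reduction" headline: the relative figure looks dramatic, the absolute figure tells you whether it matters.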

Sample Size Matters

Survey Says...

Claim: "80% of Americans support this policy (±3% margin of error)"

Sounds legitimate! But:

Question 1: How many people surveyed?
If n = 1,000, ±3% is about right (using the conservative worst case of a 50/50 split).
If n = 50, the margin balloons to roughly ±14%, which makes the headline completely misleading!

Question 2: How were they selected?
Random digit dialing? Online poll (self-selected)?
Phone survey during work hours (misses workers)?

Question 3: How was the question worded?
"Should we reduce wasteful government spending?" → 90% yes
"Should we cut Social Security?" → 20% yes
Same policy, different framing!

Critical questions:

  • What's the sample size?
  • Random sample or convenience sample?
  • What population does it represent?
  • What's the response rate? (50% response → who didn't respond?)
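The margin-of-error check above is easy to do yourself. A minimal sketch, using the standard normal-approximation formula with the conservative worst case p = 0.5:

```python
import math

def margin_of_error(p: float, n: int, z: float = 1.96) -> float:
    """Approximate 95% margin of error for a proportion p estimated from n responses."""
    return z * math.sqrt(p * (1 - p) / n)

# The "±3%" in the headline only holds with a large sample.
# p = 0.5 is the conservative worst case (it maximizes the margin).
print(f"n = 1000: ±{margin_of_error(0.5, 1000):.0%}")  # ±3%
print(f"n =   50: ±{margin_of_error(0.5, 50):.0%}")    # ±14%
```

If a poll reports a margin of error that is suspiciously small for its sample size, this one-liner will catch it.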

Correlation vs Causation (Again)

"X is linked to Y" almost always means correlation, not causation.

Media loves to imply causation:

  • "Coffee drinkers live longer" → Coffee causes longevity?
  • "Video games linked to aggression" → Video games cause violence?

Unpacking a Correlation

Claim: "People who eat chocolate regularly are thinner than non-chocolate-eaters."

Possible explanations:

  1. Chocolate causes weight loss (unlikely)
  2. Reverse causation: Thin people eat chocolate guilt-free; overweight people avoid it
  3. Confounding: Chocolate eaters are wealthier (dark chocolate is expensive), and wealth correlates with better health
  4. Measurement error: Self-reported chocolate consumption inaccurate

Without randomized controlled trials, you can't distinguish these!

Look for: Was this an experiment (randomized)? Or observational (just watching)? Only experiments establish causation.
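Confounding is easy to demonstrate with a toy simulation. In the sketch below (the model and numbers are invented for illustration), wealth drives both chocolate eating and lower BMI, and chocolate has no effect on weight at all, yet a strong correlation appears:

```python
import random

random.seed(0)

# Toy model of confounding: wealth causes BOTH chocolate consumption
# and lower BMI; chocolate itself has zero causal effect on weight.
n = 10_000
wealth = [random.gauss(0, 1) for _ in range(n)]
chocolate = [w + random.gauss(0, 1) for w in wealth]     # wealthier -> more chocolate
bmi = [25 - 2 * w + random.gauss(0, 1) for w in wealth]  # wealthier -> thinner

def corr(xs, ys):
    """Pearson correlation coefficient."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

# Strongly negative, despite zero causal link between chocolate and BMI.
print(f"corr(chocolate, BMI) = {corr(chocolate, bmi):.2f}")
```

An observational study of this population would find that chocolate eaters are thinner, and it would be right about the correlation and wrong about the cause.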

Study Design Hierarchy

Strength of Evidence
| Design | Strength | Causal Inference | Example |
| --- | --- | --- | --- |
| Randomized Controlled Trial | Strongest | Yes | Drug trials |
| Cohort Study | Strong | Maybe | Follow smokers and non-smokers |
| Case-Control Study | Moderate | Weak | Compare cancer patients to controls |
| Cross-Sectional Survey | Weak | No | One-time survey snapshot |
| Case Report / Anecdote | Weakest | No | "My aunt tried this and..." |

When evaluating claims:

  • RCTs are gold standard (if done well)
  • Observational studies suggest relationships, don't prove causation
  • Anecdotes are worthless for inference (but useful for generating hypotheses)

Meta-analyses and systematic reviews that pool multiple RCTs provide the strongest evidence of all.

Who Funded the Study?

Follow the money. Conflicts of interest bias results.

Industry-Funded Research

Sugar industry studies (1960s-1970s):

  • Industry-funded: Sugar not linked to heart disease; fat is the villain
  • Independent research (decades later): Sugar strongly linked to obesity, diabetes, heart disease

Pharmaceutical company trials:

  • Company-funded trials: 85% show benefit
  • Independent trials: 50% show benefit

Tobacco industry:

  • Funded studies for decades claiming cigarettes weren't harmful
  • Created "doubt" despite overwhelming independent evidence

This doesn't mean all industry-funded research is wrong. But it requires extra scrutiny.

Key questions:

  • Who funded the study?
  • Do authors have financial interests?
  • Are results published in peer-reviewed journals?
  • Have independent researchers replicated findings?

Publication Bias

The file drawer problem: Positive results get published. Negative results sit in file drawers.

This creates a false impression:

  • 20 studies test a hypothesis
  • 1 finds a significant effect (p < 0.05) — gets published
  • 19 find nothing — never published

You only see the 1 positive study! You conclude the effect is real, but it might just be the 5% false positive.
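The arithmetic behind that false impression is worth doing once. If all 20 studies test a true null hypothesis at α = 0.05, the chance that at least one comes up "significant" by luck alone is:

```python
# Probability that at least one of 20 independent tests of a TRUE null
# hypothesis produces p < 0.05 purely by chance.
alpha, n_studies = 0.05, 20
p_at_least_one = 1 - (1 - alpha) ** n_studies
print(f"P(at least one false positive) = {p_at_least_one:.0%}")  # 64%
```

So even when the effect doesn't exist, there is a 64% chance the literature contains a publishable "positive" result.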

Solutions (slowly being adopted):

  • Pre-registration: Declare hypothesis and analysis plan before collecting data
  • Registered reports: Journals accept/reject before seeing results
  • Open data: Share data for independent verification
  • Require reporting of all outcomes (primary and secondary)

Questions to Always Ask

1. What's the sample size?
Small samples → unreliable. As a rough rule of thumb, n > 30 is a bare minimum; reliable estimates often need far more.

2. How was the sample selected?
Random? Convenience? Self-selected? Representative of what population?

3. Is this correlation or causation?
Observational study → correlation only. RCT → can infer causation.

4. What's the effect size?
"Significant" doesn't mean "large." Is the difference practically meaningful?

5. What's the baseline?
Relative risk of "50% increase" from what? Absolute risk matters more.

6. What's the margin of error?
Every estimate has uncertainty. Is it ±1% or ±20%?

7. Who funded this?
Conflicts of interest bias results.

8. Has it been replicated?
One study proves nothing. Replication by independent researchers matters.

9. What are alternative explanations?
Confounding? Reverse causation? Selection bias?

10. What's not being reported?
Negative results? Adverse effects? Failed studies?

Developing Your BS Detector

Practice skepticism:

  • Default to "interesting, but..." not "I believe it"
  • Extraordinary claims require extraordinary evidence
  • Seek out the original study, not just the headline
  • Look for what contradicts your beliefs (confirmation bias)

Don't throw the baby out with the bathwater:

  • Statistics isn't broken — misuse is the problem
  • Good studies with proper methods are reliable
  • Science is self-correcting (slowly)

The goal: Be appropriately skeptical, not cynical. Demand evidence, but update beliefs when evidence is strong.

You've completed the statistics course! You now have the tools to:

  • Understand data and uncertainty
  • Spot manipulation and fallacies
  • Make evidence-based decisions
  • Think statistically in everyday life

Statistical literacy is a superpower in the information age. Use it wisely.

Test your knowledge

🧠 Knowledge Check

Headline: "Drug reduces disease risk by 50%!" Control: 2/1000 get disease. Treatment: 1/1000. What is the absolute risk reduction?