Common Statistical Tests
Master the Z-test, one-sample t-test, two-sample t-test, paired t-test, and chi-square test, with worked examples for each.
A Toolkit of Statistical Tests
Now that you understand the hypothesis testing framework, let's learn the most commonly used tests. Each test is designed for a specific type of question and data structure.
Z-Test (Large Sample, σ Known)
The simplest test. Used when you know the population standard deviation (rare in practice, but foundational).
A factory claims light bulbs last μ₀ = 1000 hours. You test 64 bulbs: x̄ = 980, σ = 80 (known).
H₀: μ = 1000, H₁: μ ≠ 1000, α = 0.05
z = (980 - 1000) / (80/√64) = -20 / 10 = -2.0
p-value = 2 × P(Z < -2.0) = 2 × 0.0228 = 0.0456 < 0.05
Reject H₀. The evidence suggests the mean bulb lifetime differs from 1000 hours.
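The bulb calculation can be checked with a few lines of Python. This is a minimal sketch that uses only the standard library; the function name `z_test_two_sided` is a hypothetical helper introduced here for illustration, not part of any package.

```python
from math import sqrt
from statistics import NormalDist


def z_test_two_sided(sample_mean, mu0, sigma, n):
    """Return (z, p) for a two-sided one-sample Z-test with known sigma."""
    se = sigma / sqrt(n)                   # standard error of the mean
    z = (sample_mean - mu0) / se           # standardized distance from mu0
    p = 2 * NormalDist().cdf(-abs(z))      # probability in both tails
    return z, p


# Values from the light bulb example
z, p = z_test_two_sided(sample_mean=980, mu0=1000, sigma=80, n=64)
print(f"z = {z:.2f}, p = {p:.4f}")  # z = -2.00, p = 0.0455
```

The unrounded tail probability gives p ≈ 0.0455; the 0.0456 in the worked example comes from rounding P(Z < -2.0) to 0.0228 before doubling.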
One-Sample t-Test
The workhorse of hypothesis testing. Used when σ is unknown (almost always) and you want to test if a population mean equals a specific value.
A nutritionist claims a diet results in weight loss of μ₀ = 5 kg. You measure 25 patients: x̄ = 4.2 kg, s = 2.1 kg.
H₀: μ = 5, H₁: μ ≠ 5, α = 0.05
t = (4.2 - 5) / (2.1/√25) = -0.8 / 0.42 = -1.90
With df = 24, the critical t-values are ±2.064. |-1.90| < 2.064, so p-value > 0.05.
Fail to reject H₀. Insufficient evidence that the true weight loss differs from 5 kg.
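The same steps can be sketched in Python from the summary statistics alone. The helper `one_sample_t` is a hypothetical name, and the critical value 2.064 is simply copied from the example rather than computed, since the standard library has no t-distribution.

```python
from math import sqrt


def one_sample_t(sample_mean, mu0, s, n):
    """Return (t, df) for a one-sample t-test from summary statistics."""
    se = s / sqrt(n)                 # estimated standard error of the mean
    t = (sample_mean - mu0) / se
    return t, n - 1


# Values from the diet example
t, df = one_sample_t(sample_mean=4.2, mu0=5, s=2.1, n=25)

T_CRIT = 2.064                       # two-sided critical value for df = 24, alpha = 0.05 (from the text)
reject = abs(t) > T_CRIT
print(f"t = {t:.2f}, df = {df}, reject H0: {reject}")  # t = -1.90, df = 24, reject H0: False
```

With raw data instead of summary statistics, `scipy.stats.ttest_1samp` would return the statistic and an exact p-value directly.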
Two-Sample t-Test (Independent Samples)
Compares the means of two independent groups. "Is there a difference between Group A and Group B?"
Does a new study method improve test scores?
Control group: n₁ = 30, x̄₁ = 72, s₁ = 10
New method: n₂ = 30, x̄₂ = 78, s₂ = 12
H₀: μ₁ = μ₂, H₁: μ₁ ≠ μ₂
SE = √(100/30 + 144/30) = √(3.33 + 4.80) = √8.13 = 2.85
t = (72 - 78) / 2.85 = -2.10
With df ≈ 56 (Welch's approximation), this gives p ≈ 0.040 < 0.05.
Reject H₀. Evidence suggests the new method produces different (higher) scores.
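Assumptions: The two groups are independent and approximately normally distributed (or n is large enough for the CLT to apply). The classic pooled-variance test also assumes the groups have similar variances. Welch's t-test relaxes the equal variance assumption; it is the default in some software (for example, R's t.test), while other tools require you to request it explicitly.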
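The study method example can be reproduced from its summary statistics with the sketch below. It assumes the unequal-variance (Welch) form of the test, with degrees of freedom from the Welch-Satterthwaite approximation; `welch_t` is a hypothetical helper name.

```python
from math import sqrt


def welch_t(mean1, s1, n1, mean2, s2, n2):
    """Return (t, df, se) for Welch's two-sample t-test from summary statistics."""
    v1 = s1 ** 2 / n1                # variance of the first sample mean
    v2 = s2 ** 2 / n2                # variance of the second sample mean
    se = sqrt(v1 + v2)               # standard error of the difference in means
    t = (mean1 - mean2) / se
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df, se


# Control group first, new method second
t, df, se = welch_t(mean1=72, s1=10, n1=30, mean2=78, s2=12, n2=30)
print(f"SE = {se:.2f}, t = {t:.2f}, df = {df:.1f}")
```

The degrees of freedom come out close to 56, which is why the example uses df ≈ 56 rather than the pooled value n₁ + n₂ - 2 = 58. With raw scores, `scipy.stats.ttest_ind(a, b, equal_var=False)` runs the same Welch test and also returns the p-value.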
Paired t-Test
Used when the two measurements are connected, typically before/after measurements on the same subjects.
Blood pressure before and after medication for 10 patients. Differences (before - after): 5, 8, 3, 12, 7, -2, 9, 6, 4, 8
d̄ = 6.0, s_d = 3.83, n = 10
H₀: μ_d = 0 (no change), H₁: μ_d > 0 (blood pressure decreases)
t = 6.0 / (3.83/√10) = 6.0 / 1.21 = 4.95
With df = 9, p < 0.001. Reject H₀. Strong evidence the medication reduces blood pressure.
Why paired is better: By computing differences, each subject serves as their own control. This eliminates person-to-person variability, making it easier to detect the treatment effect.
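Because a paired t-test is just a one-sample t-test on the differences, the blood pressure example can be recomputed directly from the ten differences listed above. This is a minimal standard-library sketch; `paired_t` is a hypothetical helper name.

```python
from math import sqrt
from statistics import mean, stdev

# Differences (before - after) for the 10 patients in the example
diffs = [5, 8, 3, 12, 7, -2, 9, 6, 4, 8]


def paired_t(differences):
    """Return (d_bar, s_d, t, df) for a paired t-test on a list of differences."""
    n = len(differences)
    d_bar = mean(differences)        # mean difference
    s_d = stdev(differences)         # sample standard deviation (n - 1 denominator)
    t = d_bar / (s_d / sqrt(n))      # test statistic under H0: mu_d = 0
    return d_bar, s_d, t, n - 1


d_bar, s_d, t, df = paired_t(diffs)
print(f"d_bar = {d_bar:.1f}, s_d = {s_d:.2f}, t = {t:.2f}, df = {df}")
```

Working from the raw differences avoids rounding in the intermediate values. With separate before and after arrays, `scipy.stats.ttest_rel` performs the same test.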
Chi-Square Test of Independence
For categorical data: tests whether two categorical variables are related.
χ² = Σ (O - E)² / E
Where O = observed count and E = expected count under independence, and the sum runs over every cell of the table.
Is there a relationship between gender and preference for tea vs coffee?
| | Tea | Coffee | Total |
|---|---|---|---|
| Male | 30 | 70 | 100 |
| Female | 50 | 50 | 100 |
| Total | 80 | 120 | 200 |
Under independence, expected count = (row total × column total) / grand total. For example, E(Male, Tea) = 100 × 80 / 200 = 40.
χ² = (30-40)²/40 + (70-60)²/60 + (50-40)²/40 + (50-60)²/60 = 2.5 + 1.67 + 2.5 + 1.67 = 8.33
With df = (2-1)(2-1) = 1, the critical value at α = 0.05 is 3.841. Since 8.33 > 3.841, reject H₀. Gender and drink preference are related.
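The expected counts and the χ² statistic for the tea/coffee table can be computed as in the sketch below. `chi_square_independence` is a hypothetical helper, and the p-value shortcut on the last lines is an assumption that holds only for df = 1, where a χ² variable is the square of a standard normal.

```python
from math import erfc, sqrt

# Observed counts: rows = Male, Female; columns = Tea, Coffee
observed = [[30, 70],
            [50, 50]]


def chi_square_independence(table):
    """Return (statistic, df, expected) for a chi-square test of independence."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    # Expected count for each cell: row total * column total / grand total
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df, expected


stat, df, expected = chi_square_independence(observed)
print(f"chi2 = {stat:.2f}, df = {df}")  # chi2 = 8.33, df = 1

# For df = 1 only: P(chi2 > x) = erfc(sqrt(x / 2))
p = erfc(sqrt(stat / 2))
print(f"p = {p:.4f}")
```

For comparison with library output, note that `scipy.stats.chi2_contingency` applies Yates' continuity correction to 2×2 tables by default, so it reproduces 8.33 only when called with `correction=False`.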
Choosing the Right Test
| Question | Data Type | Test |
|---|---|---|
| Is the mean equal to a value? | One continuous, σ known | Z-test |
| Is the mean equal to a value? | One continuous, σ unknown | One-sample t-test |
| Do two groups have different means? | Two independent groups | Two-sample t-test |
| Is there a before/after change? | Paired measurements | Paired t-test |
| Are two categories related? | Two categorical variables | Chi-square test |
Decision tree: Start with your question, then identify your data type. Together, the question and the data type almost always determine the right test. If in doubt, check each candidate test's assumptions against your data before running it.