Hypothesis testing is a formal procedure for using sample data to decide between two competing claims about a population parameter.
| Component | Symbol | Description |
|:---------:|:------:|-------------|
| Null hypothesis | $H_0$ | No effect / no difference: the status quo |
| Alternative hypothesis | $H_a$ | There IS an effect / difference: the research claim |
🔑 Key Idea: We assume H0 is true and look for evidence against it. We NEVER prove H0 true — we either reject it or fail to reject it.
Writing Hypotheses
Hypotheses are always about population parameters ($\mu$, $p$), never about sample statistics ($\bar{x}$, $\hat{p}$).
For means:
| Type | $H_0$ | $H_a$ | When to Use |
|:----:|-------|-------|-------------|
| Two-tailed | $\mu = \mu_0$ | $\mu \neq \mu_0$ | "Is there a difference?" |
| Right-tailed | $\mu = \mu_0$ | $\mu > \mu_0$ | "Is it greater than?" |
| Left-tailed | $\mu = \mu_0$ | $\mu < \mu_0$ | "Is it less than?" |
For proportions:
| Type | $H_0$ | $H_a$ |
|:----:|-------|-------|
| Two-tailed | $p = p_0$ | $p \neq p_0$ |
| Right-tailed | $p = p_0$ | $p > p_0$ |
| Left-tailed | $p = p_0$ | $p < p_0$ |
⚠️ Important: $H_0$ always contains the equals sign ($=$). The alternative contains $\neq$, $>$, or $<$.
Worked Example
Claim: "Students at this school score higher than the national average of 75."
Parameter: $\mu$ = true mean score of students at this school
$H_0: \mu = 75$ (no difference from national average)
$H_a: \mu > 75$ (school average is higher)
This is a right-tailed test because the claim is "higher than."
Significance Level (α)
Before testing, we choose a significance level α (usually 0.05):
| α | Meaning |
|:---:|---------|
| 0.05 | Reject $H_0$ if the evidence would occur less than 5% of the time under $H_0$ |
| 0.01 | More stringent: only reject with very strong evidence |
| 0.10 | Less stringent: reject with moderate evidence |
🔑 AP Tip: Unless told otherwise, assume $\alpha = 0.05$ on the AP exam.
Hypothesis Identification 🧮
For each claim, identify the null value ($\mu_0$):
1) Claim: $\mu > 75$. $H_0: \mu =$ ?
2) Claim: $\mu < 50$. $H_0: \mu =$ ?
3) Claim: $\mu \neq 100$. $H_0: \mu =$ ?
Part 2: Test Statistics
📊 Test Statistics
Part 2 of 7: Measuring the Evidence
What Is a Test Statistic?
A test statistic measures how far the sample result falls from the null hypothesis value, expressed in standard-error units.
$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$
Part 3: P-Values
🔢 P-Values
Part 3 of 7: How Surprising Is the Evidence?
What Is a P-Value?
The P-value is the probability of obtaining a test statistic as extreme as (or more extreme than) the one observed, assuming $H_0$ is true.
🔑 Plain English: "If nothing special is happening ($H_0$ is true), how likely is it that we'd see data this extreme just by chance?"
Part 4: Type I & Type II Errors
📈 Type I & Type II Errors
Part 4 of 7: Making the Wrong Decision
Decision Table
Every hypothesis test has four possible outcomes:
| | $H_0$ is actually true | $H_0$ is actually false |
|---|:---:|:---:|
| Reject $H_0$ | Type I error | ✅ Correct |
| Fail to reject $H_0$ | ✅ Correct | Type II error |
Part 5: One-Sample t-Test
🧮 One-Sample t-Test
Part 5 of 7: The Complete Procedure
When to Use a One-Sample t-Test
Use a one-sample t-test when:
You have one quantitative variable
You want to test a claim about the population mean $\mu$
The population standard deviation $\sigma$ is unknown (use $s$ instead)
Conditions (CHECK EVERY TIME)
| Condition | What to Check |
|:---------:|---------------|
| Random | SRS or randomized experiment |
| Independent | $n < 10\%$ of population |
| Normal/Large | $n \geq 30$ (CLT) or data approximately normal |
Part 6: Problem-Solving Workshop
🛠️ Problem-Solving Workshop
Part 6 of 7: Putting It All Together
Worked Example 1: Cereal Box Weights
A cereal company advertises 16 oz boxes. A consumer group suspects the boxes are underfilled. They weigh a random sample of 40 boxes and find $\bar{x} = 15.7$ oz, $s = 0.8$ oz. Test at $\alpha = 0.05$.
Part 7: Review & Applications
🏆 Review & Applications
Part 7 of 7: Complete Reference Guide
Formula Reference
| Formula | Expression | Purpose |
|:-------:|:----------:|---------|
| Standard Error | $SE = \frac{s}{\sqrt{n}}$ | Measures variability of $\bar{x}$ |
| Symbol | Meaning |
|:------:|---------|
| $\bar{x}$ | Sample mean |
| $\mu_0$ | Null hypothesis value |
| $s$ | Sample standard deviation |
| $n$ | Sample size |
| $s/\sqrt{n}$ | Standard error of $\bar{x}$ |
🔑 Interpretation: $t$ tells you how many standard errors the sample mean is from the null value. Larger ∣t∣ → stronger evidence against $H_0$.
Standard Error (SE)
The standard error measures the typical distance between $\bar{x}$ and $\mu$ due to sampling variability:
$$SE = \frac{s}{\sqrt{n}}$$
| Factor | Effect on SE |
|--------|:------------:|
| Larger $s$ (more variability) | SE increases |
| Larger $n$ (bigger sample) | SE decreases |
🔑 Key Insight: Quadrupling the sample size halves the standard error (because $\sqrt{4} = 2$).
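The effect of sample size on the standard error can be checked numerically. The following is a minimal Python sketch; the helper name `standard_error` and the values $s = 12$, $n = 25$, $n = 100$ are illustrative assumptions, not part of the lesson.

```python
import math


def standard_error(s: float, n: int) -> float:
    """Standard error of the sample mean: s / sqrt(n)."""
    if n <= 0:
        raise ValueError("n must be a positive sample size")
    return s / math.sqrt(n)


# Illustrative (assumed) values: same s, sample size quadrupled from 25 to 100.
se_small = standard_error(12, 25)    # 12 / 5
se_large = standard_error(12, 100)   # 12 / 10
ratio = se_small / se_large          # how many times smaller SE became

print(se_small, se_large, ratio)
```

With the sample size multiplied by 4, the ratio of the two standard errors comes out to 2, matching the square-root relationship.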
Degrees of Freedom
For a one-sample t-test: df=n−1
The degrees of freedom determine which t-distribution to use for finding the P-value. More degrees of freedom → the t-distribution looks more like a normal distribution.
Worked Example
A school claims its average SAT math score is 500. A random sample of 36 students gives $\bar{x} = 520$ and $s = 60$.
Step 1: Standard Error
$$SE = \frac{s}{\sqrt{n}} = \frac{60}{\sqrt{36}} = \frac{60}{6} = 10$$
Step 2: Test Statistic
$$t = \frac{\bar{x} - \mu_0}{SE} = \frac{520 - 500}{10} = \frac{20}{10} = 2.0$$
Step 3: Degrees of Freedom
$$df = n - 1 = 36 - 1 = 35$$
Interpretation: The sample mean is 2.0 standard errors above the null value. This is moderate-to-strong evidence against $H_0$.
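The three steps of this worked example can be reproduced in a few lines of Python. This is only a sketch of the arithmetic; the variable names are our own.

```python
import math

# Values from the SAT worked example
x_bar, mu_0, s, n = 520, 500, 60, 36

se = s / math.sqrt(n)        # Step 1: standard error, 60 / 6
t = (x_bar - mu_0) / se      # Step 2: test statistic, 20 / 10
df = n - 1                   # Step 3: degrees of freedom

print(f"SE = {se}, t = {t}, df = {df}")   # SE = 10.0, t = 2.0, df = 35
```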
How Large Is "Large Enough"?
| ∣t∣ Value | Rough Guide |
|:-----------:|-------------|
| <1 | Weak evidence against H0 |
| 1 to 2 | Moderate evidence |
| >2 | Strong evidence |
| >3 | Very strong evidence |
⚠️ Caution: These are rough guidelines. Always compute the P-value for a precise conclusion.
Computing Test Statistics 🧮
1) $s = 14$, $n = 49$. What is the standard error?
2) $\bar{x} = 82$, $\mu_0 = 75$, $SE = 2$. What is the t-statistic?
3) $n = 26$. What are the degrees of freedom?
Decision Rule
| Comparison | Decision | Conclusion |
|:----------:|----------|------------|
| $P < \alpha$ | Reject $H_0$ | Result is statistically significant |
| $P \geq \alpha$ | Fail to reject $H_0$ | Result is NOT statistically significant |
⚠️ Never say "accept H0." We either reject or fail to reject.
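The decision rule is a single comparison, which a short Python sketch makes explicit. The function name `decide` and its return strings are hypothetical, chosen for illustration.

```python
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Apply the decision rule: reject H0 only when P < alpha."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a P-value must lie between 0 and 1")
    # Note the strict inequality: P equal to alpha does NOT lead to rejection.
    return "reject H0" if p_value < alpha else "fail to reject H0"


print(decide(0.0114))        # reject H0
print(decide(0.05))          # fail to reject H0 (P is not below alpha)
print(decide(0.08, 0.10))    # reject H0 at the less stringent alpha = 0.10
```

The function never returns "accept H0", mirroring the wording rule above.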
Interpreting P-Values
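| P-value Range | Strength of Evidence Against $H_0$ |
|:-------------:|------------------------------------|
| $P > 0.10$ | Weak or no evidence |
| $0.05 < P \leq 0.10$ | Moderate evidence |
| $0.01 < P \leq 0.05$ | Strong evidence |
| $P \leq 0.01$ | Very strong evidence |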
One-Tailed vs Two-Tailed P-Values
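| Test Type | P-value Calculation |
|-----------|---------------------|
| Right-tailed ($H_a: \mu > \mu_0$) | $P = P(t \geq t_{obs})$ |
| Left-tailed ($H_a: \mu < \mu_0$) | $P = P(t \leq t_{obs})$ |
| Two-tailed ($H_a: \mu \neq \mu_0$) | $P = 2 \cdot P(t \geq \lvert t_{obs} \rvert)$ |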
🔑 Two-tailed tests double the one-tail probability because evidence in either direction counts.
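The three tail calculations can be sketched in Python using only the standard library. This is an illustrative approximation, not a reference implementation: the helper names `t_pdf`, `t_cdf`, and `p_value` are our own, and the t-distribution CDF is obtained by numerically integrating the density with Simpson's rule, playing the role that a calculator's `tcdf` plays elsewhere in this lesson.

```python
import math


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t-distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(t: float, df: int, steps: int = 2000) -> float:
    """Approximate P(T <= t): 0.5 (by symmetry) plus the signed area from 0 to t."""
    h = t / steps                      # negative when t < 0, so the area is signed
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):          # Simpson's rule weights 4, 2, 4, ...
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def p_value(t_obs: float, df: int, tail: str) -> float:
    """P-value for a 'right', 'left', or 'two' tailed t-test."""
    if tail == "right":
        return 1 - t_cdf(t_obs, df)
    if tail == "left":
        return t_cdf(t_obs, df)
    if tail == "two":
        return 2 * (1 - t_cdf(abs(t_obs), df))
    raise ValueError("tail must be 'right', 'left', or 'two'")


# Illustrative values: t = 2.0 with df = 35
p_right = p_value(2.0, 35, "right")
p_left = p_value(2.0, 35, "left")
p_two = p_value(2.0, 35, "two")
print(round(p_right, 4), round(p_left, 4), round(p_two, 4))
```

For the same observed statistic, the two-tailed P-value is exactly twice the one-tailed value in the direction of the observed result, and the left- and right-tailed values sum to 1.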
CONCLUDE:
Since P=0.0098<α=0.05, we reject H0. There is convincing evidence that the true mean battery life is less than 500 hours.
t-Test vs z-Test
| Feature | z-Test | t-Test |
|---------|--------|--------|
| $\sigma$ known? | Yes | No (use $s$) |
| Distribution | Standard normal | $t$ with $df = n - 1$ |
| AP Exam usage | Rare for means (used for proportions) | Very common (means) |
Computing a t-Test 🧮
$n = 36$, $\bar{x} = 52$, $s = 6$, $\mu_0 = 50$:
1) $SE = s/\sqrt{n} =$ ?
2) $t = (\bar{x} - \mu_0)/SE =$ ?
3) $df =$ ?
Significance level: $\alpha = 0.05$
STATE:
μ = true mean weight of cereal boxes (oz)
$H_0: \mu = 16$ vs $H_a: \mu < 16$ (left-tailed: suspect underfilling)
PLAN:
One-sample t-test
✅ Random: "random sample" stated
✅ Independent: 40 boxes <10% of all boxes produced
✅ Normal/Large: n=40≥30 → CLT applies
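DO:
$$SE = \frac{0.8}{\sqrt{40}} = \frac{0.8}{6.325} \approx 0.1265$$
$$t = \frac{15.7 - 16}{0.1265} = \frac{-0.3}{0.1265} \approx -2.372$$
$$df = 40 - 1 = 39$$
$$P = \text{tcdf}(-10^{99}, -2.372, 39) \approx 0.0114$$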
CONCLUDE:
Since P=0.0114<α=0.05, we reject H0. There is convincing evidence that the true mean weight of cereal boxes is less than 16 oz. The consumer group's suspicion is supported.
Worked Example 2: Study Hours
A college claims students study an average of 15 hours per week. A professor surveys a random sample of 50 students and finds $\bar{x} = 13.2$ hours, $s = 5.1$ hours. Test at $\alpha = 0.05$ whether the true mean differs from 15.
STATE:
$\mu$ = true mean weekly study hours for students at this college
$H_0: \mu = 15$ vs $H_a: \mu \neq 15$ (two-tailed: "differs from")
PLAN:
One-sample t-test
✅ Random: "random sample" stated
✅ Independent: 50<10% of all college students
✅ Normal/Large: n=50≥30 → CLT applies
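DO:
$$SE = \frac{5.1}{\sqrt{50}} = \frac{5.1}{7.071} \approx 0.7212$$
$$t = \frac{13.2 - 15}{0.7212} = \frac{-1.8}{0.7212} \approx -2.496$$
$$df = 50 - 1 = 49$$
$$P = 2 \times \text{tcdf}(-10^{99}, -2.496, 49) \approx 2 \times 0.0080 = 0.0160$$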
CONCLUDE:
Since P=0.016<α=0.05, we reject H0. There is convincing evidence that the true mean weekly study hours for students at this college differs from 15 hours.
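The DO step of both workshop examples can be checked with a short Python sketch that works from summary statistics. It uses only the standard library, so the t-distribution CDF is approximated by Simpson's rule rather than taken from a statistics package. The function names `t_cdf` and `one_sample_t_test` are hypothetical helpers, not part of the lesson.

```python
import math


def t_cdf(t: float, df: int, steps: int = 2000) -> float:
    """Approximate P(T <= t) for Student's t by Simpson's rule on [0, t]."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))

    def pdf(x: float) -> float:
        return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))

    h = t / steps                       # signed step, so t < 0 gives a negative area
    area = pdf(0.0) + pdf(t)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * pdf(i * h)
    return 0.5 + area * h / 3


def one_sample_t_test(x_bar, s, n, mu_0, tail):
    """Return (SE, t, df, P) for a one-sample t-test from summary statistics."""
    se = s / math.sqrt(n)
    t = (x_bar - mu_0) / se
    df = n - 1
    if tail == "left":
        p = t_cdf(t, df)
    elif tail == "right":
        p = 1 - t_cdf(t, df)
    elif tail == "two":
        p = 2 * (1 - t_cdf(abs(t), df))
    else:
        raise ValueError("tail must be 'left', 'right', or 'two'")
    return se, t, df, p


# Worked Example 1 (cereal boxes, left-tailed) and Worked Example 2 (study hours, two-tailed)
cereal = one_sample_t_test(15.7, 0.8, 40, 16, "left")
study = one_sample_t_test(13.2, 5.1, 50, 15, "two")

for name, (se, t, df, p) in [("cereal", cereal), ("study", study)]:
    print(f"{name}: SE = {se:.4f}, t = {t:.3f}, df = {df}, P = {p:.4f}")
```

Small differences in the last decimal place relative to the hand calculations are expected, because the hand calculations round the standard error before dividing.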
Common AP Mistakes to Avoid
| Mistake | Correction |
|---------|------------|
| Not stating hypotheses in terms of $\mu$ | Always use population parameters |
| Skipping conditions | Must check all three explicitly |
| Saying "accept $H_0$" | Say "fail to reject $H_0$" |
| Conclusion without context | "There is (not) convincing evidence that [real-world statement]" |
| Using $\bar{x}$ or $\hat{p}$ in hypotheses | Always use $\mu$ or $p$ |
| Forgetting to double $P$ for two-tailed | Two-tailed: $P = 2 \times$ one-tail area |
Workshop Calculations 🧮
$n = 25$, $\bar{x} = 84$, $s = 10$, $\mu_0 = 80$:
1) $SE = s/\sqrt{n} =$ ?
2) $t = (\bar{x} - \mu_0)/SE =$ ?
3) $df =$ ?
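| Formula | Expression | Purpose |
|:-------:|:----------:|---------|
| Standard Error | $SE = \frac{s}{\sqrt{n}}$ | Measures variability of $\bar{x}$ |
| Test Statistic | $t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$ | Standardizes the distance from null |
| Degrees of Freedom | $df = n - 1$ | Determines the t-distribution shape |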
Hypothesis Test Decision Guide
| Question | Answer |
|----------|--------|
| "Is it greater than?" | Right-tailed: $H_a: \mu > \mu_0$ |
| "Is it less than?" | Left-tailed: $H_a: \mu < \mu_0$ |
| "Is it different from?" | Two-tailed: $H_a: \mu \neq \mu_0$ |
Decision Rule
If $P < \alpha$ ⇒ Reject $H_0$
If $P \geq \alpha$ ⇒ Fail to reject $H_0$
Error Summary
| | $H_0$ True | $H_0$ False |
|---|:---:|:---:|
| Reject $H_0$ | Type I ($\alpha$) | ✅ Correct (Power $= 1 - \beta$) |
| Fail to reject $H_0$ | ✅ Correct | Type II ($\beta$) |
Conditions Checklist
| Condition | Check |
|:---------:|-------|
| Random | SRS or randomized experiment |
| Independent | $n < 10\%$ of population |
| Normal/Large | $n \geq 30$ (CLT) or data approximately normal |
AP Four-Step Process
STATE — Define parameter; write H0 and Ha
PLAN — Name the test; check all three conditions
DO — Compute SE, t, df, P-value
CONCLUDE — Decision + evidence + context
Power Factors
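| To Increase Power | Do This |
|-------------------|---------|
| Increase $n$ | More data → smaller SE → easier to detect effects |
| Increase $\alpha$ | More willing to reject (but more Type I risk) |
| Larger effect size | Bigger $\lvert \mu - \mu_0 \rvert$ (true mean farther from the null value) |
| Decrease $s$ | Less variability → more precise estimates |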
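How each factor moves power can be illustrated with a rough calculation. The sketch below is a simplification and rests on assumptions of ours: it uses a normal (z) approximation for a right-tailed test with a known standard deviation `sigma`, not the exact power of the t-test, and the function name `approx_power` and all numeric values are hypothetical.

```python
import math
from statistics import NormalDist


def approx_power(effect: float, sigma: float, n: int, alpha: float = 0.05) -> float:
    """Approximate power of a right-tailed z-test.

    effect: true mean minus the null value (mu - mu_0), assumed >= 0
    sigma:  population standard deviation (assumed known for this sketch)
    """
    z = NormalDist()                       # standard normal
    z_crit = z.inv_cdf(1 - alpha)          # rejection cutoff in SE units
    shift = effect * math.sqrt(n) / sigma  # how far the truth sits from the null, in SEs
    return z.cdf(shift - z_crit)           # P(test statistic exceeds the cutoff)


baseline = approx_power(effect=2, sigma=10, n=25)
more_data = approx_power(effect=2, sigma=10, n=100)
higher_alpha = approx_power(effect=2, sigma=10, n=25, alpha=0.10)
bigger_effect = approx_power(effect=4, sigma=10, n=25)
less_spread = approx_power(effect=2, sigma=5, n=25)

for label, value in [("baseline", baseline), ("larger n", more_data),
                     ("larger alpha", higher_alpha), ("larger effect", bigger_effect),
                     ("smaller sigma", less_spread)]:
    print(f"{label:>13}: power ≈ {value:.3f}")
```

Each change in the table raises the approximate power above the baseline. With no effect at all, the same formula returns $\alpha$, the Type I error rate.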
Common Mistakes on the AP Exam
| Mistake | Why It's Wrong |
|---------|----------------|
| "Accept $H_0$" | We can only "fail to reject"; we never prove $H_0$ |
| Using $\bar{x}$ in hypotheses | Hypotheses use $\mu$ (parameter), not $\bar{x}$ (statistic) |
| No context in conclusion | Must relate conclusion to the specific problem |
| Forgetting to check conditions | All three required for full credit |
| Confusing statistical and practical significance | Small $P$ doesn't mean the effect matters in practice |