Analyze paired data using the paired t-test and matched pairs designs.
Learn step-by-step with practice exercises built right in.
Data are paired when observations are linked:
Key insight: Pairing reduces variability, which improves the power of the test.
When to use: Testing whether mean difference equals zero
Test statistic: t = (d̄ - 0) / (sd/√n), with df = n - 1
A researcher wants to test if a new study technique improves test scores. She records the scores of 10 students before and after using the technique. Why should she use a paired t-test rather than a two-sample t-test?
She should use a paired t-test because the same students are measured twice (before and after), creating natural pairs. This violates the independence assumption required for two-sample t-tests. The paired design is more powerful because it controls for individual student differences in baseline ability.
Key considerations:
• Each student serves as their own control
• The focus is on the difference within each pair
• Pairing reduces variability by eliminating between-student differences
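To make the power argument concrete, here is a minimal sketch that computes both statistics on the same data. The scores are hypothetical (they are not the researcher's data), and the helper names `paired_t` and `two_sample_t` are illustrative only. The two-sample version shown is the unpooled form, chosen here as an assumption about which two-sample test would be the alternative.

```python
import math
import statistics

# Hypothetical before/after scores for 10 students (illustration only).
before = [60, 72, 85, 55, 90, 68, 77, 81, 63, 70]
after = [63, 76, 87, 60, 93, 72, 79, 84, 67, 73]


def paired_t(x, y):
    """t statistic based on the within-pair differences y - x."""
    diffs = [b - a for a, b in zip(x, y)]
    n = len(diffs)
    return statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))


def two_sample_t(x, y):
    """Unpooled two-sample t statistic, wrongly treating the groups as independent."""
    se = math.sqrt(statistics.variance(x) / len(x) + statistics.variance(y) / len(y))
    return (statistics.mean(y) - statistics.mean(x)) / se


t_paired = paired_t(before, after)
t_two = two_sample_t(before, after)
print(f"paired t     = {t_paired:.2f}")
print(f"two-sample t = {t_two:.2f}")
```

Both statistics have the same numerator (the mean gain), but the paired version divides by the much smaller spread of the differences, so the same improvement stands out far more clearly.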
Where: d̄ = mean of the differences, sd = standard deviation of the differences, n = number of pairs
Worked Example: A coach tests 12 runners' 100m sprint times before and after training.
| Runner | Before (sec) | After (sec) | Difference d = After - Before (sec) |
|---|---|---|---|
| 1 | 12.4 | 12.1 | -0.3 |
| 2 | 11.8 | 11.5 | -0.3 |
| ... | ... | ... | ... |
sec, sec
; (two-tailed, )
Since the P-value is less than α, reject H₀: Training significantly improves sprint time.
Do NOT confuse:
For paired data, we always work with the mean of differences.
✗ Using a two-sample t-test on paired data (loses power)
✗ Computing instead of
✗ Not pairing the data when possible
✗ Forgetting that n = number of pairs, not total observations
State "paired t-test" not just "t-test." Define the difference clearly: "Let d = time after minus time before." Show one or two differences calculated.
Ten married couples were asked to rate their happiness on a scale from 1 to 10. The differences (husband - wife) in ratings were: 2, -1, 0, 3, -2, 1, 0, 2, -1, 1. Construct a 95% confidence interval for the mean difference in happiness ratings.
Step 1: Calculate the mean of the differences: d̄ = (2 + (-1) + 0 + 3 + (-2) + 1 + 0 + 2 + (-1) + 1) / 10 = 5 / 10 = 0.5
Step 2: Calculate the standard deviation of the differences: Σ(di - d̄)² = Σdi² - n·d̄² = 25 - 10(0.25) = 22.5, so sd = √[Σ(di - d̄)² / (n-1)] = √[22.5 / 9] ≈ 1.58
Step 3: Find t* for df = 9, 95% confidence: t* = 2.262
Step 4: Calculate the confidence interval: CI = d̄ ± t*(sd/√n) = 0.5 ± 2.262(1.58/√10) = 0.5 ± 1.13 = (-0.63, 1.63)
Conclusion: We are 95% confident that the true mean difference in happiness ratings (husband - wife) is between -0.63 and 1.63 points. Because this interval contains 0, the data are consistent with no mean difference.
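The interval can be checked numerically. The following is a minimal sketch that recomputes every quantity from the ten listed differences using only Python's standard library. The critical value t* = 2.262 is taken from Step 3 rather than computed, and the variable names are illustrative.

```python
import math
import statistics

# Differences (husband - wife) from the problem statement.
diffs = [2, -1, 0, 3, -2, 1, 0, 2, -1, 1]
t_star = 2.262  # t* for df = 9 at 95% confidence, as quoted in Step 3

n = len(diffs)
d_bar = statistics.mean(diffs)   # mean of the differences
s_d = statistics.stdev(diffs)    # sample standard deviation (divides by n - 1)
margin = t_star * s_d / math.sqrt(n)
lower, upper = d_bar - margin, d_bar + margin

print(f"d_bar = {d_bar:.2f}, s_d = {s_d:.3f}, margin = {margin:.3f}")
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```

Note that `statistics.stdev` uses the n - 1 denominator, which matches the formula in Step 2; `statistics.pstdev` would divide by n and give a slightly smaller value.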
A coach wants to know if a new training program improves 100m sprint times. He records the times of 8 runners before and after the program. The mean difference (before - after) is 0.3 seconds with a standard deviation of 0.4 seconds. Test at α = 0.05 whether the program improves times.
H₀: μd = 0 (no improvement)
Hₐ: μd > 0 (improvement, before > after)
Test statistic: t = (d̄ - 0) / (sd/√n) = (0.3 - 0) / (0.4/√8) = 0.3 / 0.141 ≈ 2.12
df = n - 1 = 7
P-value (one-tailed): P(t > 2.12) ≈ 0.036
Decision: Since p-value (0.036) < α (0.05), reject H₀
Conclusion: There is sufficient evidence at the 5% significance level to conclude that the training program improves 100m sprint times.
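For readers who want to verify the P-value without a t-table, here is a rough sketch that integrates the Student's t density numerically. Simpson's rule is a stand-in chosen to keep the example dependency-free; it is not the method the lesson uses, and in practice a calculator or statistics package would be the usual tool. The function names `t_pdf` and `upper_tail_p` are hypothetical.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def upper_tail_p(t, df, steps=2000):
    """Approximate P(T > t) for t >= 0 using Simpson's rule on [0, t].

    `steps` must be even. By symmetry P(T > 0) = 0.5, so the tail area is
    0.5 minus the area between 0 and t.
    """
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


# Summary statistics from the sprint problem.
d_bar, s_d, n = 0.3, 0.4, 8
t_stat = d_bar / (s_d / math.sqrt(n))
p_value = upper_tail_p(t_stat, n - 1)

print(f"t = {t_stat:.2f}, df = {n - 1}, one-tailed p = {p_value:.3f}")
```

The one-tailed area is used because the alternative hypothesis is directional (μd > 0); a two-tailed test would double it.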
A pharmaceutical company tests a new medication on 15 patients with high blood pressure. Each patient's blood pressure is measured before treatment and after 3 months. The differences (before - after) have a mean of 8 mmHg and standard deviation of 6 mmHg. Can we conclude at α = 0.01 that the medication lowers blood pressure?
H₀: μd = 0 (no change)
Hₐ: μd > 0 (blood pressure decreases)
Test statistic: t = (d̄ - 0) / (sd/√n) = (8 - 0) / (6/√15) = 8 / 1.549 ≈ 5.16
df = 14
P-value (one-tailed): P(t > 5.16) < 0.0001
Decision: Since p-value < 0.01, reject H₀
Conclusion: There is very strong evidence (p < 0.01) that the medication lowers blood pressure. The large t-statistic (5.16) establishes statistical significance; whether the effect is clinically meaningful depends on the size of the mean reduction (8 mmHg), not on t, which also grows with sample size.
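When interpreting the size of a t-statistic, it helps to see how it responds to sample size. The sketch below holds the mean reduction and standard deviation at the values from this problem and varies n. Only n = 15 comes from the problem; the other sample sizes are hypothetical, as is the helper name `paired_t_from_summary`.

```python
import math


def paired_t_from_summary(d_bar, s_d, n):
    """Paired t statistic computed from summary statistics of the differences."""
    return d_bar / (s_d / math.sqrt(n))


d_bar, s_d = 8, 6  # mean and SD of the differences (mmHg), from the problem

# n = 15 is the actual study size; 5 and 60 are hypothetical comparisons.
results = {n: paired_t_from_summary(d_bar, s_d, n) for n in (5, 15, 60)}
for n, t in results.items():
    print(f"n = {n:2d}: t = {t:.2f}")
```

The mean reduction is 8 mmHg in every row, yet t rises with √n. The t-statistic measures how strong the evidence is, while the mean difference itself describes how large the effect is.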
A nutritionist studies whether eating breakfast affects students' performance on a math test. She has 20 students take a test after skipping breakfast and another test after eating breakfast (order randomized). Why is this a paired design? What are the advantages and potential concerns?
Why it's paired: Each student takes both tests (no breakfast and with breakfast), creating natural pairs. We analyze the difference in scores for each student.
Advantages:
• Controls for individual differences in math ability
• More powerful than an independent-samples design
• Requires fewer subjects (20 vs 40 for independent groups)
• Each student serves as their own control
Potential concerns:
Practice effect: Students might do better on the second test regardless of breakfast Solution: Randomize which condition comes first
Carryover effect: Effects from first test might influence second test Solution: Sufficient time between tests
Different test difficulty: If tests aren't equivalent, this confounds results Solution: Use equivalent forms or counterbalance test versions
Learning between tests: Students might study between tests Solution: Control time between tests, avoid giving feedback
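Randomizing which condition comes first can be sketched in a few lines. The version below assumes a counterbalanced scheme in which exactly half of the students get each order; the problem only says the order is randomized, so that balance is an assumption of this sketch. The function name `assign_orders`, the student IDs 1 to 20, and the seed are all hypothetical.

```python
import random


def assign_orders(student_ids, seed=None):
    """Randomly split students so half take each condition first."""
    rng = random.Random(seed)  # local generator, so results are reproducible
    ids = list(student_ids)
    rng.shuffle(ids)
    half = len(ids) // 2
    orders = {}
    for sid in ids[:half]:
        orders[sid] = ("no breakfast", "breakfast")
    for sid in ids[half:]:
        orders[sid] = ("breakfast", "no breakfast")
    return orders


orders = assign_orders(range(1, 21), seed=42)
breakfast_first = sum(1 for first, _ in orders.values() if first == "breakfast")
print(f"{breakfast_first} of {len(orders)} students eat breakfast before their first test")
```

Balancing the two orders means any practice effect adds roughly equally to both conditions, so it does not systematically favor breakfast or no breakfast when the differences are averaged.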