Paired Data
Analyzing matched pairs
Paired Data and Matched Pairs
What is Paired Data?
Paired data: Two measurements on same subject or matched subjects
Examples:
- Before/after measurements on same people
- Twins (one gets treatment A, other gets treatment B)
- Matched subjects (similar age, gender, etc.)
- Same subjects under two conditions
Key: Natural pairing creates dependence
Why Pair?
Reduces variability by controlling for subject-to-subject differences
Example: Blood pressure
- People naturally have different BP
- Before/after on same person: eliminates person-to-person variation
- More sensitive to treatment effect
Pairing is powerful! Can detect smaller effects than two independent samples
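A minimal sketch of the variability argument above, using made-up blood pressure readings (the numbers are hypothetical, not from any study). The spread of the raw readings is dominated by person-to-person differences, while the spread of the within-person differences is much smaller.

```python
from statistics import stdev

# Hypothetical readings for 5 people (illustrative values only).
before = [120, 135, 150, 165, 180]
after = [115, 131, 144, 160, 174]

# Subtracting within each person removes that person's baseline level.
diffs = [b - a for b, a in zip(before, after)]

sd_before = stdev(before)  # reflects person-to-person variation
sd_after = stdev(after)    # reflects person-to-person variation
sd_diffs = stdev(diffs)    # reflects only variation in the change

print(f"SD of before readings: {sd_before:.2f}")
print(f"SD of after readings:  {sd_after:.2f}")
print(f"SD of differences:     {sd_diffs:.2f}")
```

With these values the differences vary by less than 1 point, while the raw readings vary by more than 20, which is why a small treatment effect stands out in the paired analysis.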
Paired vs Two-Sample
Paired:
- Same subjects (or matched pairs)
- Analyze differences
- Use one-sample t-test on differences
Two-sample:
- Different subjects in each group
- Independent samples
- Use two-sample t-test
MUST identify which before analyzing!
Paired t-Test Procedure
1. Calculate differences: d = measurement₁ - measurement₂ for each pair
2. Hypotheses about mean difference:
- H₀: μ_d = 0 (no mean difference)
- Hₐ: μ_d ≠ 0 (or μ_d > 0 or μ_d < 0)
3. Use one-sample t-test on differences:
$$t = \frac{\bar{d} - 0}{s_d / \sqrt{n}}$$
Where:
- $\bar{d}$ = mean of differences
- s_d = standard deviation of differences
- n = number of pairs (not total observations!)
- df = n - 1
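The procedure above can be sketched in a few lines of Python. The function name `paired_t` and the sample scores are hypothetical and chosen only for illustration; the sketch assumes the null value is 0, as in the hypotheses listed.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(first, second):
    """Return (t, df) for H0: mu_d = 0, where d = first - second."""
    if len(first) != len(second):
        raise ValueError("every pair needs two measurements")
    # Step 1: one difference per pair, always in the same order.
    diffs = [x - y for x, y in zip(first, second)]
    n = len(diffs)        # number of pairs, not total observations
    d_bar = mean(diffs)   # mean of the differences
    s_d = stdev(diffs)    # sample standard deviation of the differences
    # Step 3: one-sample t statistic on the differences.
    t = d_bar / (s_d / sqrt(n))
    return t, n - 1

# Hypothetical scores for 5 subjects under two conditions.
condition_1 = [10, 12, 9, 14, 11]
condition_2 = [8, 11, 9, 11, 9]
t, df = paired_t(condition_1, condition_2)
print(f"t = {t:.2f}, df = {df}")
```

Note that df comes out as 4 here, one less than the number of pairs, even though there are 10 measurements in total.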
Conditions for Paired t-Test
- Random: Pairs randomly selected, or treatments randomly assigned within each pair
- Normal: Differences approximately normal OR n ≥ 30
- Independent: Pairs independent of each other
Note: Measurements within pair are dependent (that's the point!), but pairs themselves must be independent
Example 1: Before/After
Blood pressure before and after medication (10 patients):
| Patient | Before | After | Difference (Before - After) |
|---------|--------|-------|-----------------------------|
| 1 | 145 | 138 | 7 |
| 2 | 152 | 145 | 7 |
| ... | ... | ... | ... |
$\bar{d}$ = 8.5, s_d = 4.2, n = 10
STATE:
- μ_d = true mean reduction in BP
- H₀: μ_d = 0
- Hₐ: μ_d > 0
- α = 0.05
PLAN:
- Paired t-test
- Random: Assume ✓
- Normal: n = 10, check plot of differences (assume ok) ✓
- Independent: Patients independent ✓
DO:
$$t = \frac{8.5 - 0}{4.2 / \sqrt{10}} \approx 6.40$$
df = 9
P-value = P(t ≥ 6.40) < 0.001
CONCLUDE: P-value < 0.05, reject H₀. Medication significantly reduces blood pressure.
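As a rough check on the DO step, the sketch below recomputes the test statistic and a one-sided P-value from the summary statistics alone. It takes the mean difference to be 8.5 (the midpoint of the 90% interval quoted later for this study) and approximates the t-distribution tail area with Simpson's rule; the helper names `t_pdf` and `upper_tail` are hypothetical, and a statistics package or calculator would normally do this step.

```python
from math import gamma, pi, sqrt

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def upper_tail(t, df, steps=2000):
    """Approximate P(T >= t) with Simpson's rule on [0, |t|] (steps even)."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3          # area between 0 and |t|
    p = 0.5 - area                # tail beyond |t|
    return p if t >= 0 else 1 - p

# Summary statistics for the blood pressure example.
d_bar, s_d, n = 8.5, 4.2, 10
t = d_bar / (s_d / sqrt(n))
p_value = upper_tail(t, n - 1)
print(f"t = {t:.2f}, df = {n - 1}, one-sided P-value = {p_value:.5f}")
```

The statistic comes out near 6.40 at full precision; rounding the standard error to 1.33 first gives 6.39. Either way the P-value is far below 0.001, so the conclusion is unchanged.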
Example 2: Matched Pairs
Twins study - Math scores (twin₁ gets tutoring, twin₂ doesn't):
n = 15 twin pairs
$\bar{d}$ = 5.2 (tutored - control)
s_d = 6.8
Test if tutoring helps:
STATE:
- μ_d = true mean difference (tutored - control)
- H₀: μ_d = 0
- Hₐ: μ_d > 0
- α = 0.05
DO:
$$t = \frac{5.2 - 0}{6.8 / \sqrt{15}} \approx 2.96$$
df = 14
P-value = P(t ≥ 2.96) ≈ 0.005
CONCLUDE: Reject H₀. Significant evidence tutoring increases scores.
Direction of Differences
Consistent subtraction order matters!
Common choices:
- Before - After (positive means decrease)
- After - Before (positive means increase)
- Treatment - Control (positive means treatment better)
Be consistent and interpret accordingly!
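A small illustration of why the order matters, using the two patients shown in the blood pressure table. Reversing the subtraction flips the sign of every difference, and therefore of the mean difference and the t statistic, so the alternative hypothesis has to be flipped as well.

```python
from statistics import mean

# The two rows shown in the blood pressure table.
before = [145, 152]
after = [138, 145]

before_minus_after = [b - a for b, a in zip(before, after)]
after_minus_before = [a - b for b, a in zip(before, after)]

# Positive here means BP went down ...
print(before_minus_after, mean(before_minus_after))
# ... while the same change shows up as negative in the other order.
print(after_minus_before, mean(after_minus_before))
```

With Before - After, "medication lowers BP" corresponds to Hₐ: μ_d > 0; with After - Before, the same claim is Hₐ: μ_d < 0.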
Advantages of Pairing
1. Controls for confounding variables
- Each subject is own control
- Eliminates between-subject variation
2. Increases power
- Reduced variability → easier to detect effects
- Can use smaller sample size
3. More efficient
- Need fewer total subjects than two independent samples
When NOT to Pair
Don't pair if:
- No natural pairing exists
- Pairing is artificial or forced
- Want to generalize to unpaired populations
Pairing must be meaningful and appropriate!
Paired CI
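Confidence interval for mean difference:
$$\bar{d} \pm t^* \frac{s_d}{\sqrt{n}}$$
where t* is the critical value from the t distribution with df = n - 1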
Interpretation: Range of plausible values for true mean difference
Example: Earlier BP study
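90% CI (df = 9, t* = 1.833):
$$8.5 \pm 1.833 \cdot \frac{4.2}{\sqrt{10}} \approx 8.5 \pm 2.43 = (6.07, 10.93)$$
We're 90% confident the true mean BP reduction is between 6.07 and 10.93 points.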
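The interval arithmetic can be reproduced with a short sketch. The function name `paired_ci` is hypothetical, the mean difference is taken as 8.5, and the critical value t* = 1.833 is the standard table value for 90% confidence with df = 9; it is passed in rather than computed.

```python
from math import sqrt

def paired_ci(d_bar, s_d, n, t_star):
    """Return (low, high) for the mean difference: d_bar +/- t* * s_d / sqrt(n)."""
    margin = t_star * s_d / sqrt(n)
    return d_bar - margin, d_bar + margin

# Blood pressure study: 10 pairs, so df = 9 and t* = 1.833 for 90% confidence.
low, high = paired_ci(d_bar=8.5, s_d=4.2, n=10, t_star=1.833)
print(f"90% CI: ({low:.2f}, {high:.2f})")
```

At full precision the endpoints are about 6.07 and 10.93. Rounding the standard error before multiplying can shift the last digit by one.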
Checking Normality of Differences
Important: Check normality of DIFFERENCES, not original data
Methods:
- Dotplot of differences
- Boxplot of differences
- Normal probability plot of differences
For small n: Must be close to normal
For large n (≥30): CLT applies to differences
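When no plotting tool is at hand, the points of a normal probability plot can be computed directly. The sketch below is one possible approach, not a formal normality test: it pairs the sorted differences with standard normal quantiles at positions (i - 0.5)/n (a common convention, assumed here) and reports their correlation. The differences are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean

def normal_plot_points(diffs):
    """Pair each sorted difference with its expected standard normal quantile."""
    n = len(diffs)
    ordered = sorted(diffs)
    quantiles = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(quantiles, ordered))

def correlation(points):
    """Pearson correlation of the (x, y) points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical differences from 10 pairs (illustrative values only).
diffs = [7, 2, 9, 14, 6, 8, 5, 11, 7, 10]
points = normal_plot_points(diffs)
r = correlation(points)
for z, d in points:
    print(f"{z:6.2f}  {d}")
print(f"correlation = {r:.3f}")
```

Points that fall close to a straight line (correlation near 1) are consistent with approximately normal differences. Strong curvature or an isolated extreme point is a warning sign when n is small.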
Common Mistakes
❌ Using two-sample t-test on paired data (loses power!)
❌ Using paired test on independent samples
❌ Counting total observations instead of pairs for df
❌ Not checking normality of differences
❌ Inconsistent subtraction order
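The first and third mistakes can be seen side by side in a small sketch. The readings are hypothetical, and the two-sample statistic uses the unpooled standard error as one common choice; both function names are made up for this illustration.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(x, y):
    """Correct analysis for paired data: one-sample t on the differences."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / sqrt(n)), n - 1

def two_sample_t(x, y):
    """Two-sample t statistic (unpooled), which ignores the pairing."""
    se = sqrt(stdev(x) ** 2 / len(x) + stdev(y) ** 2 / len(y))
    return (mean(x) - mean(y)) / se

# Hypothetical before/after readings on the same 5 people.
before = [120, 135, 150, 165, 180]
after = [115, 131, 144, 160, 174]

t_paired, df_paired = paired_t(before, after)
t_two = two_sample_t(before, after)
print(f"paired:     t = {t_paired:.2f}, df = {df_paired}")
print(f"two-sample: t = {t_two:.2f}")
```

The same data give a t statistic near 13.9 when analyzed as pairs and about 0.35 when treated as independent samples, because the two-sample standard error includes all of the person-to-person variation. The paired df is 4 (5 pairs), not 9.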
Identifying Paired Data
Ask yourself:
- Are there two measurements per subject?
- Is there natural pairing/matching?
- Would it make sense to calculate differences?
If yes → Paired data
If no → Independent samples
Calculator Commands (TI-83/84)
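Method 1: Enter differences directly
- Calculate differences by hand, enter them in a list
- STAT → TESTS → 2:T-Test
- Set Inpt: Data, μ₀: 0, and List: the difference list
Method 2: Let the calculator compute the differences
- Enter both measurements in separate lists
- On the home screen, store list₁ - list₂ in a third list (e.g., L₁ - L₂ → L₃)
- STAT → TESTS → 2:T-Test with μ₀: 0 and List: the third list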
Real-World Applications
Medical: Before/after treatment
Education: Pre-test/post-test
Psychology: Same subjects under different conditions
Agriculture: Adjacent plots (control for soil variation)
Marketing: Same consumers rating two products
Quick Reference
Key idea: Analyze differences, not separate groups
Test statistic: $t = \dfrac{\bar{d}}{s_d / \sqrt{n}}$, df = n - 1
n = number of pairs (not total measurements)
Conditions: Random pairs, differences normal (or n ≥ 30), pairs independent
Remember: Pairing is powerful! Use it when available. Analyze differences with one-sample t-test. Don't use two-sample test on paired data!