Confidence Intervals for Means
Estimating population means using t-distributions
Why t-Distribution?
Problem: Population σ usually unknown
Solution: Use sample standard deviation s, but this adds uncertainty
Result: Use t-distribution instead of normal
t-distribution:
- Similar to normal (symmetric, bell-shaped)
- Heavier tails (accounts for extra uncertainty from using s)
- Depends on degrees of freedom (df = n - 1)
- As df increases, approaches normal
One-Sample t-Interval for Mean
Formula: x̄ ± t* · (s/√n)
Where:
- x̄ = sample mean
- s = sample standard deviation
- n = sample size
- t* = critical value from t-distribution with df = n - 1
Conditions for t-Interval
Random: Random sample
Normal: Population approximately normal OR n ≥ 30 (CLT)
Independent: n < 10% of population (if sampling without replacement)
For normality:
- If n < 15: Data must be very close to normal (check with plot)
- If 15 ≤ n < 30: Data should be roughly symmetric, no outliers
- If n ≥ 30: Can proceed unless severe outliers or extreme skew
Finding t* Critical Value
Calculator: invT(area to left, df)
Example: 95% CI with n = 20 (df = 19)
- Area to left = (1 + 0.95)/2 = 0.975
- invT(0.975, 19) ≈ 2.093
Table: Look up df and confidence level
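For readers without a calculator or table, the lookup can be sketched in plain Python. This is only an illustration: `t_pdf`, `t_cdf`, and `inv_t` are hypothetical helper names (not a library API), the CDF is approximated with Simpson's rule, and the inverse is found by bisection. It relies on `math.gamma`, so it is only suitable for moderate df (a few hundred at most).

```python
import math

def t_pdf(x: float, df: int) -> float:
    """Density of Student's t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x: float, df: int, steps: int = 2000) -> float:
    """Approximate P(T <= x) with Simpson's rule on [0, |x|] plus symmetry."""
    a = abs(x)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def inv_t(area_left: float, df: int) -> float:
    """Bisection search for the t value with the given area to its left."""
    lo, hi = -100.0, 100.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < area_left:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# 95% CI with n = 20: area to left = (1 + 0.95)/2 = 0.975, df = 19
t_star = inv_t(0.975, 19)
print(round(t_star, 3))
```

The argument order mirrors invT(area to left, df), so the same area-to-left calculation applies.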
Example 1: Simple t-Interval
Test scores: n = 25, x̄ = 78, s = 12
95% CI:
Conditions:
- Random: Assume ✓
- Normal: n = 25, assume roughly normal ✓
- Independent: 25 < 10% of students ✓
Calculate:
- df = 25 - 1 = 24
- t* = 2.064 (from table/calculator)
- SE = 12/√25 = 2.4
- ME = t* × SE = 2.064 × 2.4 ≈ 4.95
- CI = 78 ± 4.95 = (73.05, 82.95)
Interpretation: We are 95% confident the true mean score is between 73.05 and 82.95.
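The arithmetic in this example can be checked with a short Python sketch. The function name `t_interval` is a hypothetical helper, and t* is passed in by hand (2.064 for df = 24) rather than computed.

```python
import math

def t_interval(xbar: float, s: float, n: int, t_star: float) -> tuple[float, float]:
    """One-sample t-interval: xbar ± t* · s/√n."""
    se = s / math.sqrt(n)   # standard error of the mean
    me = t_star * se        # margin of error
    return xbar - me, xbar + me

# Example 1: n = 25, x̄ = 78, s = 12, t* = 2.064 (df = 24, 95%)
low, high = t_interval(78, 12, 25, 2.064)
print(round(low, 2), round(high, 2))
```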
t vs z
Use z when:
- Known population σ (rare!)
- Working with proportions
Use t when:
- Unknown σ, using sample s (almost always for means!)
Key difference: t has heavier tails → wider intervals (more conservative)
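To see how much wider a t-interval is than a z-interval with the same standard error, the ratio t*/z* can be tabulated. This sketch uses Python's standard `statistics.NormalDist` for z* and a few standard 95% table values of t* typed in by hand; the dictionary name `t_stars` is just illustrative.

```python
from statistics import NormalDist

# z* for 95% confidence: area to the left = 0.975
z_star = NormalDist().inv_cdf(0.975)

# 95% t* values from a standard t-table, keyed by degrees of freedom
t_stars = {5: 2.571, 10: 2.228, 20: 2.086, 30: 2.042}

# Ratio of margins of error (t-interval vs z-interval) for the same SE
ratios = {df: t / z_star for df, t in t_stars.items()}
for df, r in ratios.items():
    print(f"df = {df:>2}: t*/z* = {r:.3f}")
```

The ratio is always above 1 and shrinks toward 1 as df grows, which is the "heavier tails → wider intervals" point in numeric form.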
Sample Size for Desired ME
Challenge: ME depends on s, which we don't know in advance
Approach:
- Estimate s from pilot study or similar data
- Use conservative t* (larger than final value)
- Calculate n
- Round up
Formula: n ≥ (t* · s / ME)², rounded up to the next whole number
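A minimal sketch of this calculation, assuming the usual rearrangement of ME = t* · s/√n into n ≥ (t* · s / ME)². The inputs below (a pilot estimate s = 12, a target ME of 3, and a conservative t* of 2.064) are made-up illustration values, and `sample_size` is a hypothetical helper name.

```python
import math

def sample_size(s_guess: float, target_me: float, t_star: float) -> int:
    """Smallest whole n with t* · s/√n <= target ME, for a guessed s and t*."""
    return math.ceil((t_star * s_guess / target_me) ** 2)

# Hypothetical planning values: s ≈ 12 from a pilot study, want ME ≤ 3
n = sample_size(12, 3, 2.064)
print(n)
```

Because t* depends on the final n, one reasonable refinement is to recompute t* for the resulting df and repeat until n stops changing.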
Example 2: Checking Normality
Small sample (n = 12):
- MUST check for approximate normality
- Use dotplot, boxplot, or normal probability plot
- Look for: symmetric shape, no outliers, no severe skew
If data skewed or has outliers with small n: t-procedures NOT appropriate
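Plots are the primary check, but the outlier part can also be screened numerically. The sketch below applies the common 1.5 × IQR boxplot rule using the standard `statistics.quantiles` function; the data set and the helper name `boxplot_outliers` are hypothetical, and quartile conventions differ slightly between tools.

```python
import statistics

def boxplot_outliers(data: list[float]) -> list[float]:
    """Return values more than 1.5 * IQR below Q1 or above Q3 (boxplot rule)."""
    q1, _, q3 = statistics.quantiles(data, n=4)
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [x for x in data if x < low_fence or x > high_fence]

# Hypothetical small sample (n = 12) with one suspicious value
sample = [12, 14, 15, 15, 16, 17, 18, 18, 19, 20, 21, 45]
outliers = boxplot_outliers(sample)
print(outliers)
```

A flagged value with n this small is a signal to look at the plot and reconsider using a t-procedure, not an automatic reason to delete the observation.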
Two-Sample t-Interval
Comparing two means: (x̄₁ - x̄₂) ± t* · √(s₁²/n₁ + s₂²/n₂)
df: Use calculator (complex formula) or conservative: min(n₁-1, n₂-1)
Conditions: Both samples meet conditions
Interpretation: If the interval contains 0, there is no statistically significant difference between the means at that confidence level
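A sketch of the two-sample calculation, assuming the standard unpooled form (x̄₁ - x̄₂) ± t* · √(s₁²/n₁ + s₂²/n₂). The summary statistics are invented for illustration, the function names are hypothetical, and `welch_df` implements the Welch-Satterthwaite approximation, which is presumably the "complex formula" a calculator uses.

```python
import math

def two_sample_t_interval(x1, s1, n1, x2, s2, n2, t_star):
    """Unpooled two-sample t-interval for the difference of means (mean 1 - mean 2)."""
    se = math.sqrt(s1**2 / n1 + s2**2 / n2)
    me = t_star * se
    diff = x1 - x2
    return diff - me, diff + me

def conservative_df(n1: int, n2: int) -> int:
    """Conservative degrees of freedom: the smaller of n1 - 1 and n2 - 1."""
    return min(n1 - 1, n2 - 1)

def welch_df(s1, n1, s2, n2) -> float:
    """Welch-Satterthwaite approximate degrees of freedom."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

# Hypothetical data: group 1 (x̄ = 85, s = 10, n = 20), group 2 (x̄ = 80, s = 12, n = 25)
df_cons = conservative_df(20, 25)                 # 19
low, high = two_sample_t_interval(85, 10, 20, 80, 12, 25, t_star=2.093)  # 95%, df = 19
print(df_cons, round(welch_df(10, 20, 12, 25), 1))
print(round(low, 2), round(high, 2))
```

With these made-up numbers the interval straddles 0, so the observed 5-point difference would not count as significant at the 95% level.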
Paired Data
When data naturally paired:
- Before/after on same subjects
- Twins, matched pairs
Analyze differences:
- Calculate difference for each pair: d = x₁ - x₂
- One-sample t-interval on differences: d̄ ± t* · (s_d/√n)
Where d̄ = mean of the differences, s_d = standard deviation of the differences, n = number of pairs, df = n - 1
Example 3: Paired Data
Blood pressure before/after medication (n = 15 patients):
- d̄ = 8.2 (average decrease)
- s_d = 5.1
90% CI for mean decrease:
- df = 14
- t* = 1.761
- SE = 5.1/√15 ≈ 1.317
- ME = t* × SE = 1.761 × 1.317 ≈ 2.32
- CI = 8.2 ± 2.32 = (5.88, 10.52)
Interpretation: We are 90% confident medication reduces blood pressure by 5.88 to 10.52 points on average.
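When raw paired measurements are available rather than summary statistics, the same procedure starts from the per-pair differences. The sketch below uses a small invented data set of six patients (not the 15 from the example) and the standard 95% table value t* = 2.571 for df = 5.

```python
import math
import statistics

# Hypothetical before/after readings for 6 patients
before = [150, 142, 160, 155, 148, 152]
after = [141, 138, 150, 149, 140, 146]

# Step 1: one difference per pair (before - after = decrease)
diffs = [b - a for b, a in zip(before, after)]

# Step 2: one-sample t-interval on the differences
n = len(diffs)
d_bar = statistics.mean(diffs)
s_d = statistics.stdev(diffs)   # sample standard deviation (divides by n - 1)
t_star = 2.571                  # 95% confidence, df = n - 1 = 5
me = t_star * s_d / math.sqrt(n)
low, high = d_bar - me, d_bar + me
print(round(low, 2), round(high, 2))
```

Treating the two columns as independent samples would ignore the pairing and typically give a much wider interval.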
Interpreting Confidence Level
Same as for proportions:
95% means if we repeated sampling many times, about 95% of intervals would contain true μ
NOT: "95% of data in interval" or "95% chance μ in interval"
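The repeated-sampling interpretation can be illustrated with a small simulation. This is a sketch under assumed conditions: a normal population with μ = 78 and σ = 12 (chosen arbitrarily), samples of n = 25, and t* = 2.064 for a 95% interval. The function name `coverage` is hypothetical.

```python
import math
import random
import statistics

def coverage(mu=78.0, sigma=12.0, n=25, t_star=2.064, reps=4000, seed=1):
    """Fraction of simulated 95% t-intervals that contain the true mean mu."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = statistics.mean(sample)
        me = t_star * statistics.stdev(sample) / math.sqrt(n)
        if xbar - me <= mu <= xbar + me:
            hits += 1
    return hits / reps

rate = coverage()
print(f"{rate:.3f}")
```

The printed fraction should land close to 0.95. Each individual interval either contains μ or it does not; the 95% describes the long-run success rate of the method.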
Effect of Sample Size
Larger n:
- Smaller SE (dividing by √n)
- More df → smaller t* (approaches z*)
- Result: Narrower CI (more precise)
Trade-off: Cost and time of collecting larger sample
Robustness of t-Procedures
t-procedures fairly robust to violations of normality if:
- n reasonably large (≥ 30)
- No extreme outliers
Less robust for:
- Small samples with skewness
- Extreme outliers (affect both x̄ and s)
Calculator Commands (TI-83/84)
STAT → TESTS → 8:TInterval
Enter:
- Data or Stats
- If Stats: x̄, s, n
- C-Level
- Calculate
For two-sample: 0:2-SampTInt
Common Mistakes
❌ Using z* instead of t*
❌ Using t* from wrong df
❌ Not checking normality with small samples
❌ Confusing paired with two-sample
❌ Misinterpreting confidence level
Quick Reference
Formula: x̄ ± t* · (s/√n) with df = n - 1
Conditions: Random, approximately normal (or n ≥ 30), independent
Use t (not z) when σ unknown, using s
Paired data: Analyze differences with one-sample t
Remember: t-distribution accounts for extra uncertainty from estimating σ with s. Always check conditions, especially normality for small samples!
📚 Practice Problems
Problem 1 (easy)
❓ Question:
A random sample of 25 students has a mean study time of 18.5 hours per week with a standard deviation of 4.2 hours. Construct a 95% confidence interval for the mean study time. Assume the population is approximately normal.
Solution:
Step 1: Identify given information
- n = 25
- x̄ = 18.5 hours
- s = 4.2 hours
- Confidence level = 95%
Step 2: Check conditions
- Random: Random sample ✓
- Normal: Population approximately normal (given) ✓ (since n = 25 < 30, this assumption is needed)
- Independent: Assume n ≤ 0.10N ✓
Step 3: Use t-distribution (not z)
We use t because σ is unknown (we only have s), even though normality is assumed.
Degrees of freedom: df = n - 1 = 24
Step 4: Find t* critical value
From t-table with df = 24, 95% confidence: t* = 2.064
Step 5: Calculate standard error
SE = s/√n = 4.2/√25 = 4.2/5 = 0.84
Step 6: Calculate margin of error
ME = t* × SE = 2.064 × 0.84 ≈ 1.73
Step 7: Construct confidence interval
CI = x̄ ± ME = 18.5 ± 1.73 = (16.77, 20.23) ≈ (16.8, 20.2)
Step 8: Interpret
We are 95% confident that the true mean study time for all students is between 16.8 and 20.2 hours per week.
Answer: 95% CI: (16.8, 20.2) hours
We use the t-distribution because the population standard deviation is unknown.
Problem 2 (easy)
❓ Question:
Why do we use the t-distribution instead of the z-distribution for confidence intervals for means?
Solution:
Step 1: Understand the key difference
- Z-distribution: Used when σ (population SD) is known
- T-distribution: Used when σ is unknown and estimated by s (sample SD)
Step 2: Why σ is usually unknown
In practice:
- Rarely know true population standard deviation
- If we knew σ, we'd probably know μ too!
- Almost always must estimate from sample
Step 3: What using s instead of σ does
Using s adds extra variability:
- s varies from sample to sample
- s is random, σ is fixed
- More uncertainty → wider intervals
Step 4: T-distribution accounts for this
T-distribution has:
- Heavier tails than normal
- More probability in extremes
- Depends on sample size (df = n-1)
This compensates for extra uncertainty from estimating σ
Step 5: Compare z and t
For 95% confidence:
- z* = 1.96 (always)
- t* depends on df:
  - df = 5: t* = 2.571 (much larger!)
  - df = 10: t* = 2.228
  - df = 20: t* = 2.086
  - df = 30: t* = 2.042
  - df → ∞: t* → 1.96 (approaches z*)
Step 6: As n increases
Small n:
- s is unreliable estimate of σ
- Need large t* for extra safety
- Wide intervals
Large n:
- s becomes good estimate of σ
- t* approaches z*
- T-distribution → Normal
Step 7: When to use which?
USE Z:
- σ known (rare!)
- Large sample (n ≥ 30) only as an approximation: with σ unknown, t is still the preferred choice
- Proportions (different formula)
USE T:
- σ unknown (almost always!)
- Small sample and population approximately normal
- Means with sample SD
Answer: Use t-distribution when σ is unknown and we must estimate it with s. The t-distribution has heavier tails to account for the extra uncertainty from estimating σ. As sample size increases, t approaches the normal distribution.
Problem 3 (medium)
❓ Question:
A researcher measures reaction times (in seconds) for 40 subjects: x̄ = 0.38s, s = 0.12s. Construct a 99% confidence interval for the mean reaction time.
Solution:
Step 1: Given information
- n = 40
- x̄ = 0.38 seconds
- s = 0.12 seconds
- Confidence level = 99%
Step 2: Check conditions
- Random: Assume random sample ✓
- Normal: n = 40 ≥ 30, can apply CLT ✓
- Independent: Assume 40 ≤ 0.10N ✓
Step 3: Find t* critical value
df = n - 1 = 39, 99% confidence
From t-table: t* ≈ 2.708 (or use calculator/software)
Step 4: Calculate SE
SE = s/√n = 0.12/√40 = 0.12/6.325 ≈ 0.0190
Step 5: Calculate ME
ME = t* × SE = 2.708 × 0.0190 ≈ 0.0514
Step 6: Construct CI
CI = 0.38 ± 0.051 = (0.329, 0.431) ≈ (0.33, 0.43) seconds
Step 7: Interpret
We are 99% confident that the true mean reaction time is between 0.33 and 0.43 seconds.
Answer: 99% CI: (0.33, 0.43) seconds
Problem 4 (medium)
❓ Question:
Compare 90%, 95%, and 99% confidence intervals for the same data: n = 36, x̄ = 50, s = 12. What happens to interval width as confidence level increases?
Solution:
Step 1: Set up
n = 36, x̄ = 50, s = 12, df = 35
Step 2: Calculate SE (same for all)
SE = s/√n = 12/√36 = 12/6 = 2
Step 3: Find t* values
- 90% CI: t* ≈ 1.690
- 95% CI: t* ≈ 2.030
- 99% CI: t* ≈ 2.724
Step 4: Calculate MEs
- ME₉₀ = 1.690 × 2 = 3.38
- ME₉₅ = 2.030 × 2 = 4.06
- ME₉₉ = 2.724 × 2 ≈ 5.45
Step 5: Construct intervals
- 90% CI: 50 ± 3.38 = (46.62, 53.38), width = 6.76
- 95% CI: 50 ± 4.06 = (45.94, 54.06), width = 8.12
- 99% CI: 50 ± 5.45 = (44.55, 55.45), width = 10.90
Step 6: Compare widths
- 90% → 95%: width increases by about 20%
- 95% → 99%: width increases by about 34%
- 90% → 99%: width increases by about 61%
Higher confidence = wider interval!
Step 7: The tradeoff
Higher confidence level:
- More confident interval captures μ
- Less precise (wider interval)
Lower confidence level:
- More precise (narrower interval)
- Less confident interval captures μ
For a fixed sample size, we cannot have both high confidence AND high precision!
Answer:
- 90% CI: (46.6, 53.4)
- 95% CI: (45.9, 54.1)
- 99% CI: (44.6, 55.4)
As confidence increases from 90% to 99%, interval width increases by 61%. This is the precision-confidence tradeoff.
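The comparison can be reproduced in a few lines of Python. This is a sketch that takes the three t* values for df = 35 as given (typed in from the table) rather than computing them.

```python
import math

n, xbar, s = 36, 50, 12
se = s / math.sqrt(n)  # 12/6 = 2, the same for every confidence level

# t* for df = 35 at each confidence level, taken from a t-table
t_stars = {90: 1.690, 95: 2.030, 99: 2.724}

widths = {}
for level, t_star in t_stars.items():
    me = t_star * se
    widths[level] = 2 * me
    print(f"{level}% CI: ({xbar - me:.2f}, {xbar + me:.2f}), width = {widths[level]:.2f}")

# Relative growth in width from 90% to 99% confidence
growth = widths[99] / widths[90] - 1
print(f"{growth:.0%}")
```

Since SE is fixed, the width ratio is just the ratio of the t* values, which is why the growth does not depend on x̄ or s.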
Problem 5 (hard)
❓ Question:
A 95% CI for mean weight is (150, 170) lbs based on n = 25. If we want to cut the margin of error in half with the same confidence level, what sample size is needed?
Solution:
Step 1: Find current ME
CI = (150, 170)
Width = 170 - 150 = 20
ME = width/2 = 10 lbs
Step 2: Want new ME
ME_new = 10/2 = 5 lbs
Step 3: Understand ME formula
ME = t* × (s/√n)
For the same confidence level and approximately the same s (and treating t* as roughly constant): ME ∝ 1/√n
Step 4: Set up proportion
ME₁/ME₂ = √(n₂/n₁)
10/5 = √(n₂/25)
2 = √(n₂/25)
4 = n₂/25
n₂ = 100
Step 5: Why quadruple?
To halve ME, must quadruple n:
- ME ∝ 1/√n
- Half the ME → √n must double
- If √n doubles, n must quadruple
General rule:
- To reduce ME by factor k → multiply n by k²
- To halve ME (k=2) → multiply n by 4
- To cut ME to a third (k=3) → multiply n by 9
Step 6: Verify
Original: ME = t* × s/√25 = t* × s/5
New: ME = t* × s/√100 = t* × s/10
Ratio: (t* × s/5)/(t* × s/10) = 10/5 = 2 ✓
ME is halved (in practice slightly more than halved, because t* is a little smaller at df = 99 than at df = 24).
Answer: n = 100
Need to quadruple the sample size from 25 to 100 to halve the margin of error. This is because ME ∝ 1/√n, so halving ME requires quadrupling n.
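The proportionality argument treats t* as constant. A quick numeric check, sketched below, shows what happens when the change in t* is included. It assumes s stays the same and uses the standard 95% table values t* = 2.064 (df = 24) and t* ≈ 1.984 (df = 99); the implied s is backed out from the original margin of error.

```python
import math

t_24, t_99 = 2.064, 1.984   # 95% critical values for df = 24 and df = 99
old_me = 10.0               # from the interval (150, 170)

# Back out the sample SD implied by ME = t* · s/√n with n = 25
s = old_me * math.sqrt(25) / t_24

# Margin of error with n = 100, same s, updated t*
new_me = t_99 * s / math.sqrt(100)
print(round(s, 2), round(new_me, 2))
```

Under these assumptions the new margin comes out a little below 5 lbs, so n = 100 meets the target with a small cushion.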