Confidence Intervals for Means

Estimating population means using t-distributions

Why t-Distribution?

Problem: Population σ usually unknown

Solution: Use sample standard deviation s, but this adds uncertainty

Result: Use t-distribution instead of normal

t-distribution:

  • Similar to normal (symmetric, bell-shaped)
  • Heavier tails (accounts for extra uncertainty from using s)
  • Depends on degrees of freedom (df = n - 1)
  • As df increases, approaches normal

One-Sample t-Interval for Mean

Formula:

$$\bar{x} \pm t^* \frac{s}{\sqrt{n}}$$

Where:

  • $\bar{x}$ = sample mean
  • s = sample standard deviation
  • n = sample size
  • t* = critical value from t-distribution with df = n - 1

Conditions for t-Interval

Random: Random sample
Normal: Population approximately normal OR n ≥ 30 (CLT)
Independent: n < 10% of population (if sampling without replacement)

For normality:

  • If n < 15: Data must be very close to normal (check with plot)
  • If 15 ≤ n < 30: Data should be roughly symmetric, no outliers
  • If n ≥ 30: Can proceed unless severe outliers or extreme skew

Finding t* Critical Value

Calculator: invT(area to left, df)

Example: 95% CI with n = 20 (df = 19)

  • Area to left = (1 + 0.95)/2 = 0.975
  • invT(0.975, 19) ≈ 2.093

Table: Look up df and confidence level
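
Without a calculator or table, t* can be approximated in Python with the standard library alone. The sketch below is one possible approach (Simpson's rule for the area under the t density, then a bisection search), not a description of how the calculator computes invT. The names t_pdf, t_cdf, inv_t, and t_crit are made up for this illustration.

```python
import math
from statistics import NormalDist

def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=1000):
    """Area to the left of x: 0.5 plus or minus the Simpson's-rule area on [0, |x|]."""
    a = abs(x)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def inv_t(area_left, df):
    """Bisection search for the t value with the given area to its left."""
    lo, hi = -100.0, 100.0
    for _ in range(50):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < area_left:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def t_crit(conf_level, df):
    """t* for a two-sided interval: area to the left is (1 + C) / 2."""
    return inv_t((1 + conf_level) / 2, df)

print(f"t* (95%, df = 19): {t_crit(0.95, 19):.3f}")  # about 2.093, as in the example

# As df grows, t* shrinks toward the normal critical value z*.
z_star = NormalDist().inv_cdf(0.975)
for df in (5, 19, 100, 1000):
    print(f"df = {df:4d}  t* = {t_crit(0.95, df):.3f}  (z* = {z_star:.3f})")
```

The loop at the end illustrates the earlier point that the t-distribution approaches the normal as df increases.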

Example 1: Simple t-Interval

Test scores: n = 25, $\bar{x}$ = 78, s = 12

95% CI:

Conditions:

  • Random: Assume ✓
  • Normal: n = 25 is below 30, so assume the data are roughly symmetric with no outliers ✓
  • Independent: 25 < 10% of students ✓

Calculate:

  • df = 25 - 1 = 24
  • t* = 2.064 (from table/calculator)
  • SE = 12/√25 = 2.4

$$CI = 78 \pm 2.064(2.4) = 78 \pm 4.95$$

$$(73.05, 82.95)$$

Interpretation: We are 95% confident the true mean score is between 73.05 and 82.95.
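
The arithmetic in Example 1 can be checked with a few lines of Python. This is a minimal sketch: the helper name t_interval is made up for illustration, and t* is typed in from the table value above rather than computed.

```python
import math

def t_interval(xbar, s, n, t_star):
    """Return (low, high) for xbar ± t* · s/√n."""
    se = s / math.sqrt(n)      # standard error of the mean
    margin = t_star * se       # margin of error
    return xbar - margin, xbar + margin

# Example 1: n = 25, sample mean 78, s = 12, t* = 2.064 (df = 24)
low, high = t_interval(xbar=78, s=12, n=25, t_star=2.064)
print(f"({low:.2f}, {high:.2f})")  # (73.05, 82.95)
```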

t vs z

Use z when:

  • Known population σ (rare!)
  • Working with proportions

Use t when:

  • Unknown σ, using sample s (almost always for means!)

Key difference: t has heavier tails → wider intervals (more conservative)

Sample Size for Desired ME

Challenge: ME depends on s, which we don't know in advance

Approach:

  1. Estimate s from pilot study or similar data
  2. Use conservative t* (from a small df, so it is at least as large as the final value)
  3. Calculate n
  4. Round up

Formula:

$$n = \left(\frac{t^* s}{m}\right)^2$$

Where m = desired margin of error (ME)
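
A short sketch of the sample-size calculation follows. All inputs are hypothetical: the guess s = 12 is borrowed from Example 1, the target margin of 3 points is invented, and t* = 2.093 (the df = 19 value) stands in as a conservative choice. The function name sample_size is also made up.

```python
import math

def sample_size(t_star, s_guess, margin):
    """Smallest whole n from n = (t* · s / m)^2, rounded up."""
    return math.ceil((t_star * s_guess / margin) ** 2)

n = sample_size(t_star=2.093, s_guess=12, margin=3)
print(n)  # 71, since (2.093 * 12 / 3)^2 is about 70.09
```

Because the resulting n is well above 20, the true t* will be smaller than 2.093, so the realized margin of error should come in at or below the target if the guess for s holds.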

Example 2: Checking Normality

Small sample (n = 12):

  • MUST check for approximate normality
  • Use dotplot, boxplot, or normal probability plot
  • Look for: symmetric shape, no outliers, no severe skew

If data skewed or has outliers with small n: t-procedures NOT appropriate
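
A plot is the main tool here, but a quick numeric screen can support it. The sketch below flags outliers with the common 1.5 × IQR rule and compares the mean with the median as a rough skew signal. Both the rule and the data are assumptions for illustration: the notes do not prescribe a specific outlier rule, and the 12 scores are invented.

```python
import statistics

def iqr_outliers(data):
    """Values beyond 1.5 × IQR from the quartiles (a common convention)."""
    q1, _, q3 = statistics.quantiles(data, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [x for x in data if x < low or x > high]

# Hypothetical sample of n = 12 scores with one suspicious value
scores = [71, 74, 75, 76, 78, 79, 80, 81, 82, 84, 85, 120]

outliers = iqr_outliers(scores)
print("outliers:", outliers)
print("mean:", round(statistics.mean(scores), 2),
      "median:", statistics.median(scores))

if outliers:
    print("Outliers with small n: t-procedures are questionable.")
```

A mean noticeably above the median, as in this made-up sample, is consistent with right skew caused by the flagged value.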

Two-Sample t-Interval

Comparing two means:

$$(\bar{x}_1 - \bar{x}_2) \pm t^* \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

df: Use calculator (complex formula) or conservative: min(n₁-1, n₂-1)

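Conditions: Both samples must meet the one-sample conditions (Random, Normal, Independent), and the two samples must be independent of each other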

Interpretation: If the interval contains 0, the data do not show a significant difference between the two means at that confidence level
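
The two df options can be compared in code. The "complex formula" is commonly the Welch-Satterthwaite approximation; the notes do not name it, so treat that identification as an assumption. The function names and the summary statistics below are hypothetical.

```python
import math

def two_sample_se(s1, n1, s2, n2):
    """Standard error of the difference in sample means."""
    return math.sqrt(s1**2 / n1 + s2**2 / n2)

def welch_df(s1, n1, s2, n2):
    """Welch-Satterthwaite approximate degrees of freedom."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

def conservative_df(n1, n2):
    """The simpler fallback: the smaller of n1 - 1 and n2 - 1."""
    return min(n1 - 1, n2 - 1)

# Hypothetical summary statistics for two groups
s1, n1, s2, n2 = 12, 25, 9, 16
print("SE:", round(two_sample_se(s1, n1, s2, n2), 3))
print("Welch df:", round(welch_df(s1, n1, s2, n2), 2))
print("Conservative df:", conservative_df(n1, n2))
```

The conservative df is never larger than the Welch value, so it gives a larger t* and a wider interval.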

Paired Data

When data naturally paired:

  • Before/after on same subjects
  • Twins, matched pairs

Analyze differences:

  1. Calculate difference for each pair: d = x₁ - x₂
  2. One-sample t-interval on differences

$$\bar{d} \pm t^* \frac{s_d}{\sqrt{n}}$$

Where n = number of pairs, df = n - 1
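
The two steps above can be sketched in Python. The five before/after readings are invented purely to show the mechanics, the function name paired_t_interval is hypothetical, and t* = 2.132 is the 90% table value for df = 4.

```python
import math
import statistics

def paired_t_interval(x1, x2, t_star):
    """One-sample t-interval on the pairwise differences d = x1 - x2."""
    diffs = [a - b for a, b in zip(x1, x2)]    # step 1: one difference per pair
    d_bar = statistics.mean(diffs)
    s_d = statistics.stdev(diffs)              # sample SD of the differences
    margin = t_star * s_d / math.sqrt(len(diffs))
    return d_bar - margin, d_bar + margin      # step 2: d-bar ± t* · s_d/√n

# Hypothetical readings for 5 subjects (df = 4)
before = [150, 142, 138, 160, 155]
after = [141, 136, 131, 150, 148]

low, high = paired_t_interval(before, after, t_star=2.132)
print(f"({low:.2f}, {high:.2f})")
```

Note that only the differences enter the calculation; the two original columns are never treated as separate samples.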

Example 3: Paired Data

Blood pressure before/after medication (n = 15 patients):

  • $\bar{d}$ = 8.2 (average decrease)
  • s_d = 5.1

90% CI for mean decrease:

  • df = 14
  • t* = 1.761
  • SE = 5.1/√15 ≈ 1.317

$$CI = 8.2 \pm 1.761(1.317) = 8.2 \pm 2.32$$

$$(5.88, 10.52)$$

Interpretation: We are 90% confident medication reduces blood pressure by 5.88 to 10.52 points on average.

Interpreting Confidence Level

Same as for proportions:

95% means if we repeated sampling many times, about 95% of intervals would contain true μ

NOT: "95% of data in interval" or "95% chance μ in interval"
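
The "repeated sampling" reading can be made concrete with a small simulation. This is a sketch under stated assumptions: the population (normal, mean 50, SD 10) is invented, and t* = 2.093 is the 95% value for df = 19 from the earlier example. In practice μ is unknown; it is fixed here only so coverage can be counted.

```python
import math
import random
import statistics

MU, SIGMA = 50.0, 10.0     # hypothetical "true" population values
N, T_STAR = 20, 2.093      # sample size and t* for 95%, df = 19
TRIALS = 5000

rng = random.Random(1)     # fixed seed so the run is repeatable
hits = 0
for _ in range(TRIALS):
    sample = [rng.gauss(MU, SIGMA) for _ in range(N)]
    xbar = statistics.mean(sample)
    se = statistics.stdev(sample) / math.sqrt(N)
    # Does this particular interval capture the true mean?
    if xbar - T_STAR * se <= MU <= xbar + T_STAR * se:
        hits += 1

coverage = hits / TRIALS
print(f"Share of intervals containing mu: {coverage:.3f}")
```

The share should land close to 0.95. Each individual interval either contains μ or it does not; the 95% describes the method's long-run success rate.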

Effect of Sample Size

Larger n:

  • Smaller SE (dividing by √n)
  • More df → smaller t* (approaches z*)
  • Result: Narrower CI (more precise)

Trade-off: Cost and time of collecting larger sample
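
A few lines of Python show the √n effect on the standard error. The value s = 12 is borrowed from Example 1 and held fixed for illustration; in real data s would change from sample to sample. The helper name standard_error is made up.

```python
import math

def standard_error(s, n):
    """SE of the sample mean: s / √n."""
    return s / math.sqrt(n)

s = 12  # held fixed for illustration
for n in (25, 100, 400):
    print(f"n = {n:3d}  SE = {standard_error(s, n):.2f}")
```

Quadrupling n only halves the SE, which is why extra precision gets expensive quickly.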

Robustness of t-Procedures

t-procedures fairly robust to violations of normality if:

  • n reasonably large (≥ 30)
  • No extreme outliers

Less robust for:

  • Small samples with skewness
  • Extreme outliers (affect both $\bar{x}$ and s)

Calculator Commands (TI-83/84)

STAT → TESTS → 8:TInterval

Enter:

  • Data or Stats
  • If Stats: $\bar{x}$, s, n
  • C-Level
  • Calculate

For two-sample: 0:2-SampTInt

Common Mistakes

❌ Using z* instead of t*
❌ Using t* from wrong df
❌ Not checking normality with small samples
❌ Confusing paired with two-sample
❌ Misinterpreting confidence level

Quick Reference

Formula: $\bar{x} \pm t^* \frac{s}{\sqrt{n}}$ with df = n - 1

Conditions: Random, approximately normal (or n ≥ 30), independent

Use t (not z) when σ unknown, using s

Paired data: Analyze differences with one-sample t

Remember: t-distribution accounts for extra uncertainty from estimating σ with s. Always check conditions, especially normality for small samples!
