Tests for Means

One-sample and two-sample t-tests

Hypothesis Tests for Means

One-Sample t-Test

Test: Does the sample provide evidence that the population mean differs from a claimed value?

Hypotheses:

  • H₀: μ = μ₀
  • Hₐ: μ ≠ μ₀ (or μ > μ₀ or μ < μ₀)

Conditions:

  • Random sample
  • Population approximately normal OR n ≥ 30 (CLT)
  • n < 10% of population

Test Statistic:

t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}

df = n - 1

P-Value for t-Test

Use t-distribution with df = n - 1

Two-sided: P(|t| ≥ observed)
Right-sided: P(t ≥ observed)
Left-sided: P(t ≤ observed)

Calculator: tcdf
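In Python, the same tail areas can be computed with SciPy's t-distribution (a minimal sketch; SciPy is an assumption here, the notes only mention the calculator's tcdf):

```python
from scipy import stats

t_obs = 2.67   # observed t statistic (any value)
df = 24        # degrees of freedom, n - 1

p_two_sided = 2 * stats.t.sf(abs(t_obs), df)  # P(|t| >= |t_obs|)
p_right     = stats.t.sf(t_obs, df)           # P(t >= t_obs)
p_left      = stats.t.cdf(t_obs, df)          # P(t <= t_obs)
print(p_two_sided, p_right, p_left)
```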

Example 1: One-Sample t-Test

Company claims mean wait time is 5 minutes. Sample: n = 25, x̄ = 5.8, s = 1.5. Test at α = 0.05.

STATE:

  • μ = true mean wait time
  • H₀: μ = 5
  • Hₐ: μ ≠ 5
  • α = 0.05

PLAN:

  • One-sample t-test
  • Random: Assume ✓
  • Normal: n = 25 < 30; assume wait times are roughly normal with no outliers ✓
  • Independent: 25 < 10% of all customers ✓

DO:

t = \frac{5.8 - 5}{1.5/\sqrt{25}} = \frac{0.8}{0.3} \approx 2.67

df = 24

P-value = 2 × P(t ≥ 2.67) ≈ 2(0.0067) ≈ 0.013

CONCLUDE: P-value = 0.013 < 0.05, reject H₀. Sufficient evidence that the mean wait time differs from 5 minutes.
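A quick numerical check of this example in Python (a sketch assuming SciPy is installed; it mirrors the hand calculation above, not a prescribed method from the notes):

```python
from math import sqrt
from scipy import stats

n, xbar, s, mu0 = 25, 5.8, 1.5, 5.0

se = s / sqrt(n)                           # standard error = 0.3
t_obs = (xbar - mu0) / se                  # ≈ 2.67
df = n - 1                                 # 24
p_value = 2 * stats.t.sf(abs(t_obs), df)   # two-sided, ≈ 0.013

print(t_obs, p_value)  # p < 0.05, so reject H₀
```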

Two-Sample t-Test

Compares the means of two independent groups.

Hypotheses:

  • H₀: μ₁ = μ₂ (or μ₁ - μ₂ = 0)
  • Hₐ: μ₁ ≠ μ₂ (or μ₁ > μ₂ or μ₁ < μ₂)

Test Statistic:

t = \frac{(\bar{x}_1 - \bar{x}_2) - 0}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}

df: Use calculator (Welch's approximation, shown below) or the conservative min(n₁ - 1, n₂ - 1)

Note: Do NOT pool the variances (unlike the two-proportion z-test, which pools under H₀)
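For reference, the Welch–Satterthwaite approximation the calculator uses is the following (the formula is standard but not written out elsewhere in these notes):

df = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{1}{n_1 - 1}\left(\frac{s_1^2}{n_1}\right)^2 + \frac{1}{n_2 - 1}\left(\frac{s_2^2}{n_2}\right)^2}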

Conditions for Two-Sample t-Test

Both groups:

  • Random/independent samples
  • Each approximately normal OR both n ≥ 30
  • Each n < 10% of population

Example 2: Two-Sample t-Test

Compare new vs old teaching method:

  • New: n₁ = 30, x̄₁ = 85, s₁ = 8
  • Old: n₂ = 28, x̄₂ = 80, s₂ = 10

STATE:

  • μ₁ = mean score with new method
  • μ₂ = mean score with old method
  • H₀: μ₁ = μ₂
  • Hₐ: μ₁ > μ₂
  • α = 0.05

PLAN:

  • Two-sample t-test
  • Conditions: Both n ≥ 30, random, independent ✓

DO:

t = \frac{85 - 80}{\sqrt{\frac{64}{30} + \frac{100}{28}}} = \frac{5}{\sqrt{2.13 + 3.57}} = \frac{5}{2.39} \approx 2.09

df ≈ 51.7 by Welch's approximation (the calculator gives this); conservative alternative: df = min(30 - 1, 28 - 1) = 27

P-value = P(t ≥ 2.09) ≈ 0.021

CONCLUDE: P-value = 0.021 < 0.05, reject H₀. Sufficient evidence that the new method produces higher mean scores.
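The same Welch test can be run from summary statistics in Python (a sketch assuming SciPy; equal_var=False requests the unpooled test described above):

```python
from scipy import stats

# Summary statistics: new method (group 1) vs old method (group 2)
t_stat, p_two_sided = stats.ttest_ind_from_stats(
    mean1=85, std1=8, nobs1=30,
    mean2=80, std2=10, nobs2=28,
    equal_var=False,              # Welch: do NOT pool the variances
)

# Hₐ is one-sided (μ₁ > μ₂) and x̄₁ > x̄₂, so halve the two-sided P-value
p_one_sided = p_two_sided / 2
print(t_stat, p_one_sided)        # ≈ 2.09 and ≈ 0.021
```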

t vs z

Use t-test when:

  • Population σ unknown (almost always!)
  • Using sample s

Use z-test when:

  • Population σ known (rare)
  • Proportions (different formula)

For large n: t ≈ z (distributions nearly identical)
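A small illustration of how close t is to z for large df (sketch assuming SciPy; 1.960 is the standard normal critical value):

```python
from scipy import stats

# 97.5th percentile (two-sided 5% critical value) for increasing df
for df in (5, 10, 30, 100, 1000):
    print(df, round(stats.t.ppf(0.975, df), 3))

print("z", round(stats.norm.ppf(0.975), 3))  # ≈ 1.960
```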

Checking Normality

Small samples (n < 15):

  • Data must be close to normal
  • Check with dotplot, boxplot, normal probability plot (see the sketch after this section)
  • No outliers, roughly symmetric

Medium samples (15 ≤ n < 30):

  • Can tolerate slight skew
  • No extreme outliers

Large samples (n ≥ 30):

  • CLT applies
  • Can proceed unless severe outliers/skew
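A minimal sketch of a graphical normality check in Python (matplotlib and SciPy assumed; the wait-time values below are made up purely for illustration):

```python
import matplotlib.pyplot as plt
from scipy import stats

# Hypothetical small sample of wait times (minutes)
data = [4.2, 5.1, 5.8, 6.0, 4.9, 5.5, 6.3, 5.0, 4.7, 5.9, 6.1, 5.4]

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 3))
ax1.boxplot(data, vert=False)                 # look for outliers / strong skew
ax1.set_title("Boxplot")
stats.probplot(data, dist="norm", plot=ax2)   # points near the line ⇒ roughly normal
plt.tight_layout()
plt.show()
```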

Robustness

t-procedures are fairly robust to non-normality if:

  • n reasonably large
  • No extreme outliers
  • Not severely skewed

Less robust with:

  • Very small n
  • Extreme outliers (affect x̄ and s)

One-Sided vs Two-Sided

Choose before seeing data!

Two-sided: Looking for any difference
One-sided: Specific direction predicted

One-sided has more power (for that direction) but:

  • Can't detect effect in other direction
  • Generally less conservative

Calculator Commands (TI-83/84)

One-sample: STAT → TESTS → 2:T-Test

  • μ₀, x̄, s, n, direction
  • Calculate

Two-sample: STAT → TESTS → 4:2-SampTTest

  • x̄₁, s₁, n₁, x̄₂, s₂, n₂
  • Pooled: No
  • Calculate

Relationship to CI

For a two-sided test at level α, check whether the (1 - α) confidence interval contains μ₀:

  • If yes → fail to reject
  • If no → reject

CI more informative: Shows range of plausible values
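Applied to Example 1 (using the standard critical value t* = 2.064 for df = 24, which is not listed elsewhere in these notes):

\bar{x} \pm t^* \frac{s}{\sqrt{n}} = 5.8 \pm 2.064 \times 0.3 = (5.18,\ 6.42)

The interval excludes 5, so the two-sided test at α = 0.05 rejects H₀, matching the conclusion of Example 1.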

Common Mistakes

❌ Using z when should use t
❌ Pooling variances in two-sample t-test
❌ Not checking normality with small samples
❌ Confusing one-sample with paired
❌ Using wrong df

Practical Significance

Statistical significance ≠ practical importance

Example: Large sample (n = 10,000) finds mean = 100.2 vs claimed 100

  • Might be statistically significant
  • But is 0.2 difference practically important?

Always consider:

  • Effect size (magnitude of difference)
  • Context (what matters in practice)
  • Cost/benefit

Quick Reference

One-sample: t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}, df = n - 1

Two-sample: t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}

Conditions: Random, approximately normal (or n ≥ 30), independent

Use t (not z) when σ unknown

Remember: t-tests are workhorses of statistics. Check conditions, especially normality for small samples. Use calculator for exact P-values and df!

📚 Practice Problems

No example problems available yet.