Hypothesis Testing Framework

Null and alternative hypotheses, significance level

What is Hypothesis Testing?

Hypothesis Test: Formal procedure to decide between two competing claims about a population parameter

Two hypotheses:

  • Null hypothesis (H₀): Status quo, no effect, no difference
  • Alternative hypothesis (Hₐ or H₁): What we're trying to show

Goal: Determine if data provides sufficient evidence to reject H₀ in favor of Hₐ

Setting Up Hypotheses

H₀: Always includes equality (=, ≤, ≥)

Hₐ: Can be:

  • Two-sided: μ ≠ μ₀ (different from)
  • Right-sided: μ > μ₀ (greater than)
  • Left-sided: μ < μ₀ (less than)

Examples:

Claim: Mean height > 68 inches

  • H₀: μ = 68 or μ ≤ 68
  • Hₐ: μ > 68

Claim: Proportion ≠ 0.5

  • H₀: p = 0.5
  • Hₐ: p ≠ 0.5

The Four-Step Process

Step 1: STATE

  • Parameter of interest
  • Hypotheses (H₀ and Hₐ)
  • Significance level α

Step 2: PLAN

  • Choose appropriate test
  • Check conditions

Step 3: DO

  • Calculate test statistic
  • Find P-value

Step 4: CONCLUDE

  • Compare P-value to α
  • State conclusion in context

Test Statistic

General form:

$$\text{Test statistic} = \frac{\text{statistic} - \text{parameter}}{\text{standard error}}$$

For means (t-test):

$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$

For proportions (z-test):

$$z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1-p_0)}{n}}}$$

Measures: How many standard errors the sample statistic lies from the hypothesized parameter value
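As a minimal sketch, the two test statistics can be computed with Python's standard library. The function names `t_statistic` and `z_statistic_proportion` are illustrative and do not come from any particular package; the proportion data are hypothetical.

```python
import math


def t_statistic(x_bar, mu_0, s, n):
    """One-sample t statistic: (sample mean - hypothesized mean) / (s / sqrt(n))."""
    standard_error = s / math.sqrt(n)
    return (x_bar - mu_0) / standard_error


def z_statistic_proportion(p_hat, p_0, n):
    """One-proportion z statistic; the standard error uses the hypothesized p_0."""
    standard_error = math.sqrt(p_0 * (1 - p_0) / n)
    return (p_hat - p_0) / standard_error


# Sample with n = 30, mean 78, s = 10, tested against mu_0 = 75.
t = t_statistic(78, 75, 10, 30)
print(round(t, 2))

# Hypothetical proportion data: 60 successes in 100 trials, tested against p_0 = 0.5.
z = z_statistic_proportion(0.60, 0.5, 100)
print(round(z, 2))
```

Both functions return a signed value: the sign shows whether the statistic fell above or below the hypothesized value, and the magnitude is the distance in standard errors.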

P-Value

P-value: Probability of getting results as extreme or more extreme than observed, assuming H₀ is true

Interpretation:

  • Small P-value → data inconsistent with H₀ → evidence against H₀
  • Large P-value → data consistent with H₀ → insufficient evidence against H₀

Finding P-value:

  • Two-sided: P(|test statistic| ≥ |observed|)
  • Right-sided: P(test statistic ≥ observed)
  • Left-sided: P(test statistic ≤ observed)
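The three tail rules can be sketched for a z statistic using `statistics.NormalDist` from the standard library. The function name `p_value_from_z`, the labels for `alternative`, and the observed value 2.0 are assumptions made for illustration.

```python
from statistics import NormalDist


def p_value_from_z(z, alternative):
    """P-value for a z statistic under the standard normal null distribution.

    alternative: "two-sided", "greater" (right-sided) or "less" (left-sided).
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if alternative == "greater":
        return 1 - std_normal.cdf(z)
    if alternative == "less":
        return std_normal.cdf(z)
    if alternative == "two-sided":
        # Both tails: values at least as far from 0 as |z|.
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError(f"unknown alternative: {alternative!r}")


z_observed = 2.0  # hypothetical observed statistic
for alt in ("two-sided", "greater", "less"):
    print(alt, round(p_value_from_z(z_observed, alt), 4))
```

For a t statistic the same logic applies, with the t distribution (df = n − 1) in place of the standard normal.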

Significance Level (α)

α: Threshold for rejecting H₀

Common values: 0.05, 0.01, 0.10

Decision rule:

  • If P-value ≤ α → Reject H₀
  • If P-value > α → Fail to reject H₀

Note: "Fail to reject" ≠ "accept" H₀ (lack of evidence against ≠ evidence for)
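The decision rule is simple enough to write as a small function. This is only a sketch; `decide` is a hypothetical helper name, and the wording of the returned strings deliberately avoids "accept H0".

```python
def decide(p_value, alpha=0.05):
    """Apply the decision rule: reject H0 when the P-value is at most alpha."""
    if not 0 <= p_value <= 1:
        raise ValueError("p_value must be between 0 and 1")
    return "reject H0" if p_value <= alpha else "fail to reject H0"


print(decide(0.03))         # P-value below alpha = 0.05
print(decide(0.056))        # P-value above alpha = 0.05
print(decide(0.056, 0.10))  # same P-value, looser threshold
```

The last two calls show why α must be fixed before looking at the data: the same P-value leads to different decisions under different thresholds.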

Example: Complete Test

Claim: Mean score exceeds 75. Sample: n = 30, $\bar{x}$ = 78, s = 10

STATE:

  • Parameter: μ = true mean score
  • H₀: μ = 75
  • Hₐ: μ > 75
  • α = 0.05

PLAN:

  • One-sample t-test
  • Conditions: Random ✓, n = 30 ≥ 30 ✓, n < 10% of population size N ✓

DO:

$$t = \frac{78 - 75}{10/\sqrt{30}} = \frac{3}{1.826} \approx 1.64$$

df = 29, P-value = P(t ≥ 1.64) ≈ 0.056 (e.g., from a calculator's tcdf function)

CONCLUDE: P-value = 0.056 > 0.05, fail to reject H₀. Insufficient evidence that mean exceeds 75.
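The worked example can be reproduced with a short script. Python's standard library has no t distribution, so this sketch integrates the t density numerically with Simpson's rule; `t_pdf` and `t_upper_tail` are hypothetical helpers written for this purpose. In practice a statistics library would normally be used instead (SciPy's `scipy.stats.t.sf`, for example).

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_coef = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_coef - (df + 1) / 2 * math.log1p(x * x / df))


def t_upper_tail(t, df, steps=2000):
    """P(T >= t) via Simpson's rule on [0, |t|]; steps must be even."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_0_to_a = total * h / 3
    upper = 0.5 - area_0_to_a  # tail beyond |t|, by symmetry about 0
    return upper if t >= 0 else 1 - upper


# STATE / DO / CONCLUDE for the example: H0: mu = 75, Ha: mu > 75, alpha = 0.05.
n, x_bar, s, mu_0, alpha = 30, 78, 10, 75, 0.05
t_stat = (x_bar - mu_0) / (s / math.sqrt(n))
p_value = t_upper_tail(t_stat, df=n - 1)  # right-sided test
decision = "reject H0" if p_value <= alpha else "fail to reject H0"
print(f"t = {t_stat:.2f}, P-value = {p_value:.3f}, {decision}")
```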

One-Sided vs Two-Sided Tests

Two-sided: Looking for any difference

  • Hₐ: μ ≠ μ₀
  • P-value = 2 × P(t ≥ |observed|)

One-sided: Looking for specific direction

  • Hₐ: μ > μ₀ or μ < μ₀
  • P-value = P(t ≥ observed) or P(t ≤ observed)

Choose before seeing data! One-sided only if direction specified in advance

Statistical Significance

Statistically significant: P-value ≤ α

Interpretation: Result unlikely to occur by chance alone if H₀ true

NOT the same as practically significant!

  • Can have statistically significant but tiny effect
  • Large sample can detect trivial differences
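A quick numerical sketch shows how sample size alone can turn a trivial difference into a statistically significant one. The numbers are hypothetical, and the population standard deviation is assumed known so that a z-test applies.

```python
import math
from statistics import NormalDist


def two_sided_p(x_bar, mu_0, sigma, n):
    """Two-sided z-test P-value, assuming the population sd sigma is known."""
    z = (x_bar - mu_0) / (sigma / math.sqrt(n))
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical numbers: the sample mean is only 0.1 above mu_0 = 75 (sd = 10).
p_small_sample = two_sided_p(75.1, 75, 10, n=100)
p_huge_sample = two_sided_p(75.1, 75, 10, n=100_000)
print(f"n = 100:     P-value = {p_small_sample:.3f}")
print(f"n = 100,000: P-value = {p_huge_sample:.4f}")
```

The observed difference of 0.1 is identical in both cases; only the standard error shrinks as n grows, so the P-value says nothing about whether a 0.1-point difference matters in practice.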

Relationship to Confidence Intervals

For two-sided test at α = 0.05:

Equivalent to checking whether the (1 − α) = 95% CI contains the hypothesized value μ₀

  • If μ₀ in 95% CI → P-value > 0.05
  • If μ₀ not in 95% CI → P-value ≤ 0.05

CI gives more information: Range of plausible values, not just yes/no
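The correspondence can be checked numerically. The sketch below treats σ = 10 as known so that normal critical values from `statistics.NormalDist` apply; this is a simplifying assumption, and a t interval for the same data would be slightly wider.

```python
import math
from statistics import NormalDist

# Hypothetical setup: sample mean 78 from n = 30, sigma = 10 treated as known.
x_bar, sigma, n, mu_0, alpha = 78, 10, 30, 75, 0.05
se = sigma / math.sqrt(n)

# 95% confidence interval: x_bar +/- z* x SE, with z* about 1.96.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
ci_low, ci_high = x_bar - z_crit * se, x_bar + z_crit * se

# Two-sided test of H0: mu = mu_0 at the same alpha.
z = (x_bar - mu_0) / se
p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))

mu_0_in_ci = ci_low <= mu_0 <= ci_high
print(f"95% CI: ({ci_low:.2f}, {ci_high:.2f}); contains mu_0: {mu_0_in_ci}")
print(f"two-sided P-value: {p_two_sided:.3f}; P > alpha: {p_two_sided > alpha}")
```

The two printed booleans always agree for this kind of test, because both compare |x̄ − μ₀| with the same margin z* × SE.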

Common Misconceptions

❌ "P-value is probability H₀ is true"

  • No! It's P(data at least this extreme | H₀), not P(H₀ | data)

❌ "Fail to reject H₀ means H₀ is true"

  • No! Just insufficient evidence against it

❌ "Significant means important"

  • No! Statistically significant ≠ practically important

❌ "P-value is probability of error"

  • No! The error rate is set by α: the probability of rejecting H₀ when H₀ is actually true
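A small simulation illustrates the role of α as an error rate: when H₀ is true, a test at level α rejects in roughly a fraction α of repeated samples. The population (normal, mean 75, sd 10), the use of a z-test with known sd, and the function name `rejection_rate_under_h0` are all assumptions made for this sketch.

```python
import math
import random
from statistics import NormalDist, fmean


def rejection_rate_under_h0(alpha=0.05, n=30, trials=4000, seed=0):
    """Fraction of simulated two-sided z-tests that reject a true H0."""
    rng = random.Random(seed)
    mu_0, sigma = 75, 10  # hypothetical population; H0: mu = 75 is true
    se = sigma / math.sqrt(n)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(mu_0, sigma) for _ in range(n)]
        z = (fmean(sample) - mu_0) / se
        p_value = 2 * (1 - NormalDist().cdf(abs(z)))
        if p_value <= alpha:
            rejections += 1
    return rejections / trials


rate = rejection_rate_under_h0()
print(f"rejected a true H0 in {rate:.1%} of simulated samples")
```

The printed rate should land near 5%, varying a little with the seed and the number of trials. Each individual P-value, by contrast, describes one sample's data and is not an error probability.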

Writing Conclusions

✓ Good: "We have sufficient evidence to conclude the mean exceeds 75."

✓ Good: "There is insufficient evidence that the proportion differs from 0.5."

✗ Bad: "We prove the mean is 75."

✗ Bad: "We accept H₀."

✗ Bad: "The probability H₀ is true is 0.056."

Quick Reference

Hypotheses:

  • H₀: includes =
  • Hₐ: what we're testing for

Test statistic: (statistic - parameter) / SE

P-value: P(as extreme | H₀ true)

Decision:

  • P ≤ α: Reject H₀
  • P > α: Fail to reject H₀

Remember: Hypothesis testing is about evidence, not proof. Small P-value = strong evidence against H₀, but never proves Hₐ!
