Type I and Type II Errors

Understanding testing errors and power

The Four Possible Outcomes

| Decision \ Reality | H₀ True | H₀ False |
|--------------------|---------|----------|
| Fail to reject H₀ | ✓ Correct | Type II Error |
| Reject H₀ | Type I Error | ✓ Correct |

Type I Error: Reject H₀ when it's actually true (false positive)

Type II Error: Fail to reject H₀ when it's actually false (false negative)

Type I Error (α)

Definition: Rejecting true null hypothesis

Probability: α (significance level)

Example: Medical test

  • H₀: Patient healthy
  • Type I: Diagnose disease when patient is healthy

Consequences: False alarm, unnecessary treatment, wasted resources

Control: Set α before testing (0.05, 0.01, etc.)

Type II Error (β)

Definition: Failing to reject false null hypothesis

Probability: β (depends on true parameter value, sample size, α)

Example: Medical test

  • H₀: Patient healthy
  • Type II: Miss disease in sick patient

Consequences: Miss real effect, fail to treat, potential harm

Control: Increase sample size, increase α (trade-off!)

Power

Power: Probability of correctly rejecting false H₀

Power = 1 − β

Higher power = better test (more likely to detect real effect)

Factors increasing power:

  1. Larger sample size (n)
  2. Larger effect size (further from H₀)
  3. Less variability (smaller σ)
  4. Higher α (but increases Type I risk)

Example: Coin Testing

Test if coin is fair:

  • H₀: p = 0.5 (fair)
  • Hₐ: p ≠ 0.5 (biased)
  • Flip 20 times, α = 0.05

Type I Error:

  • Coin actually fair (p = 0.5)
  • Get unusual result (like 15 heads)
  • Reject H₀ (conclude biased)
  • Error: Called fair coin biased

Type II Error:

  • Coin actually biased (say p = 0.7)
  • Get result that looks reasonable for fair coin (like 11 heads)
  • Fail to reject H₀
  • Error: Failed to detect biased coin
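Both error rates for this coin test can be estimated by simulation. The sketch below (plain Python; the helper names are illustrative) uses the rejection region "heads ≤ 5 or heads ≥ 15", which is the closest achievable two-sided region to α = 0.05 for n = 20 (its exact Type I rate is about 0.041):

```python
import random

def reject(heads):
    # Two-sided rejection region for n = 20, alpha = 0.05:
    # reject H0 (p = 0.5) if heads <= 5 or heads >= 15
    return heads <= 5 or heads >= 15

def rejection_rate(p, n_flips=20, trials=100_000, seed=1):
    # Estimate P(reject H0) when the coin's true heads probability is p
    rng = random.Random(seed)
    rejections = sum(
        reject(sum(rng.random() < p for _ in range(n_flips)))
        for _ in range(trials)
    )
    return rejections / trials

type1_rate = rejection_rate(p=0.5)       # near 0.041: fair coin called biased
type2_rate = 1 - rejection_rate(p=0.7)   # near 0.58: biased coin missed
```

Note how large the Type II rate is here: with only 20 flips, a coin with p = 0.7 escapes detection more than half the time.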

Calculating Type I Error Probability

Type I Error probability = α (by design)

Example: If α = 0.05, P(Type I Error) = 0.05

Interpretation: If H₀ is true, we will (wrongly) reject it 5% of the time

Calculating Power (Advanced)

Requires:

  • Specific alternative value
  • Sample size
  • Variability
  • α

Example: Test H₀: μ = 100 vs Hₐ: μ > 100

  • α = 0.05, n = 25, σ = 15
  • True μ = 106

Power calculation:

  1. Find critical value for rejection
  2. Find probability of exceeding it when μ = 106
  3. This is the power

Typically use software for exact power calculations
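The three steps above can be carried out directly for this one-sided z-test using the standard library's `statistics.NormalDist`; the function name is illustrative:

```python
from math import sqrt
from statistics import NormalDist

def power_one_sided_z(mu0, mu_true, sigma, n, alpha=0.05):
    """Power of a one-sided z-test of H0: mu = mu0 vs Ha: mu > mu0."""
    se = sigma / sqrt(n)
    # Step 1: critical value for the sample mean under H0
    crit = mu0 + NormalDist().inv_cdf(1 - alpha) * se
    # Step 2: probability of exceeding it when the true mean is mu_true
    return 1 - NormalDist(mu_true, se).cdf(crit)

power = power_one_sided_z(mu0=100, mu_true=106, sigma=15, n=25)
# power is about 0.64 for this example
```

So even with a true mean 6 points above H₀, this test misses the effect about 36% of the time (β ≈ 0.36).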

Trade-offs

Decreasing α (stricter):

  • ↓ Type I Error risk
  • ↑ Type II Error risk
  • ↓ Power

Increasing α:

  • ↑ Type I Error risk
  • ↓ Type II Error risk
  • ↑ Power

Can't minimize both simultaneously with fixed n!

Solution: Increase n (decreases both error types)
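One way to see this with the coin example: an exact binomial calculation (sketched below; helper names are illustrative) shows β shrinking sharply as n grows, while the Type I rate stays at or below α = 0.05 throughout:

```python
from math import comb

def binom_cdf(k, n, p):
    # P(X <= k) for X ~ Binomial(n, p)
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def two_sided_beta(n, p_true, alpha=0.05):
    # Largest cutoff c so that rejecting when X <= c or X >= n - c
    # keeps the Type I rate at or below alpha (symmetric under p = 0.5)
    c = 0
    while 2 * binom_cdf(c + 1, n, 0.5) <= alpha:
        c += 1
    # beta = P(c < X < n - c) when the true heads probability is p_true
    return binom_cdf(n - c - 1, n, p_true) - binom_cdf(c, n, p_true)

beta_20 = two_sided_beta(20, 0.7)    # about 0.58 with 20 flips
beta_100 = two_sided_beta(100, 0.7)  # under 0.05 with 100 flips
```

Going from 20 to 100 flips drops β from roughly 0.58 to under 0.05 without loosening α.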

Choosing α

Common practice: α = 0.05

More conservative (α = 0.01): When Type I Error very costly

  • Example: Approving new drug (don't want false positive)

Less conservative (α = 0.10): When Type II Error very costly

  • Example: Screening test (don't want to miss cases)

Balance: Consider consequences of each error type

Real-World Examples

Criminal Trial:

  • H₀: Defendant innocent
  • Type I: Convict innocent person (false conviction)
  • Type II: Acquit guilty person (false acquittal)
  • System prioritizes avoiding Type I (innocent until proven guilty)

Medical Screening:

  • H₀: Patient disease-free
  • Type I: False positive (unnecessary worry, follow-up tests)
  • Type II: False negative (miss disease, delayed treatment)
  • Balance depends on disease severity

Quality Control:

  • H₀: Process working properly
  • Type I: Stop working process (wasted time, money)
  • Type II: Miss defective process (bad products shipped)

Relationship Between Errors

For fixed n:

  • Lowering α → higher β (inverse relationship)
  • Can't have both low α and low β

Increasing n:

  • Can lower both α and β
  • Only way to improve both

Increasing effect size:

  • β decreases (easier to detect large effects)
  • α unchanged (still set by us)

Power Analysis for Sample Size

Before study: Determine n needed for desired power

Typical goal: Power = 0.80 (80% chance of detecting effect)

Requires specifying:

  • Minimum important effect size
  • Desired α
  • Estimated variability
  • Desired power

Software: G*Power, R, online calculators
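For a one-sided z-test, the required n has a standard closed form, n = ((z_α + z_β) · σ / δ)², where δ is the minimum important effect size. A minimal sketch (function name illustrative):

```python
from math import ceil
from statistics import NormalDist

def required_n(effect, sigma, alpha=0.05, power=0.80):
    """Sample size for a one-sided z-test to detect a mean shift of
    `effect` with the desired power (closed-form approximation)."""
    z_a = NormalDist().inv_cdf(1 - alpha)  # critical z for alpha
    z_b = NormalDist().inv_cdf(power)      # z for the target power
    return ceil(((z_a + z_b) * sigma / effect) ** 2)

n = required_n(effect=6, sigma=15, alpha=0.05, power=0.80)
# n = 39 for the mu = 100 vs mu = 106 example above
```

Compare with the earlier example: n = 25 gave power of only about 0.64; reaching 0.80 requires 39 observations.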

Common Misconceptions

❌ "P-value is probability of Type I Error"

  • No! α is P(Type I Error), fixed before the test
  • The P-value is P(data at least this extreme | H₀ true), computed from the data

❌ "Can eliminate both error types"

  • No! Trade-off exists (for fixed n)

❌ "Type II Error is 1 - α"

  • No! β depends on the true parameter value, n, and σ; it is not determined by α alone

❌ "High power means H₀ is false"

  • No! Power is property of test, not evidence about H₀

Practical Advice

Before study:

  1. Consider consequences of each error type
  2. Choose α appropriately
  3. Do power analysis to determine n

After study:

  1. Report P-value (not just "significant" or "not")
  2. Consider practical significance, not just statistical
  3. Recognize limitations (Type II error possible if fail to reject)

Quick Reference

Type I Error (α):

  • Reject true H₀
  • P(Type I) = α
  • False positive

Type II Error (β):

  • Fail to reject false H₀
  • P(Type II) = β
  • False negative

Power = 1 - β:

  • Probability of detecting real effect
  • Increase with: larger n, larger effect, smaller σ, larger α

Trade-off:

  • Can't minimize both errors with fixed n
  • Increase n to reduce both

Remember: All hypothesis tests risk errors. Understanding and balancing these risks is key to good statistical practice!
