Type I and Type II Errors
Understanding testing errors and power
The Four Possible Outcomes
| Decision \ Reality | H₀ True | H₀ False |
|--------------------|---------|----------|
| Fail to reject H₀ | ✓ Correct | Type II Error |
| Reject H₀ | Type I Error | ✓ Correct |
Type I Error: Reject H₀ when it's actually true (false positive)
Type II Error: Fail to reject H₀ when it's actually false (false negative)
Type I Error (α)
Definition: Rejecting true null hypothesis
Probability: α (significance level)
Example: Medical test
- H₀: Patient healthy
- Type I: Diagnose disease when patient is healthy
Consequences: False alarm, unnecessary treatment, wasted resources
Control: Set α before testing (0.05, 0.01, etc.)
Type II Error (β)
Definition: Failing to reject false null hypothesis
Probability: β (depends on true parameter value, sample size, α)
Example: Medical test
- H₀: Patient healthy
- Type II: Miss disease in sick patient
Consequences: Miss real effect, fail to treat, potential harm
Control: Increase sample size, or increase α (trade-off: more Type I risk!)
Power
Power: Probability of correctly rejecting false H₀
Higher power = better test (more likely to detect real effect)
Factors increasing power:
- Larger sample size (n)
- Larger effect size (further from H₀)
- Less variability (smaller σ)
- Higher α (but increases Type I risk)
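The effect of sample size can be checked directly by simulation. The sketch below (assuming numpy and scipy are installed; the scenario H₀: μ = 100, σ = 15, true μ = 106 is illustrative) estimates power as the fraction of simulated samples that lead to rejection:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

def simulated_power(n, mu_true=106, mu0=100, sigma=15, alpha=0.05, trials=20_000):
    """Estimate power of the one-sided z-test H0: mu = mu0 vs Ha: mu > mu0."""
    z_crit = norm.ppf(1 - alpha)                        # rejection cutoff on z scale
    samples = rng.normal(mu_true, sigma, size=(trials, n))
    z = (samples.mean(axis=1) - mu0) / (sigma / np.sqrt(n))
    return float(np.mean(z > z_crit))                   # fraction of trials rejecting H0

powers = {n: simulated_power(n) for n in (10, 25, 100)}
print(powers)
```

Power climbs steadily as n grows, exactly as the list above predicts.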
Example: Coin Testing
Test if coin is fair:
- H₀: p = 0.5 (fair)
- Hₐ: p ≠ 0.5 (biased)
- Flip 20 times, α = 0.05
Type I Error:
- Coin actually fair (p = 0.5)
- Get unusual result (like 15 heads)
- Reject H₀ (conclude biased)
- Error: Called fair coin biased
Type II Error:
- Coin actually biased (say p = 0.7)
- Get result that looks reasonable for fair coin (like 11 heads)
- Fail to reject H₀
- Error: Failed to detect biased coin
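For this coin example both error rates can be computed exactly from the binomial distribution. A sketch (assuming scipy is available, and assuming the symmetric rejection region "heads ≤ 5 or ≥ 15", a common choice that keeps α below 0.05 for n = 20):

```python
from scipy.stats import binom

n, p0, p_true = 20, 0.5, 0.7
lo, hi = 5, 15   # reject H0 if heads <= 5 or heads >= 15

# P(reject | coin fair) = Type I error rate
alpha = binom.cdf(lo, n, p0) + binom.sf(hi - 1, n, p0)
# P(fail to reject | p = 0.7) = P(6 <= heads <= 14 | p = 0.7) = Type II error rate
beta = binom.cdf(hi - 1, n, p_true) - binom.cdf(lo, n, p_true)
print(f"alpha ≈ {alpha:.3f}, beta ≈ {beta:.3f}, power ≈ {1 - beta:.3f}")
```

With only 20 flips, β is large: a coin with p = 0.7 slips past this test more than half the time.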
Calculating Type I Error Probability
Type I Error probability = α (by design)
Example: If α = 0.05, P(Type I Error) = 0.05
Interpretation: If H₀ is true, we will (wrongly) reject it 5% of the time
Calculating Power (Advanced)
Requires:
- Specific alternative value
- Sample size
- Variability
- α
Example: Test H₀: μ = 100 vs Hₐ: μ > 100
- α = 0.05, n = 25, σ = 15
- True μ = 106
Power calculation:
- Find critical value for rejection
- Find probability of exceeding it when μ = 106
- This is the power
Typically use software for exact power calculations
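The three steps above can be carried out in a few lines for this z-test example (a sketch assuming scipy; σ is treated as known):

```python
import math
from scipy.stats import norm

mu0, mu_true, sigma, n, alpha = 100, 106, 15, 25, 0.05
se = sigma / math.sqrt(n)                      # standard error = 3

# Step 1: critical value — reject H0 if the sample mean exceeds this
x_crit = mu0 + norm.ppf(1 - alpha) * se
# Steps 2-3: probability of exceeding it when mu = 106 — this is the power
power = norm.sf((x_crit - mu_true) / se)
print(f"critical mean ≈ {x_crit:.2f}, power ≈ {power:.3f}")
```

So with n = 25 this test detects a true mean of 106 only about 64% of the time.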
Trade-offs
Decreasing α (stricter):
- ↓ Type I Error risk
- ↑ Type II Error risk
- ↓ Power
Increasing α:
- ↑ Type I Error risk
- ↓ Type II Error risk
- ↑ Power
Can't minimize both simultaneously with fixed n!
Solution: Increase n (decreases both error types)
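Both halves of the trade-off can be made concrete with the same z-test setup as above (a sketch assuming scipy; H₀: μ = 100, true μ = 106, σ = 15 are illustrative):

```python
import math
from scipy.stats import norm

def beta(alpha, n, mu0=100, mu_true=106, sigma=15):
    """Type II error rate of the one-sided z-test H0: mu = mu0 vs Ha: mu > mu0."""
    se = sigma / math.sqrt(n)
    x_crit = mu0 + norm.ppf(1 - alpha) * se    # rejection cutoff for the sample mean
    return norm.cdf((x_crit - mu_true) / se)   # P(fail to reject | mu = mu_true)

# Stricter alpha raises beta at fixed n...
print(beta(0.05, 25), beta(0.01, 25))
# ...but a larger n lowers beta at any alpha
print(beta(0.01, 25), beta(0.01, 100))
```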
Choosing α
Common practice: α = 0.05
More conservative (α = 0.01): When Type I Error very costly
- Example: Approving new drug (don't want false positive)
Less conservative (α = 0.10): When Type II Error very costly
- Example: Screening test (don't want to miss cases)
Balance: Consider consequences of each error type
Real-World Examples
Criminal Trial:
- H₀: Defendant innocent
- Type I: Convict innocent person (false conviction)
- Type II: Acquit guilty person (false acquittal)
- System prioritizes avoiding Type I (innocent until proven guilty)
Medical Screening:
- H₀: Patient disease-free
- Type I: False positive (unnecessary worry, follow-up tests)
- Type II: False negative (miss disease, delayed treatment)
- Balance depends on disease severity
Quality Control:
- H₀: Process working properly
- Type I: Stop working process (wasted time, money)
- Type II: Miss defective process (bad products shipped)
Relationship Between Errors
For fixed n:
- Lowering α → higher β (inverse relationship)
- Can't have both low α and low β
Increasing n:
- Can lower both α and β
- Only way to improve both
Increasing effect size:
- β decreases (easier to detect large effects)
- α unchanged (still set by us)
Power Analysis for Sample Size
Before study: Determine n needed for desired power
Typical goal: Power = 0.80 (80% chance of detecting effect)
Requires specifying:
- Minimum important effect size
- Desired α
- Estimated variability
- Desired power
Software: G*Power, R, online calculators
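For a one-sided z-test the required n has a closed form, n = ((z₁₋α + z_power)·σ / effect)², which is what such software computes in simple cases. A minimal sketch (assuming scipy; the effect size 6 and σ = 15 are illustrative):

```python
import math
from scipy.stats import norm

def sample_size(effect, sigma, alpha=0.05, power=0.80):
    """Smallest n giving a one-sided z-test the desired power."""
    z_a = norm.ppf(1 - alpha)   # cutoff for the significance level
    z_b = norm.ppf(power)       # quantile corresponding to the target power
    return math.ceil(((z_a + z_b) * sigma / effect) ** 2)

# n needed to detect a 6-point shift with sigma = 15 at 80% power
print(sample_size(effect=6, sigma=15))
```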
Common Misconceptions
❌ "P-value is probability of Type I Error"
- No! α is P(Type I Error)
- P-value is P(data at least this extreme | H₀ true)
❌ "Can eliminate both error types"
- No! Trade-off exists (for fixed n)
❌ "Type II Error is 1 - α"
- No! β depends on the true parameter value, n, and α — it is not determined by α alone
❌ "High power means H₀ is false"
- No! Power is property of test, not evidence about H₀
Practical Advice
Before study:
- Consider consequences of each error type
- Choose α appropriately
- Do power analysis to determine n
After study:
- Report P-value (not just "significant" or "not")
- Consider practical significance, not just statistical
- Recognize limitations (Type II error possible if fail to reject)
Quick Reference
Type I Error (α):
- Reject true H₀
- P(Type I) = α
- False positive
Type II Error (β):
- Fail to reject false H₀
- P(Type II) = β
- False negative
Power = 1 - β:
- Probability of detecting real effect
- Increase with: larger n, larger effect, smaller σ, larger α
Trade-off:
- Can't minimize both errors with fixed n
- Increase n to reduce both
Remember: All hypothesis tests risk errors. Understanding and balancing these risks is key to good statistical practice!