Apply the Central Limit Theorem to approximate sampling distributions as Normal.
The Central Limit Theorem (CLT) is one of the most powerful theorems in statistics:
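If samples of size n are drawn from ANY population distribution with finite mean μ and standard deviation σ, then for large n the sampling distribution of the sample mean x̄ is approximately Normal with mean μ and standard deviation σ/√n.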
State the Central Limit Theorem and explain what conditions must be met for it to apply.
Step 1: State the Central Limit Theorem
For a random sample of size n from a population with mean μ and standard deviation σ:
As n increases, the sampling distribution of x̄ (sample mean) approaches a normal distribution with mean μₓ̄ = μ and standard deviation σₓ̄ = σ/√n.
This happens REGARDLESS of the shape of the population distribution!
Step 2: Conditions that must be met
RANDOMNESS:
In words:
As a practical guideline:
For large samples:
Visual Example: Even if the population is heavily skewed or multimodal, sample means concentrate near μ and form a symmetric, bell-shaped curve.
Scenario: Population has exponential distribution (right-skewed), mean = 5, SD = 5
n = 5 sample means:
n = 30 sample means:
n = 100 sample means:
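The scenario above can be explored with a small simulation. This is a minimal sketch, not part of the original lesson: the helper names `simulate_sample_means` and `skewness`, the repetition count, and the seed are all illustrative assumptions. It draws repeated samples from an exponential population with mean 5 and summarizes the sample means for n = 5, 30, and 100.

```python
import random
import statistics

def simulate_sample_means(n, reps=5000, mean=5.0, seed=0):
    """Draw `reps` samples of size n from an exponential population; return their means."""
    rng = random.Random(seed)
    rate = 1.0 / mean  # expovariate takes the rate, which is 1 / mean
    return [statistics.fmean(rng.expovariate(rate) for _ in range(n))
            for _ in range(reps)]

def skewness(values):
    """Average cubed z-score; close to 0 for a symmetric distribution."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean(((v - m) / s) ** 3 for v in values)

# Hypothetical container for the simulated sample means, keyed by sample size
results = {n: simulate_sample_means(n) for n in (5, 30, 100)}

for n, means in results.items():
    print(f"n = {n:>3}: mean of x-bar = {statistics.fmean(means):.3f}, "
          f"SD of x-bar = {statistics.pstdev(means):.3f} "
          f"(theory: {5 / n ** 0.5:.3f}), "
          f"skewness = {skewness(means):.3f}")
```

If the CLT is doing its job, the mean of the sample means should stay near 5, their SD should track 5/√n, and the skewness should shrink toward 0 as n grows.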
Example: IQ scores have mean 100 and SD 15. You test 36 students. What's the probability their mean IQ exceeds 105?
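One way to work this example numerically is sketched below. It assumes the 36 students are a random sample, so that x̄ is approximately Normal with standard error σ/√n; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 100, 15, 36

se = sigma / sqrt(n)           # standard error of the mean: 15 / 6 = 2.5
z = (105 - mu) / se            # z-score of a sample mean of 105
p_mean_exceeds = 1 - NormalDist(mu, se).cdf(105)   # upper-tail probability

print(f"SE = {se}, z = {z}, P(x-bar > 105) = {p_mean_exceeds:.4f}")
```

The sample mean of 105 is 2 standard errors above μ, so the probability is about 0.023, far smaller than the chance that a single student scores above 105.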
CLT questions often ask you to verify conditions (n ≥ 30 or population approximately normal) and then make probability calculations. Always identify what you're finding: P(individual value), P(sample mean), or P(sample proportion). Each requires a different approach.
INDEPENDENCE:
SAMPLE SIZE:
Step 3: Why randomness matters
Non-random samples:
Random selection ensures:
Step 4: Why independence matters
Independence violated when:
Effects of dependence:
Step 5: Why sample size matters
Small n:
Large n:
Step 6: Examples of checking CLT conditions
VALID: n = 50 from a random digit table
✓ Random
✓ Independent (infinite population)
✓ n = 50 ≥ 30
INVALID: n = 100 from a class of 200 students without replacement
✓ Could be random
✗ Not independent (100 > 0.10 × 200 = 20)
✓ n = 100 ≥ 30
Conclusion: 10% condition violated
VALID: n = 20 from a normal population
✓ Random (assumed)
✓ Independent (assumed)
✓ Population is normal (works for any n)
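The three checks in these examples can be written as a small helper. This is only a sketch: `check_clt_conditions` and its parameters are hypothetical names introduced here, randomness has to be supplied by the user because it cannot be verified from numbers alone, and n ≥ 30 is the usual rule of thumb rather than a hard boundary.

```python
def check_clt_conditions(n, population_size=None, population_normal=False,
                         random_sample=True):
    """Return pass/fail flags for randomness, independence, and sample size.

    population_size=None stands for an effectively infinite population
    (or sampling with replacement), where the 10% condition is not needed.
    """
    # 10% condition: sample is at most 10% of a finite population
    independent = population_size is None or n <= 0.10 * population_size
    # Sample size: n >= 30 rule of thumb, or any n if the population is normal
    large_enough = population_normal or n >= 30
    return {
        "random": random_sample,
        "independent": independent,
        "sample_size": large_enough,
        "all_met": random_sample and independent and large_enough,
    }

print(check_clt_conditions(50))                        # random digit table
print(check_clt_conditions(100, population_size=200))  # class of 200 students
print(check_clt_conditions(20, population_normal=True))
```

Only the second call fails, and it fails on independence alone, matching the 10% condition violation noted above.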
Answer: CENTRAL LIMIT THEOREM: The sampling distribution of x̄ approaches Normal(μ, σ/√n) as n increases, regardless of population shape.
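CONDITIONS: randomness (a random sample), independence (sampling with replacement, or n at most 10% of the population), and sample size (n ≥ 30, or a normal population).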
All three conditions must be met to apply CLT and use normal probability calculations.
A population has a right-skewed distribution with μ = 25 and σ = 8. For samples of size n = 64, describe the sampling distribution of x̄ and calculate P(24 < x̄ < 26).
Step 1: Check CLT conditions
Population: Right-skewed (not normal)
Sample size: n = 64
Is n ≥ 30? Yes, 64 ≥ 30 ✓
By CLT: x̄ is approximately normally distributed
Step 2: Find parameters of sampling distribution
Mean: μₓ̄ = μ = 25
Standard deviation (standard error): σₓ̄ = σ/√n = 8/√64 = 8/8 = 1
Step 3: Describe the sampling distribution
x̄ ~ Normal(μₓ̄ = 25, σₓ̄ = 1) approximately
Key points:
Even though population is right-skewed, x̄ is approximately normal!
Step 4: Calculate P(24 < x̄ < 26)
Standardize to z-scores:
z₁ = (24 - 25)/1 = -1/1 = -1
z₂ = (26 - 25)/1 = 1/1 = 1
P(24 < x̄ < 26) = P(-1 < Z < 1)
Step 5: Use empirical rule or table
From empirical rule: About 68% of a normal distribution is within 1 SD of the mean
P(-1 < Z < 1) ≈ 0.68
More precisely from table:
P(Z < 1) = 0.8413
P(Z < -1) = 0.1587
P(-1 < Z < 1) = 0.8413 - 0.1587 = 0.6826
Step 6: Interpret the result About 68.3% of all samples of size 64 will have sample means between 24 and 26.
This range is μ ± 1σₓ̄ = 25 ± 1 Very common for x̄ to fall in this range!
Step 7: Compare to individual values For individual value X from population:
For sample mean x̄:
Step 8: Effect of sample size
If we used n = 16 instead:
σₓ̄ = 8/√16 = 2
P(24 < x̄ < 26) = P(-0.5 < Z < 0.5) ≈ 0.38
Less likely to be close to μ with a smaller sample
If we used n = 256 instead:
σₓ̄ = 8/√256 = 0.5
P(24 < x̄ < 26) = P(-2 < Z < 2) ≈ 0.95
More likely to be close to μ with a larger sample
Answer: Sampling distribution: x̄ ~ Normal(μₓ̄ = 25, σₓ̄ = 1) approximately
Despite the right-skewed population, the large sample size (n = 64) allows CLT to apply, making x̄ approximately normal.
P(24 < x̄ < 26) ≈ 0.683 or 68.3%
About 68% of samples will have means within 1 unit of the population mean.
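The calculation in this problem, along with the sample-size comparison, can be checked with a short script. It is a sketch under the same normal approximation; `prob_mean_between` is a hypothetical helper name, and exact software values differ slightly from four-digit table lookups (0.6827 rather than 0.6826).

```python
from math import sqrt
from statistics import NormalDist

def prob_mean_between(lo, hi, mu, sigma, n):
    """P(lo < x-bar < hi) when x-bar is treated as Normal(mu, sigma / sqrt(n))."""
    se = sigma / sqrt(n)               # standard error of the mean
    dist = NormalDist(mu, se)
    return dist.cdf(hi) - dist.cdf(lo)

for n in (16, 64, 256):
    p = prob_mean_between(24, 26, mu=25, sigma=8, n=n)
    print(f"n = {n:>3}: SE = {8 / sqrt(n):.2f}, P(24 < x-bar < 26) = {p:.4f}")
```

Quadrupling the sample size halves the standard error, which is why the same interval of 24 to 26 spans ±0.5, ±1, and ±2 standard errors for the three sample sizes.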
The weights of carry-on luggage at an airport are heavily right-skewed with μ = 18 lbs and σ = 6 lbs. A flight has 100 passengers. What is the probability that the average luggage weight for these 100 passengers exceeds 19 lbs?
Step 1: Set up the problem Population (luggage weights):
Sample:
Step 2: Check CLT conditions
Random: Assume the passengers are a representative sample ✓
Independent: 100 passengers << all passengers (10% rule) ✓
Sample size: n = 100 ≥ 30, even with heavy skew ✓
CLT applies!
Step 3: Find sampling distribution parameters μₓ̄ = μ = 18 lbs
σₓ̄ = σ/√n = 6/√100 = 6/10 = 0.6 lbs
x̄ ~ Normal(18, 0.6) approximately
Step 4: Calculate P(x̄ > 19) Standardize: z = (19 - 18)/0.6 = 1/0.6 = 5/3 ≈ 1.67
P(x̄ > 19) = P(Z > 1.67)
Step 5: Look up probability From standard normal table: P(Z < 1.67) ≈ 0.9525
Therefore: P(Z > 1.67) = 1 - 0.9525 = 0.0475
Step 6: Interpret Only about 4.75% chance (less than 5%) that average luggage weight exceeds 19 lbs.
Even though population is heavily skewed:
Step 7: Why this matters for airlines Airline might set weight limit based on average:
Individual approach would be harder:
Step 8: Compare to individual luggage For one random bag: P(X > 19) = ?
Can't easily calculate: the population is skewed, not normal. But 19 lbs is only 1/6 of a population SD above the mean, so the probability is likely much higher than 4.75%, perhaps 30-40% of bags.
But average of 100 bags rarely exceeds 19 lbs.
Step 9: Check reasonableness
19 lbs is 1 lb above the mean
In terms of SE: 19 = 18 + 1.67(0.6) = 18 + 1.67σₓ̄
About 1.67 SE above the mean
Should be fairly unlikely ✓
Answer: P(x̄ > 19) ≈ 0.048 or 4.8%
There's only about a 4.8% chance that the average luggage weight for 100 passengers exceeds 19 lbs. The Central Limit Theorem allows us to treat the sample mean as approximately normal despite the heavily skewed population, and the large sample size (n = 100) makes the sample mean much less variable than individual values.
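As a numerical check of this result, the sketch below evaluates the upper-tail probability directly, assuming the normal approximation for x̄ holds. The variable names are illustrative. Because it uses the unrounded z = 5/3 rather than the table value z = 1.67, it gives about 0.0478 instead of 0.0475.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 18, 6, 100

se = sigma / sqrt(n)                        # 6 / 10 = 0.6 lbs
z = (19 - mu) / se                          # unrounded z-score, 5/3
p_exceeds = 1 - NormalDist(mu, se).cdf(19)  # P(x-bar > 19)

print(f"SE = {se:.2f} lbs, z = {z:.3f}, P(x-bar > 19) = {p_exceeds:.4f}")
```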
A factory produces batteries with lifetimes that have μ = 500 hours and σ = 100 hours. Quality control tests samples of 50 batteries. What is the probability that a sample mean is more than 25 hours away from the true mean (in either direction)?
Step 1: Translate the question
"More than 25 hours away from true mean" means: Either x̄ < 475 or x̄ > 525
Find: P(|x̄ - μ| > 25) = P(x̄ < 475) + P(x̄ > 525)
Step 2: Set up sampling distribution
μ = 500 hours
σ = 100 hours
n = 50
Check CLT: n = 50 ≥ 30 ✓
Step 3: Find sampling distribution parameters μₓ̄ = μ = 500
σₓ̄ = σ/√n = 100/√50 = 100/7.07 ≈ 14.14 hours
x̄ ~ Normal(500, 14.14) approximately
Step 4: Use symmetry By symmetry of normal distribution: P(x̄ < 475) = P(x̄ > 525)
So: P(x̄ < 475 or x̄ > 525) = 2 × P(x̄ > 525)
Step 5: Calculate P(x̄ > 525) Standardize: z = (525 - 500)/14.14 = 25/14.14 ≈ 1.77
P(x̄ > 525) = P(Z > 1.77)
Step 6: Look up probability From table: P(Z < 1.77) ≈ 0.9616
Therefore: P(Z > 1.77) = 1 - 0.9616 = 0.0384
Step 7: Find total probability P(more than 25 away) = 2 × 0.0384 = 0.0768 ≈ 0.077
Step 8: Interpret About 7.7% of samples will have means more than 25 hours from the true mean.
This means:
Step 9: Express in terms of standard errors
25 hours = 25/14.14 SE ≈ 1.77 SE
So we're asking: P(more than 1.77 SE from mean)
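From 68-95-99.7 rule: about 32% of sample means fall outside 1 SE and about 5% fall outside 2 SE
Outside 1.77 SE: ≈7.7%, which lies between those two values ✓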
Step 10: Decision rule for quality control Factory might use rule: "Flag sample if x̄ < 475 or x̄ > 525"
False alarm rate: 7.7%
About 1 in 13 good samples will be flagged
Reasonable tradeoff for quality control
Answer: P(|x̄ - μ| > 25) ≈ 0.077 or 7.7%
There's about a 7.7% probability that a sample mean will be more than 25 hours away from the true mean of 500 hours. This represents being more than 1.77 standard errors from the mean. Quality control can use this threshold to identify unusual samples that might indicate production problems.
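The two-tailed calculation can be sketched as a small function. `prob_mean_far_from_mu` is a hypothetical helper name, and the result relies on the normal approximation and the symmetry argument used in the solution.

```python
from math import sqrt
from statistics import NormalDist

def prob_mean_far_from_mu(distance, sigma, n):
    """P(|x-bar - mu| > distance), treating x-bar as Normal(mu, sigma / sqrt(n))."""
    se = sigma / sqrt(n)              # standard error of the mean
    z = distance / se                 # distance measured in standard errors
    upper_tail = 1 - NormalDist().cdf(z)
    return 2 * upper_tail             # both tails are equal by symmetry

p = prob_mean_far_from_mu(25, sigma=100, n=50)
print(f"SE = {100 / sqrt(50):.2f} hours, P(|x-bar - 500| > 25) = {p:.4f}")
```

The same function gives the false alarm rate for any other flagging threshold, for example `prob_mean_far_from_mu(30, sigma=100, n=50)` for a rule based on 470 and 530 hours.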
An elevator has a maximum safe weight of 2000 lbs. If adult weights are normally distributed with μ = 180 lbs and σ = 30 lbs, what is the probability that the total weight of 10 randomly selected adults will exceed the elevator's limit? What about 12 adults?
Step 1: Understand what we're finding
For n adults, total weight = n × x̄
Want: P(total weight > 2000)
Equivalently: P(n × x̄ > 2000)
Or: P(x̄ > 2000/n)
Step 2: Set up for n = 10 Maximum average weight: 2000/10 = 200 lbs per person
Find: P(x̄ > 200) when n = 10
Step 3: Sampling distribution for n = 10 Population is normal, so x̄ is normal for ANY n (don't need CLT!)
μₓ̄ = μ = 180 lbs
σₓ̄ = σ/√n = 30/√10 = 30/3.16 ≈ 9.49 lbs
x̄ ~ Normal(180, 9.49)
Step 4: Calculate P(x̄ > 200) for n = 10 Standardize: z = (200 - 180)/9.49 = 20/9.49 ≈ 2.11
P(x̄ > 200) = P(Z > 2.11)
From table: P(Z < 2.11) ≈ 0.9826
P(Z > 2.11) = 1 - 0.9826 = 0.0174
Step 5: Interpret n = 10 result About 1.74% chance that 10 adults exceed 2000 lbs Fairly safe - less than 2% risk
Step 6: Set up for n = 12 Maximum average weight: 2000/12 ≈ 166.67 lbs per person
Find: P(x̄ > 166.67) when n = 12
Step 7: Sampling distribution for n = 12 μₓ̄ = 180 lbs
σₓ̄ = σ/√n = 30/√12 = 30/3.46 ≈ 8.66 lbs
x̄ ~ Normal(180, 8.66)
Step 8: Calculate P(x̄ > 166.67) for n = 12 Standardize: z = (166.67 - 180)/8.66 = -13.33/8.66 ≈ -1.54
P(x̄ > 166.67) = P(Z > -1.54)
From table: P(Z < -1.54) ≈ 0.0618
P(Z > -1.54) = 1 - 0.0618 = 0.9382
Step 9: Interpret n = 12 result About 93.8% chance that 12 adults exceed 2000 lbs! Very risky - almost certain to exceed limit
Step 10: Why such a big difference?
n = 10: Need average > 200 lbs (20 lbs above μ) = 2.11 SE above mean. Unlikely!
n = 12: Need average > 166.67 lbs (13.33 lbs below μ) = 1.54 SE below mean. Very likely!
Step 11: Find maximum safe capacity
At what n does P(exceed) = 0.05 (5% risk)?
Need: P(x̄ > 2000/n) = 0.05
P(Z > z) = 0.05 means z = 1.645
(2000/n - 180)/(30/√n) = 1.645
Solving: 2000/n = 180 + 1.645(30/√n)
2000 = 180n + 49.35√n
Approximately n ≈ 10.2
Since n must be a whole number, 10 adults keeps the risk under 5% (about 1.7%), while 11 adults already pushes it to roughly 42%.
So maximum safe capacity is 10 adults for 5% risk level.
Answer: n = 10: P(exceed 2000 lbs) ≈ 0.017 or 1.7% n = 12: P(exceed 2000 lbs) ≈ 0.938 or 93.8%
With 10 adults, there's only about a 1.7% chance of exceeding the limit (relatively safe). With 12 adults, there's about a 93.8% chance of exceeding the limit (very dangerous!). The maximum allowable average weight drops from 200 lbs (n=10) to 166.67 lbs (n=12), and 166.67 is well below the population mean of 180, so the sample mean is very likely to exceed it.
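The elevator results can be reproduced by working with the total weight directly: the sum of n independent Normal(μ, σ) weights is Normal(nμ, σ√n), which is equivalent to the x̄ approach used above. This is a sketch; `prob_overload` and `max_safe` are hypothetical names, and the scan range of 1 to 20 adults is an arbitrary choice for illustration.

```python
from math import sqrt
from statistics import NormalDist

def prob_overload(n, limit=2000, mu=180, sigma=30):
    """P(total weight of n adults > limit), assuming independent normal weights."""
    total = NormalDist(n * mu, sigma * sqrt(n))  # distribution of the sum
    return 1 - total.cdf(limit)

for n in (10, 12):
    print(f"n = {n}: P(exceed 2000 lbs) = {prob_overload(n):.4f}")

# Largest group size whose overload risk stays at or below 5%
max_safe = max(n for n in range(1, 21) if prob_overload(n) <= 0.05)
print("largest n with risk <= 5%:", max_safe)
```

Scanning whole-number group sizes avoids solving the equation in √n by hand and makes the sharp jump in risk between 10 and 11 adults easy to see.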