Sampling Distributions

Distribution of sample statistics

Sampling Distributions

What is a Sampling Distribution?

Statistic: Number calculated from a sample (e.g., sample mean \bar{x}, sample proportion \hat{p})

Sampling Distribution: Distribution of a statistic across all possible samples of size n

Key insight: Statistics vary from sample to sample (sampling variability). Sampling distribution describes this variability.

Example: Sampling Distribution of \bar{x}

Population: All students, μ = 70, σ = 10

Take many samples of n = 25:

  • Sample 1: \bar{x}_1 = 72
  • Sample 2: \bar{x}_2 = 68
  • Sample 3: \bar{x}_3 = 71
  • ...

Plot all sample means → Sampling distribution of \bar{x}

Properties of Sampling Distribution of \bar{x}

Center:

\mu_{\bar{x}} = \mu

The sample mean is an unbiased estimator of the population mean

Spread:

\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}

Called standard error (SE)

Key: Larger sample → smaller standard error (more precise estimates)

Shape:

  • If population normal → sampling distribution exactly normal
  • If population not normal → approximately normal if n large enough (CLT)

Sampling Distribution of Sample Proportion

Population proportion: p
Sample proportion: \hat{p} = \frac{\text{count}}{n}

Center:

\mu_{\hat{p}} = p

Spread:

\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}

Shape: Approximately normal if np ≥ 10 and n(1-p) ≥ 10

Example: Coin Flips

Fair coin (p = 0.5), n = 100 flips

Center: \mu_{\hat{p}} = 0.5

Spread: \sigma_{\hat{p}} = \sqrt{\frac{0.5(0.5)}{100}} = \sqrt{0.0025} = 0.05

Shape: np = 50 ≥ 10, n(1-p) = 50 ≥ 10 → approximately normal

Interpretation: Sample proportions typically fall within about 0.05 of the true value p = 0.5
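To see this sampling variability concretely, here is a minimal simulation sketch; Python with NumPy is an assumption (the lesson doesn't prescribe any tool). It flips a fair coin 100 times over and over and checks that the sample proportions center near 0.5 with a standard deviation near 0.05.

```python
import numpy as np

rng = np.random.default_rng(42)
n, p, reps = 100, 0.5, 100_000

# Each repetition: count heads in 100 flips, then convert to a sample proportion
p_hats = rng.binomial(n, p, size=reps) / n

print("mean of p-hat:", round(p_hats.mean(), 3))  # ≈ 0.5
print("SD of p-hat:  ", round(p_hats.std(), 3))   # ≈ sqrt(0.5*0.5/100) = 0.05
```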

Bias vs Variability

Bias: Systematic over- or under-estimation

  • \mu_{\text{statistic}} \neq \text{parameter}

Variability: Spread of sampling distribution

  • Measured by standard error

Ideal: Low bias AND low variability (unbiased with small SE)

Increase n:

  • Doesn't reduce bias
  • DOES reduce variability (SE decreases)

Standard Error

Standard Error (SE): Standard deviation of sampling distribution

For sample mean: SE_{\bar{x}} = \frac{\sigma}{\sqrt{n}}

For sample proportion: SE_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}

Key pattern: SE ∝ 1/√n

To cut SE in half, need 4× sample size
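A quick numeric illustration of the 1/√n pattern (a sketch in Python; the value σ = 10 simply reuses the earlier student-score example): each 4× increase in n halves the standard error.

```python
import math

sigma = 10  # population SD from the earlier example

for n in (25, 100, 400):
    se = sigma / math.sqrt(n)
    print(f"n = {n:4d}  SE = {se:.2f}")
# Output: SE = 2.00, 1.00, 0.50 — quadrupling n cuts SE in half
```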

Using Sampling Distributions

Find probabilities about statistics:

Example: Population μ = 100, σ = 15. Sample n = 25.

P(\bar{x} > 105) = ?

\bar{x} \sim N(100, 15/\sqrt{25}) = N(100, 3)

Standardize: z = \frac{105 - 100}{3} = 1.67

P(Z > 1.67) ≈ 0.0475
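The same probability can be checked in software. This sketch assumes SciPy is available (an assumption, not part of the lesson); the small difference from 0.0475 comes from rounding z to 1.67 when using the table.

```python
from scipy.stats import norm  # assumes SciPy is installed

mu, sigma, n = 100, 15, 25
se = sigma / n**0.5                      # 15 / 5 = 3

prob = norm.sf(105, loc=mu, scale=se)    # survival function = P(x-bar > 105)
print(round(prob, 4))                    # ≈ 0.0478 (table answer 0.0475)
```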

Difference Between Two Means

Two independent samples:

\mu_{\bar{x}_1 - \bar{x}_2} = \mu_1 - \mu_2

\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}

Shape: Approximately normal if both samples meet conditions

Difference Between Two Proportions

\mu_{\hat{p}_1 - \hat{p}_2} = p_1 - p_2

\sigma_{\hat{p}_1 - \hat{p}_2} = \sqrt{\frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2}}

Conditions: Each sample meets np ≥ 10 and n(1-p) ≥ 10
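Both difference formulas are straightforward to compute. Here is a small helper sketch (Python is an assumption, and the example numbers are purely illustrative) for the standard error of a difference of means and of a difference of proportions.

```python
import math

def se_diff_means(sigma1, n1, sigma2, n2):
    """Standard error of x̄₁ - x̄₂ for independent samples."""
    return math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)

def se_diff_props(p1, n1, p2, n2):
    """Standard error of p̂₁ - p̂₂ for independent samples."""
    return math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

# Illustrative numbers only
print(round(se_diff_means(20, 40, 15, 50), 2))      # sqrt(10 + 4.5) ≈ 3.81
print(round(se_diff_props(0.6, 200, 0.5, 250), 3))  # ≈ 0.047
```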

Simulating Sampling Distributions

Steps:

  1. Take sample of size n from population
  2. Calculate statistic
  3. Repeat many times
  4. Plot distribution of statistics

Result: Empirical approximation of theoretical sampling distribution
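A minimal sketch of those four steps, assuming Python with NumPy and an exponential (right-skewed) population chosen just for illustration: the empirical mean and SD of the sample means land near μ and σ/√n, and the histogram is close to bell-shaped even though the population is skewed.

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps = 25, 10_000

# Steps 1-3: repeatedly draw a sample from a skewed population and record its mean
samples = rng.exponential(scale=10, size=(reps, n))  # population mean 10, SD 10
sample_means = samples.mean(axis=1)

# Step 4: summarize the empirical sampling distribution
print("mean of sample means:", round(sample_means.mean(), 2))  # ≈ 10
print("SD of sample means:  ", round(sample_means.std(), 2))   # ≈ 10/sqrt(25) = 2
counts, edges = np.histogram(sample_means, bins=30)            # roughly bell-shaped
```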

Common Misconceptions

❌ Confusing population distribution with sampling distribution
❌ Thinking larger sample reduces bias (only reduces variability)
❌ Forgetting √n in denominator of SE
❌ Using σ instead of σ/√n for \bar{x}

Quick Reference

Sampling Distribution of \bar{x}:

  • Center: μ
  • Spread: σ/√n
  • Shape: Normal (if population normal or n large)

Sampling Distribution of \hat{p}:

  • Center: p
  • Spread: √(p(1-p)/n)
  • Shape: Normal (if np ≥ 10 and n(1-p) ≥ 10)

Remember: Statistics vary from sample to sample. Sampling distribution describes this variability!

📚 Practice Problems

Problem 1 (easy)

Question:

What is a sampling distribution? How does it differ from the population distribution and sample distribution?

💡 Solution

Step 1: Define the population distribution.

This is the distribution of individual values in the ENTIRE population.

Example: Heights of ALL adults in US

  • Mean: μ = 68 inches
  • SD: σ = 3 inches
  • Shape: approximately normal

Step 2: Define the sample distribution.

This is the distribution of values in ONE specific sample.

Example: Heights of 50 adults we measured

  • Mean: x̄ = 67.5 inches (sample mean)
  • SD: s = 2.8 inches (sample SD)
  • Shape: approximately normal (like population)
  • This is ONE sample

Step 3: Define the sampling distribution.

This is the distribution of a STATISTIC across ALL POSSIBLE samples.

Example: Distribution of x̄ (sample mean) from all possible samples of size n = 50

  • This is NOT about individual heights
  • This is about SAMPLE MEANS
  • Each possible sample of 50 gives one x̄
  • Sampling distribution = distribution of all those x̄'s

Step 4: Key differences.

POPULATION DISTRIBUTION:

  • What: Individual values
  • Size: N (entire population)
  • Parameters: μ, σ
  • Usually don't know exactly

SAMPLE DISTRIBUTION:

  • What: Individual values in one sample
  • Size: n (one sample)
  • Statistics: x̄, s
  • Estimates population

SAMPLING DISTRIBUTION:

  • What: Sample statistics (like x̄) across all samples
  • Size: All possible samples
  • Parameters: μₓ̄, σₓ̄
  • Theoretical distribution

Step 5: Example with dice (verified in the code sketch at the end of this step).

Population: All possible rolls of a die

  • Values: {1, 2, 3, 4, 5, 6}
  • μ = 3.5, σ = 1.71

Sample: One roll → got {4}

  • Just one value

Sampling distribution of x̄ for n = 2:

  • Take all possible pairs: (1,1), (1,2), ..., (6,6)
  • Calculate mean of each pair
  • Distribution of those means
  • μₓ̄ = 3.5 (same as population)
  • σₓ̄ = 1.71/√2 ≈ 1.21 (smaller than population)
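Because n = 2 gives only 36 equally likely pairs, these numbers can be verified by brute-force enumeration. A short sketch (Python assumed, standard library only):

```python
from itertools import product
from statistics import mean, pstdev

# All 36 equally likely ordered pairs of die rolls, and the mean of each pair
pair_means = [mean(pair) for pair in product(range(1, 7), repeat=2)]

print(mean(pair_means))              # 3.5  (same as the population mean)
print(round(pstdev(pair_means), 2))  # 1.21 (= 1.71 / sqrt(2))
```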

Step 6: Why sampling distributions matter.

We take ONE sample and calculate x̄. We want to know: how far is our x̄ from μ?

Sampling distribution tells us:

  • Expected value of x̄
  • Variability of x̄
  • Shape of x̄ distribution
  • Allows us to make inferences!

Step 7: Visual representation.

Population: individual heights 62, 65, 68, 71, 74, ... (many values). Distribution: μ = 68, σ = 3.

Sample (n = 50): individual heights in our sample 66, 67, 69, ... (50 values). x̄ = 67.5.

Sampling distribution: all possible x̄'s from samples of size 50. Distribution: μₓ̄ = 68, σₓ̄ = 3/√50 ≈ 0.42. Shape: normal (by CLT).

Answer: POPULATION DISTRIBUTION: Distribution of individual values in entire population (μ, σ).

SAMPLE DISTRIBUTION: Distribution of individual values in ONE specific sample (x̄, s).

SAMPLING DISTRIBUTION: Distribution of a sample statistic (like x̄) across ALL POSSIBLE samples of size n. Tells us how the statistic varies from sample to sample.

Key: Sampling distribution lets us understand variability of our sample statistics and make inferences about population parameters.

Problem 2 (easy)

Question:

A population has μ = 50 and σ = 12. If we take samples of size n = 36, what are the mean and standard deviation of the sampling distribution of x̄?

💡 Solution

Step 1: Identify given information.

Population parameters: μ = 50, σ = 12

Sample size: n = 36

Find: μₓ̄ and σₓ̄ (mean and SD of sampling distribution)

Step 2: Find the mean of the sampling distribution.

Formula: μₓ̄ = μ

The mean of the sampling distribution equals the population mean!

μₓ̄ = 50

Step 3: Why does μₓ̄ = μ?

The sample mean x̄ is an UNBIASED estimator of μ. On average, x̄ equals μ: sometimes above, sometimes below, but the average of all possible x̄'s equals μ.

This is true regardless of sample size!

Step 4: Find the standard deviation of the sampling distribution.

Formula: σₓ̄ = σ/√n

Also called "standard error of the mean"

σₓ̄ = 12/√36 = 12/6 = 2

Step 5: Interpret σₓ̄.

The standard deviation of the sampling distribution is 2.

This means:

  • Individual values vary with SD = 12
  • Sample means vary with SD = 2
  • Sample means are LESS variable than individuals!

Makes sense: averaging reduces variability

Step 6: Compare the individual and sampling distributions.

INDIVIDUAL VALUES (population): μ = 50, σ = 12. Values are spread out.

SAMPLE MEANS (sampling distribution): μₓ̄ = 50 (same center), σₓ̄ = 2 (much less spread). Means cluster closer to μ.

Step 7: Effect of sample size.

If we increased to n = 100: σₓ̄ = 12/√100 = 12/10 = 1.2. Even less variability!

If we decreased to n = 9: σₓ̄ = 12/√9 = 12/3 = 4. More variability.

Larger samples → more precise estimates → smaller SE

Step 8: Visual comparison.

On a number line, population values (σ = 12) spread roughly from 26 to 62, while sample means from n = 36 (σₓ̄ = 2) cluster roughly between 48 and 52. Sample means cluster much tighter around μ!

Answer: μₓ̄ = 50, σₓ̄ = 2

The sampling distribution of x̄ has the same mean as the population (50) but much smaller standard deviation (2 vs 12). Sample means are less variable than individual values - they cluster more tightly around the population mean.
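A short simulation sketch can confirm these two numbers; Python with NumPy is assumed, and a normal population with μ = 50, σ = 12 is used only for illustration (the formulas μₓ̄ = μ and σₓ̄ = σ/√n do not depend on the population's shape).

```python
import numpy as np

rng = np.random.default_rng(1)
mu, sigma, n, reps = 50, 12, 36, 50_000

samples = rng.normal(mu, sigma, size=(reps, n))  # illustrative normal population
x_bars = samples.mean(axis=1)

print(round(x_bars.mean(), 2))  # ≈ 50 (unbiased)
print(round(x_bars.std(), 2))   # ≈ 12/6 = 2
```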

Problem 3 (medium)

Question:

What does the Central Limit Theorem (CLT) state? Why is it important?

💡 Solution

Step 1: State the Central Limit Theorem.

For a random sample of size n from ANY population with mean μ and standard deviation σ:

As n increases, the sampling distribution of x̄ approaches a normal distribution with:

  • Mean: μₓ̄ = μ
  • Standard deviation: σₓ̄ = σ/√n

Regardless of the population's shape!

Step 2: Key components

  1. Works for ANY population distribution

    • Normal, skewed, uniform, bimodal, anything!
  2. Larger n → more normal

    • Rule of thumb: n ≥ 30 usually sufficient
    • If population is normal, works for any n
    • If population is very skewed, need larger n
  3. Gives us the parameters: μₓ̄ = μ, σₓ̄ = σ/√n

Step 3: Why it's remarkable.

The population could be:

  • Heavily skewed
  • Bimodal
  • Discrete
  • Any weird shape

But sampling distribution of x̄ is approximately NORMAL!

This is counterintuitive but proven mathematically.

Step 4: Example with dice (see the simulation sketch at the end of this step).

Population: uniform on {1, 2, 3, 4, 5, 6}

  • Discrete, rectangular shape
  • μ = 3.5, σ = 1.71

Sampling distribution of x̄:

  • n = 1: looks uniform (rectangular)
  • n = 5: starting to look bell-shaped
  • n = 30: very close to normal!
  • As n → ∞: perfectly normal
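A simulation sketch of this dice example (Python with NumPy assumed): as n grows, the sample means stay centered at 3.5 while their SD shrinks like 1.71/√n, and a histogram of the means looks increasingly bell-shaped.

```python
import numpy as np

rng = np.random.default_rng(7)
reps = 100_000

for n in (1, 5, 30):
    rolls = rng.integers(1, 7, size=(reps, n))  # fair die: faces 1..6
    means = rolls.mean(axis=1)
    print(f"n = {n:2d}  mean = {means.mean():.2f}  SD = {means.std():.3f}")
# Expected SDs: 1.708, 0.764, 0.312 — shrinking like 1.71/sqrt(n)
```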

Step 5: Why the CLT is important.

It allows us to use normal probabilities!

Even if we don't know population shape:

  • Can assume x̄ ~ Normal (if n large enough)
  • Can calculate P(x̄ in some range)
  • Can create confidence intervals
  • Can perform hypothesis tests

All based on normal distribution properties!

Step 6: Practical applications.

Quality control: measure the sample mean weight

  • Individual boxes might be any distribution
  • But x̄ for n = 50 boxes is approximately normal
  • Can calculate P(x̄ is too far from target)

Medical: Average blood pressure in sample

  • Individual BP's vary unpredictably
  • But x̄ for n = 100 patients is approximately normal
  • Can make inferences about population mean

Step 7: Limitations.

The CLT applies to:
✓ Sample mean x̄
✓ Sample sum Σx (also becomes normal)
✓ Sample proportion p̂ (a special case)

It does NOT apply to:
✗ Individual values (they keep the population's shape)
✗ Sample median (has a different sampling distribution)
✗ Sample maximum/minimum

Step 8: How large is "large enough"?

General rules:

  • n ≥ 30: usually sufficient for CLT
  • Population normal: CLT works for any n
  • Population moderately skewed: n ≥ 15 okay
  • Population heavily skewed: need n ≥ 40 or more
  • Population has outliers: may need very large n

Answer: The Central Limit Theorem states that the sampling distribution of x̄ approaches a normal distribution with mean μ and standard deviation σ/√n as sample size increases, REGARDLESS of the population's shape.

Importance:

  1. Lets us use normal probabilities for x̄ even when population isn't normal
  2. Foundation for confidence intervals and hypothesis tests
  3. Explains why normal distribution appears so often in nature
  4. Works for almost any population (very general theorem)

This is perhaps the most important theorem in statistics!

Problem 4 (medium)

Question:

A population is right-skewed with μ = 80 and σ = 15. For samples of size n = 50, find the probability that x̄ is between 78 and 82.

💡 Solution

Step 1: Check whether we can use the normal approximation.

The population is right-skewed (not normal), but n = 50 ≥ 30. By the Central Limit Theorem, the sampling distribution of x̄ is approximately normal!

Step 2: Find the parameters of the sampling distribution.

μₓ̄ = μ = 80

σₓ̄ = σ/√n = 15/√50 ≈ 15/7.07 ≈ 2.12

Step 3: Set up the probability question.

Find: P(78 < x̄ < 82)

x̄ ~ Normal(μ = 80, σ = 2.12) approximately

Step 4: Standardize to z-scores.

z₁ = (78 - 80)/2.12 = -2/2.12 ≈ -0.94

z₂ = (82 - 80)/2.12 = 2/2.12 ≈ 0.94

Step 5: Find the probability.

P(78 < x̄ < 82) = P(-0.94 < Z < 0.94)

From the standard normal table: P(Z < 0.94) ≈ 0.8264 and P(Z < -0.94) ≈ 0.1736

P(-0.94 < Z < 0.94) = 0.8264 - 0.1736 = 0.6528

Step 6: Interpret.

About 65.3% of samples of size 50 will have a sample mean between 78 and 82.

Even though population is skewed:

  • Individual values spread out (σ = 15)
  • Sample means cluster near μ = 80 (σₓ̄ = 2.12)
  • Distribution of x̄ is approximately normal

Step 7: Compare to individual values.

What if we asked for P(78 < X < 82) for an individual value?

Can't answer! We'd need the population distribution shape. Right-skewed means not symmetric, so normal approximation doesn't work for individuals.

But for x̄ with n = 50, CLT saves us - we CAN use normal!

Step 8: Verify reasonableness.

The range 78-82 is μ ± 2. In terms of standard errors, 2/2.12 ≈ 0.94, so the range reaches about 1 SE on each side of the mean.

For a normal distribution, P(μ - 1σ < X < μ + 1σ) ≈ 0.68. Our answer 0.6528 ≈ 0.65 is close ✓

Answer: P(78 < x̄ < 82) ≈ 0.653 or 65.3%

Despite the population being right-skewed, the Central Limit Theorem allows us to treat the sampling distribution of x̄ as approximately normal (since n = 50 ≥ 30). About 65% of samples will have means within 2 units of the population mean.
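For reference, the same answer computed directly from the normal model (SciPy assumed); it differs slightly from 0.6528 only because the table solution rounds z to 0.94.

```python
from scipy.stats import norm  # assumes SciPy is installed

mu, sigma, n = 80, 15, 50
se = sigma / n**0.5                                 # ≈ 2.121

prob = norm.cdf(82, mu, se) - norm.cdf(78, mu, se)  # P(78 < x-bar < 82)
print(round(prob, 4))                               # ≈ 0.654
```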

Problem 5 (hard)

Question:

Two independent populations: Population A (μ = 100, σ = 20) and Population B (μ = 90, σ = 15). Take samples of n₁ = 40 from A and n₂ = 50 from B. Find the mean and standard deviation of the sampling distribution of x̄₁ - x̄₂. What is P(x̄₁ - x̄₂ > 15)?

💡 Solution

Step 1: Set up the problem.

Population A: μ₁ = 100, σ₁ = 20, n₁ = 40
Population B: μ₂ = 90, σ₂ = 15, n₂ = 50

Want distribution of: x̄₁ - x̄₂ (difference of sample means)

Step 2: Find the mean of the difference.

For independent samples: μₓ̄₁₋ₓ̄₂ = μ₁ - μ₂ = 100 - 90 = 10

Expected difference is 10.

Step 3: Find the standard deviation of the difference.

For independent samples: σₓ̄₁₋ₓ̄₂ = √(σ₁²/n₁ + σ₂²/n₂)

Calculate each term:
σ₁²/n₁ = 20²/40 = 400/40 = 10
σ₂²/n₂ = 15²/50 = 225/50 = 4.5

σₓ̄₁₋ₓ̄₂ = √(10 + 4.5) = √14.5 ≈ 3.81

Step 4: Check CLT conditions.

n₁ = 40 ≥ 30 ✓ and n₂ = 50 ≥ 30 ✓

By the CLT, x̄₁ and x̄₂ are each approximately normal, so x̄₁ - x̄₂ is also approximately normal:

x̄₁ - x̄₂ ~ Normal(μ = 10, σ ≈ 3.81)

Step 5: Find P(x̄₁ - x̄₂ > 15).

Standardize: z = (15 - 10)/3.81 = 5/3.81 ≈ 1.31

P(x̄₁ - x̄₂ > 15) = P(Z > 1.31)

Step 6: Look up the probability.

From the standard normal table: P(Z < 1.31) ≈ 0.9049

Therefore: P(Z > 1.31) = 1 - 0.9049 = 0.0951

Step 7: Interpret.

There is about a 9.5% chance that the sample mean from A exceeds the sample mean from B by more than 15.

This makes sense:

  • Expected difference is only 10
  • 15 is (15-10)/3.81 ≈ 1.31 SE above expected
  • Fairly unlikely but not extremely rare

Step 8: Why variances add (not subtract).

Even though we're finding a difference of means, we ADD the variances.

Why? Variability adds when combining random variables.

  • If x̄₁ varies: contributes to variation in difference
  • If x̄₂ varies: also contributes to variation in difference
  • Both sources of variation combine

Formula: Var(X - Y) = Var(X) + Var(Y) [for independent X, Y]

Step 9: Verify the independence assumption.

The samples must be independent:
✓ The sample from A doesn't affect the sample from B
✓ Different populations
✓ Random samples

If not independent (e.g., paired data), would need different approach!

Step 10: Summary of formulas used.

For independent samples:

  • μₓ̄₁₋ₓ̄₂ = μ₁ - μ₂
  • σₓ̄₁₋ₓ̄₂ = √(σ₁²/n₁ + σ₂²/n₂)
  • Distribution: approximately normal (if CLT applies)

Answer: μₓ̄₁₋ₓ̄₂ = 10, σₓ̄₁₋ₓ̄₂ ≈ 3.81, P(x̄₁ - x̄₂ > 15) ≈ 0.095 or 9.5%

The difference in sample means has a mean of 10 and standard deviation of about 3.81. There's about a 9.5% chance that the sample mean from Population A exceeds the sample mean from Population B by more than 15.
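The same calculation as a short code sketch (SciPy assumed); the small difference from 0.0951 comes from not rounding z to 1.31.

```python
from math import sqrt
from scipy.stats import norm  # assumes SciPy is installed

mu1, sigma1, n1 = 100, 20, 40
mu2, sigma2, n2 = 90, 15, 50

mean_diff = mu1 - mu2                             # 10
se_diff = sqrt(sigma1**2 / n1 + sigma2**2 / n2)   # sqrt(14.5) ≈ 3.81

prob = norm.sf(15, loc=mean_diff, scale=se_diff)  # P(x̄1 - x̄2 > 15)
print(round(se_diff, 2), round(prob, 4))          # 3.81  ≈ 0.0946
```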