Normal Distributions

Properties and calculations with normal curves

Introduction

The normal distribution (also called Gaussian distribution or bell curve) is the most important probability distribution in statistics. Many natural phenomena approximately follow a normal distribution, and it forms the foundation for much of statistical inference.

Characteristics of Normal Distributions

Shape Properties

1. Bell-shaped curve:

  • Symmetric around the center
  • Single peak at the mean
  • Tails extend infinitely in both directions (approaching but never touching the x-axis)

2. Symmetric:

  • Left side mirrors right side
  • Mean = Median = Mode
  • If folded at center, both halves match perfectly

3. Unimodal:

  • Single peak (at the mean)
  • Highest point at center
  • Decreases smoothly on both sides

4. Asymptotic:

  • Tails get closer and closer to x-axis
  • Never actually reach zero
  • Theoretically extends to $-\infty$ and $+\infty$

The Normal Curve Equation

Probability density function:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$$

Don't memorize this! Just know:

  • Defined by two parameters: $\mu$ (mean) and $\sigma$ (standard deviation)
  • Curve determined entirely by $\mu$ (location) and $\sigma$ (spread)
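
As a rough check on the density formula, the Python sketch below evaluates it directly and approximates the total area under the curve with a midpoint sum. The function name `normal_pdf`, the example values $\mu = 100$ and $\sigma = 15$, and the integration range of 8 standard deviations on each side are illustrative assumptions, not part of these notes.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of a normal distribution at x, written straight from the formula."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

# Illustrative parameters (assumed for this sketch).
mu, sigma = 100, 15

# Midpoint Riemann sum over mu +/- 8 sigma; the tails beyond that are negligible.
n = 10_000
lo, hi = mu - 8 * sigma, mu + 8 * sigma
width = (hi - lo) / n
area = sum(normal_pdf(lo + (i + 0.5) * width, mu, sigma) for i in range(n)) * width

print(round(normal_pdf(mu, mu, sigma), 4))  # 0.0266 (height of the peak)
print(round(area, 6))                       # 1.0
```

The area comes out at 1 for any valid choice of mean and standard deviation, which is what makes the curve a probability distribution.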

Parameters: Mean and Standard Deviation

Mean ($\mu$)

Controls location:

  • Center of distribution
  • Peak of curve
  • Balance point

Changing $\mu$:

  • Shifts distribution left or right
  • Doesn't change shape
  • Doesn't change spread

Example:

  • Distribution A: $\mu = 50$, centered at 50
  • Distribution B: $\mu = 70$, centered at 70
  • B is shifted 20 units right from A

Standard Deviation ($\sigma$)

Controls spread:

  • How spread out distribution is
  • Width of bell curve
  • Distance from mean to inflection points

Changing $\sigma$:

  • Larger $\sigma$ → wider, flatter curve
  • Smaller $\sigma$ → narrower, taller curve
  • Doesn't change center
  • Total area under curve stays 1.0

Example:

  • Distribution A: $\sigma = 5$, narrow and tall
  • Distribution B: $\sigma = 15$, wide and flat
  • Both centered at same $\mu$, but B more spread out
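
The effect of each parameter can be checked numerically. The sketch below uses Python's standard-library `statistics.NormalDist`. The common center of 50 is an assumption made for illustration, since the example only says both distributions share the same mean.

```python
from statistics import NormalDist

# Assumed common center; the example above does not give a specific value.
MU = 50

narrow = NormalDist(mu=MU, sigma=5)    # Distribution A
wide = NormalDist(mu=MU, sigma=15)     # Distribution B

peak_a = narrow.pdf(MU)  # height of the curve at its center
peak_b = wide.pdf(MU)
print(round(peak_a, 4), round(peak_b, 4))  # 0.0798 0.0266

# Shifting the mean moves the curve without changing its height or width.
shifted = NormalDist(mu=MU + 20, sigma=5)
print(round(shifted.pdf(MU + 20), 4))      # 0.0798
```

Tripling the standard deviation cuts the peak height to one third, which keeps the total area at 1.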

The Empirical Rule (68-95-99.7 Rule)

For normal distributions:

68% of data within 1 standard deviation of mean
$\mu - 1\sigma$ to $\mu + 1\sigma$

95% of data within 2 standard deviations of mean
$\mu - 2\sigma$ to $\mu + 2\sigma$

99.7% of data within 3 standard deviations of mean
$\mu - 3\sigma$ to $\mu + 3\sigma$

Example Application

IQ scores: $\mu = 100$, $\sigma = 15$

68% of people have IQ between: $100 - 15 = 85$ and $100 + 15 = 115$

95% of people have IQ between: $100 - 30 = 70$ and $100 + 30 = 130$

99.7% of people have IQ between: $100 - 45 = 55$ and $100 + 45 = 145$
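
The 68-95-99.7 figures are rounded. A short Python sketch, using the standard-library `statistics.NormalDist` and the IQ parameters from this example, shows the intervals together with the more precise areas; the variable names are illustrative.

```python
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=15)

coverage = {}
for k in (1, 2, 3):
    low = iq.mean - k * iq.stdev
    high = iq.mean + k * iq.stdev
    # Area between low and high = area left of high minus area left of low.
    coverage[k] = iq.cdf(high) - iq.cdf(low)
    print(f"within {k} SD: {low:.0f} to {high:.0f}, area = {coverage[k]:.4f}")
# within 1 SD: 85 to 115, area = 0.6827
# within 2 SD: 70 to 130, area = 0.9545
# within 3 SD: 55 to 145, area = 0.9973
```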

Using the Empirical Rule

Quick mental calculations:

Example: Heights of adult males: $\mu = 70$ inches, $\sigma = 3$ inches

Q: What percentage between 67 and 73 inches?
A: 67 to 73 is $\mu \pm 1\sigma$ → 68%

Q: What percentage above 76 inches?
A: 76 is $\mu + 2\sigma$, so 95% are between 64 and 76
Above 76 = (100% - 95%) / 2 = 2.5%

Q: What percentage below 64 inches?
A: 64 is $\mu - 2\sigma$
Below 64 = (100% - 95%) / 2 = 2.5%
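
The tail calculation above can be compared with the exact area. In the sketch below, `tail_from_rule` is a hypothetical helper that mirrors the mental arithmetic, and `statistics.NormalDist` from the Python standard library supplies the exact values.

```python
from statistics import NormalDist

def tail_from_rule(central_percent):
    """One-tail percentage left over after removing a symmetric central area."""
    return (100 - central_percent) / 2

heights = NormalDist(mu=70, sigma=3)

rule_above_76 = tail_from_rule(95)                          # empirical-rule estimate
exact_above_76 = 100 * (1 - heights.cdf(76))                # exact upper-tail area
exact_67_to_73 = 100 * (heights.cdf(73) - heights.cdf(67))  # exact central area

print(rule_above_76)              # 2.5
print(round(exact_above_76, 2))   # 2.28
print(round(exact_67_to_73, 2))   # 68.27
```

The rule's 2.5% is close to the exact 2.28%; the gap comes from rounding 95.45% down to 95%.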

The Standard Normal Distribution (Z-distribution)

Definition

Standard Normal: Normal distribution with $\mu = 0$ and $\sigma = 1$

Denoted: $N(0, 1)$ or Z-distribution

Why it matters:

  • Reference distribution
  • All normal distributions can be standardized to this
  • Tables and calculators use standard normal

Z-Scores

Z-score (standardized score): Number of standard deviations from the mean

Formula: $z = \frac{x - \mu}{\sigma}$

Where:

  • $x$ = observed value
  • $\mu$ = mean
  • $\sigma$ = standard deviation

Interpretation:

  • $z = 0$: At the mean
  • $z = 1$: One SD above mean
  • $z = -1$: One SD below mean
  • $z = 2.5$: 2.5 SD above mean
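
Standardizing does not change the area to the left of a value, which is why one table serves every normal distribution. The Python sketch below illustrates this with assumed values (mean 75, SD 8, observed value 83) and the standard-library `statistics.NormalDist`.

```python
from statistics import NormalDist

# Illustrative values, assumed for this sketch.
mu, sigma, x = 75, 8, 83

z = (x - mu) / sigma  # number of SDs that x lies above the mean

area_original = NormalDist(mu, sigma).cdf(x)  # P(X < 83) on the original scale
area_standard = NormalDist(0, 1).cdf(z)       # P(Z < 1) on the standard normal

print(z)                        # 1.0
print(round(area_original, 4))  # 0.8413
print(round(area_standard, 4))  # 0.8413
```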

Calculating Z-Scores

Example: Test scores with $\mu = 75$, $\sigma = 8$

Score of 83: $z = \frac{83 - 75}{8} = \frac{8}{8} = 1$

Score is 1 SD above mean

Score of 67: $z = \frac{67 - 75}{8} = \frac{-8}{8} = -1$

Score is 1 SD below mean

Score of 91: $z = \frac{91 - 75}{8} = \frac{16}{8} = 2$

Score is 2 SD above mean
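
The same three calculations can be written as a small Python function. The name `z_score` and the check that the standard deviation is positive are illustrative choices.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean."""
    if sigma <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mu) / sigma

# Test scores with mean 75 and SD 8, as in the worked example.
for score in (83, 67, 91):
    print(score, z_score(score, mu=75, sigma=8))
# 83 1.0
# 67 -1.0
# 91 2.0
```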

Using Z-Scores

Purposes:

  1. Standardize different distributions for comparison
  2. Find probabilities using standard normal table
  3. Identify unusual values (typically |z| > 2 or 3)
  4. Compare across different scales

Example comparison:

Student A: Math score 85 (class $\mu = 75$, $\sigma = 8$)
$z = \frac{85-75}{8} = 1.25$

Student B: English score 88 (class $\mu = 80$, $\sigma = 5$)
$z = \frac{88-80}{5} = 1.6$

Student B performed better relative to their class (higher z-score)!
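
A minimal Python sketch of this comparison follows. The percentile step is an added assumption: it is only meaningful if each class's scores are roughly normal, which the example does not state.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean."""
    return (x - mu) / sigma

z_math = z_score(85, mu=75, sigma=8)     # Student A
z_english = z_score(88, mu=80, sigma=5)  # Student B

# Percentile ranks, assuming each class's scores are approximately normal.
std_normal = NormalDist()
pct_math = std_normal.cdf(z_math)
pct_english = std_normal.cdf(z_english)

print(z_math, round(pct_math, 3))        # 1.25 0.894
print(z_english, round(pct_english, 3))  # 1.6 0.945
```

Under that assumption, Student B sits near the 95th percentile of their class and Student A near the 89th.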

Finding Areas Under the Normal Curve

Methods

1. Empirical Rule (for z = ±1, ±2, ±3)

2. Standard Normal Table (z-table)

  • Gives area to LEFT of z-score
  • Also called cumulative probability

3. Calculator

  • normalcdf function
  • More accurate, easier

Using the Table

Area to the left of z:

  • Look up z in table directly
  • Example: z = 1.23 → area = 0.8907
  • Meaning: 89.07% of data below z = 1.23

Area to the right of z:

  • Area to right = 1 - area to left
  • Example: z = 1.23 → area to left = 0.8907
  • Area to right = 1 - 0.8907 = 0.1093 (10.93%)

Area between two z-scores:

  • Find area to left of each
  • Subtract smaller from larger
  • Example: Between z = -1 and z = 1
    • Area left of 1: 0.8413
    • Area left of -1: 0.1587
    • Between: 0.8413 - 0.1587 = 0.6826 (68.26%)
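
The three table lookups can be reproduced in Python, where the `cdf` method of the standard-library `statistics.NormalDist` plays the role of the z-table (area to the left). This is a sketch for checking table work, not a replacement for it.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, SD 1

left = Z.cdf(1.23)              # area to the left of z = 1.23
right = 1 - left                # area to the right of z = 1.23
between = Z.cdf(1) - Z.cdf(-1)  # area between z = -1 and z = 1

print(round(left, 4))     # 0.8907
print(round(right, 4))    # 0.1093
print(round(between, 4))  # 0.6827
```

The last value differs from the table result (0.6826) in the fourth decimal place because the table entries are themselves rounded before subtracting.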

Calculator Method

TI-83/84:

normalcdf(lower, upper, mean, SD)

Examples:

Area between 65 and 75 ($\mu = 70$, $\sigma = 5$):
normalcdf(65, 75, 70, 5) → 0.6827

Area above 80:
normalcdf(80, 999999, 70, 5) → 0.0228

Area below 60:
normalcdf(-999999, 60, 70, 5) → 0.0228
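
For readers without a TI calculator, the sketch below defines a hypothetical Python helper, also named `normalcdf`, with the same argument order. It is built on the standard-library `statistics.NormalDist` and uses infinity in place of the calculator's large placeholder bounds.

```python
import math
from statistics import NormalDist

def normalcdf(lower, upper, mean=0.0, sd=1.0):
    """Area under the normal curve between lower and upper (TI-style argument order)."""
    dist = NormalDist(mean, sd)
    return dist.cdf(upper) - dist.cdf(lower)

print(round(normalcdf(65, 75, 70, 5), 4))         # 0.6827
print(round(normalcdf(80, math.inf, 70, 5), 4))   # 0.0228
print(round(normalcdf(-math.inf, 60, 70, 5), 4))  # 0.0228
```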

Finding Values from Areas (Inverse Normal)

The Inverse Problem

Given: Probability (area)
Find: Corresponding x-value or z-score

Example: Find the score such that 90% of students score below it

Calculator Method

TI-83/84:

invNorm(area to left, mean, SD)

Examples:

90th percentile ($\mu = 70$, $\sigma = 5$):
invNorm(0.90, 70, 5) → 76.4

Meaning: 90% score below 76.4

25th percentile (Q1):
invNorm(0.25, 70, 5) → 66.6

75th percentile (Q3):
invNorm(0.75, 70, 5) → 73.4
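
The inverse lookup can be sketched in Python with the `inv_cdf` method of the standard-library `statistics.NormalDist`. The wrapper name `inv_norm` is hypothetical and simply mirrors the calculator's argument order.

```python
from statistics import NormalDist

def inv_norm(area_left, mean=0.0, sd=1.0):
    """Value with the given area to its left; area_left must be strictly between 0 and 1."""
    return NormalDist(mean, sd).inv_cdf(area_left)

for p in (0.90, 0.25, 0.75):
    print(p, round(inv_norm(p, 70, 5), 1))
# 0.9 76.4
# 0.25 66.6
# 0.75 73.4
```

Because the curve is symmetric, Q1 and Q3 sit the same distance (about 3.4 here) on either side of the mean.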

Assessing Normality

Why It Matters

Many statistical methods assume normality. We need to check if data is approximately normal before applying these methods.

Methods to Assess Normality

1. Histogram/Dotplot:

  • Look for bell shape
  • Check for symmetry
  • Quick visual check

2. Normal Probability Plot (Q-Q Plot):

  • Plot observed values vs. expected normal values
  • If roughly linear → approximately normal
  • If curved or non-linear → not normal

3. Numerical Checks:

  • Mean ≈ Median (symmetry)
  • Few outliers by 1.5 × IQR rule
  • About 95% of data within $\mu \pm 2\sigma$
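
The numerical checks and the coordinates for a normal probability plot can be sketched in Python using only the standard library. Everything here is illustrative: the function names `normality_summary` and `qq_points`, the simulated sample (1000 values drawn from a normal distribution with mean 70 and SD 3), and the plotting positions $(i + 0.5)/n$, which are one common convention among several.

```python
import random
from statistics import NormalDist, mean, median, stdev

def normality_summary(data):
    """Mean vs. median, plus the share of values within 1 and 2 sample SDs of the mean."""
    m, s = mean(data), stdev(data)

    def share_within(k):
        return sum(abs(x - m) <= k * s for x in data) / len(data)

    return {
        "mean": m,
        "median": median(data),
        "within_1sd": share_within(1),
        "within_2sd": share_within(2),
    }

def qq_points(data):
    """Pairs (expected z-score, observed value) for a normal probability plot."""
    n = len(data)
    std_normal = NormalDist()
    # Assumed plotting positions (i + 0.5) / n for the sorted values.
    return [(std_normal.inv_cdf((i + 0.5) / n), x)
            for i, x in enumerate(sorted(data))]

rng = random.Random(0)  # fixed seed so the run is repeatable
sample = [rng.gauss(70, 3) for _ in range(1000)]  # simulated data, not real measurements

summary = normality_summary(sample)
points = qq_points(sample)
print(summary)
print(points[0], points[-1])
```

For roughly normal data, the two shares should land near 0.68 and 0.95, and plotting `points` should give a roughly straight line.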

What to Look For

Approximately normal:
✓ Bell-shaped histogram
✓ Linear normal probability plot
✓ Mean ≈ Median
✓ About 68% within 1 SD, 95% within 2 SD

Not normal:
❌ Skewed histogram
❌ Curved normal probability plot
❌ Mean noticeably different from median
❌ Many outliers
❌ Gaps or multiple peaks

Common Mistakes

Assuming all data is normal
Many distributions are NOT normal!

Confusing z-scores with original values
z-scores are standardized, no units

Using empirical rule for non-normal data
Only valid for normal distributions

Forgetting to standardize before using table
Must convert to z-scores first

Reading wrong side of table
Most tables give area to LEFT

Not checking normality assumption
Methods based on normality can give misleading results if data isn't approximately normal

Quick Reference

Normal Distribution:

  • Parameters: $\mu$ (mean), $\sigma$ (SD)
  • Notation: $N(\mu, \sigma)$
  • Properties: Symmetric, bell-shaped, unimodal

Empirical Rule (68-95-99.7):

  • 68% within $\mu \pm 1\sigma$
  • 95% within $\mu \pm 2\sigma$
  • 99.7% within $\mu \pm 3\sigma$

Z-Score:

  • Formula: $z = \frac{x - \mu}{\sigma}$
  • Interpretation: # of SDs from mean
  • Standard normal: $\mu = 0$, $\sigma = 1$

Calculator Commands:

  • normalcdf(lower, upper, μ, σ) for area/probability
  • invNorm(area, μ, σ) for x-value

Assessing Normality:

  • Histogram: bell-shaped?
  • Normal plot: linear?
  • Mean ≈ Median?

Remember: The normal distribution is powerful but not universal. Always check if the normality assumption is reasonable before using methods that require it!
