Continuous Random Variables
Probability density functions
Discrete vs Continuous
Discrete Random Variable: Countable values (0, 1, 2, ...)
Continuous Random Variable: Uncountable values in interval (all real numbers in range)
Examples of Continuous:
- Height, weight, temperature
- Time (seconds, measured precisely)
- Distance
- Any measurement on continuous scale
Key difference: For continuous variables, P(X = exact value) = 0!
Why P(X = c) = 0?
Infinite possible values in any interval
Probability spread across infinite points → each point has probability 0
Example: Height X uniformly distributed from 60 to 72 inches
- P(X = exactly 65.0000000... inches) = 0
- But P(64 < X < 66) > 0 (interval has positive probability)
Therefore: For continuous variables, focus on intervals, not exact values
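A minimal sketch of this idea, using the height example above: as an interval around 65 inches shrinks, its probability shrinks toward 0. The helper name `uniform_interval_prob` is illustrative and not part of any library.

```python
def uniform_interval_prob(lo, hi, a=60.0, b=72.0):
    """P(lo < X < hi) for X uniform on [a, b]: overlap length / total length."""
    overlap = max(0.0, min(hi, b) - max(lo, a))
    return overlap / (b - a)

# Shrink an interval centered at 65 inches and watch its probability vanish.
for half_width in (1.0, 0.1, 0.001, 0.0):
    p = uniform_interval_prob(65 - half_width, 65 + half_width)
    print(f"half-width {half_width}: probability {p:.6f}")
```

The last iteration has half-width 0, which is the single point X = 65, and the overlap length (and so the probability) is exactly 0.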
Probability Density Function (PDF)
For continuous variable, use PDF: f(x)
Properties:
- f(x) ≥ 0 for all x
- Total area under curve = 1
- P(a < X < b) = area under f(x) from a to b
Key: f(x) is NOT a probability! (Can be > 1)
Area under curve gives probability
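To see why f(x) can exceed 1 without breaking anything, here is a small sketch with a uniform distribution on [0, 0.5]. That interval is an assumption chosen for illustration (it is not one of the examples in these notes), and `area_under` is a simple midpoint-rule helper, not a library function.

```python
def uniform_pdf(x, a, b):
    """Density of a uniform distribution on [a, b]."""
    return 1.0 / (b - a) if a <= x <= b else 0.0

def area_under(f, lo, hi, n=100_000):
    """Midpoint-rule approximation of the area under f between lo and hi."""
    width = (hi - lo) / n
    return sum(f(lo + (i + 0.5) * width) for i in range(n)) * width

# Uniform on [0, 0.5]: the density is 2 everywhere on the interval...
height = uniform_pdf(0.25, 0.0, 0.5)
# ...but the total area under the curve is still 1.
total_area = area_under(lambda x: uniform_pdf(x, 0.0, 0.5), 0.0, 0.5)
print(height, round(total_area, 6))
```

The density value 2 is not a probability; only the area (width 0.5 times height 2) is.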
Finding Probabilities
P(a < X < b) = area under PDF from a to b
Methods:
- Geometry (for simple shapes: rectangles, triangles)
- Integration (for complex functions)
- Calculator/software (normalcdf for normal distribution)
Example: Uniform distribution on [0, 10]
f(x) = 1/10 for 0 ≤ x ≤ 10
P(3 < X < 7) = (7-3)(1/10) = 4/10 = 0.4
(Rectangle: width 4, height 1/10)
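The geometry and integration methods can be compared in a short sketch. The uniform density is the one from the example above; the triangular density f(x) = x/50 on [0, 10] is a hypothetical second example added only to show a case where the shape is not a rectangle.

```python
def integrate(f, lo, hi, n=10_000):
    """Midpoint-rule approximation of the integral of f from lo to hi."""
    width = (hi - lo) / n
    return sum(f(lo + (i + 0.5) * width) for i in range(n)) * width

def uniform_density(x):
    """Uniform density on [0, 10] from the example above."""
    return 0.1 if 0 <= x <= 10 else 0.0

def triangle_density(x):
    """Hypothetical triangular density on [0, 10] (not from the notes)."""
    return x / 50 if 0 <= x <= 10 else 0.0

p_geometry = (7 - 3) * (1 / 10)                  # rectangle: width * height
p_uniform = integrate(uniform_density, 3, 7)     # same area, by integration
p_triangle = integrate(triangle_density, 3, 7)   # exact value: (49 - 9) / 100
print(p_geometry, round(p_uniform, 6), round(p_triangle, 6))
```

For the triangular density the exact area from 3 to 7 is (7² - 3²)/100 = 0.4, which happens to match the uniform answer.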
Continuous vs Discrete Probabilities
Discrete: P(X = 5) can be positive
Continuous:
- P(X = 5) = 0
- P(X < 5) = P(X ≤ 5) (no difference!)
- P(3 < X < 7) = P(3 ≤ X ≤ 7) = P(3 < X ≤ 7) = P(3 ≤ X < 7)
All intervals with same endpoints have same probability for continuous variables
Mean of Continuous Random Variable
Mean (Expected Value): μ or E(X)
For uniform distribution on [a, b]: μ = (a + b)/2
Example: X uniform on [0, 10]: μ = (0 + 10)/2 = 5
General: Mean is "balance point" of distribution
Variance and Standard Deviation
For uniform distribution on [a, b]: σ² = (b - a)²/12, so σ = (b - a)/√12
Example: X uniform on [0, 10]: σ² = (10 - 0)²/12 = 100/12 ≈ 8.33, so σ = 10/√12 ≈ 2.89
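As a rough numerical check of the uniform formulas on [0, 10], the sketch below uses the standard integral definitions μ = ∫ x·f(x) dx and σ² = ∫ (x - μ)²·f(x) dx, approximated with a midpoint rule. The helper `integrate` is illustrative, not a library function.

```python
import math

def integrate(f, lo, hi, n=100_000):
    """Midpoint-rule approximation of the integral of f from lo to hi."""
    width = (hi - lo) / n
    return sum(f(lo + (i + 0.5) * width) for i in range(n)) * width

a, b = 0.0, 10.0
density = 1.0 / (b - a)  # constant height of the uniform PDF

mean = integrate(lambda x: x * density, a, b)
variance = integrate(lambda x: (x - mean) ** 2 * density, a, b)
sd = math.sqrt(variance)

# Numerical results next to the closed-form formulas.
print(round(mean, 4), round(variance, 4), round(sd, 4))
print((a + b) / 2, round((b - a) ** 2 / 12, 4), round((b - a) / math.sqrt(12), 4))
```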
Uniform Distribution
Simplest continuous distribution
PDF: f(x) = 1/(b-a) for a ≤ x ≤ b
Shape: Flat rectangle (constant height)
Properties:
- Mean: (a+b)/2
- All intervals of same length have same probability
- Symmetric around mean
Example: Bus arrives uniformly between 8:00 and 8:20 (20 minutes)
X = arrival time (minutes after 8:00), uniform on [0, 20]
- Mean: 10 minutes after 8:00
- P(arrive within first 5 min) = 5/20 = 0.25
- P(arrive between 8:05 and 8:15) = 10/20 = 0.5
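The bus example can also be checked by simulation. This is a sketch under the stated assumption that arrivals are uniform on [0, 20] minutes after 8:00; the seed and sample size are arbitrary choices, and the simulated values will only be close to 10, 0.25, and 0.5, not exactly equal.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable
arrivals = [random.uniform(0, 20) for _ in range(200_000)]  # minutes after 8:00

mean_arrival = sum(arrivals) / len(arrivals)
p_first_5 = sum(x < 5 for x in arrivals) / len(arrivals)
p_5_to_15 = sum(5 < x < 15 for x in arrivals) / len(arrivals)

print(round(mean_arrival, 2), round(p_first_5, 3), round(p_5_to_15, 3))
```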
Cumulative Distribution Function (CDF)
CDF: F(x) = P(X ≤ x)
For continuous: Integral of PDF from -∞ to x
Properties:
- Non-decreasing function (never decreases, but can stay flat)
- lim F(x) as x → -∞ = 0
- lim F(x) as x → ∞ = 1
Use: P(a < X < b) = F(b) - F(a)
Example: Uniform on [0, 10], where F(x) = x/10 for 0 ≤ x ≤ 10
P(3 < X < 7) = F(7) - F(3) = 0.7 - 0.3 = 0.4
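A minimal sketch of the uniform CDF and the F(b) - F(a) rule. The function name `uniform_cdf` is illustrative; the formula (x - a)/(b - a) follows from the rectangle area to the left of x.

```python
def uniform_cdf(x, a=0.0, b=10.0):
    """F(x) = P(X <= x) for X uniform on [a, b]."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)

# P(3 < X < 7) as a difference of two CDF values.
p = uniform_cdf(7) - uniform_cdf(3)
print(round(p, 10))
```

Below a the CDF is 0 and above b it is 1, matching the limit properties listed above.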
Normal Distribution (Preview)
Most important continuous distribution (covered in depth in another topic)
Bell-shaped curve
Characterized by: Mean (μ) and standard deviation (σ)
Notation: X ~ N(μ, σ)
Properties:
- Symmetric around mean
- 68-95-99.7 rule
- Use normalcdf on calculator
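Outside a calculator, the 68-95-99.7 rule can be checked with Python's standard-library `statistics.NormalDist` (available in Python 3.8+). This is only a sketch; `within_k_sd` is a helper name chosen here.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def within_k_sd(k):
    """P(-k < Z < k) for a standard normal Z."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD: {within_k_sd(k):.4f}")
```

The printed values are close to 0.68, 0.95, and 0.997, which is where the rule's name comes from.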
Percentiles and Quantiles
pth percentile: Value x where P(X ≤ x) = p/100 (for example, the 90th percentile satisfies P(X ≤ x) = 0.90)
Quartiles:
- Q1 (25th percentile): P(X ≤ Q1) = 0.25
- Q2 (50th percentile, median): P(X ≤ Q2) = 0.5
- Q3 (75th percentile): P(X ≤ Q3) = 0.75
Example: Uniform on [0, 10]
Median = 5 (P(X ≤ 5) = 0.5)
Q1 = 2.5 (P(X ≤ 2.5) = 0.25)
Q3 = 7.5 (P(X ≤ 7.5) = 0.75)
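For a uniform distribution, finding a percentile means solving (x - a)/(b - a) = p for x. A small sketch, where `uniform_quantile` is an illustrative name and p is given as a proportion between 0 and 1:

```python
def uniform_quantile(p, a=0.0, b=10.0):
    """Value x with P(X <= x) = p for X uniform on [a, b], 0 <= p <= 1."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    return a + p * (b - a)

q1, median, q3 = (uniform_quantile(p) for p in (0.25, 0.5, 0.75))
print(q1, median, q3)  # 2.5 5.0 7.5
```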
Linear Transformations
Same rules as discrete:
If Y = a + bX:
- μ_Y = a + bμ_X
- σ_Y = |b|σ_X (adding a shifts the mean but does not change the spread)
Example: Temperature
- C uniform on [0, 40]
- F = 32 + 1.8C
- μ_C = 20, σ_C = 40/√12
- μ_F = 32 + 1.8(20) = 68
- σ_F = 1.8(40/√12) = 72/√12
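The temperature example can be reproduced in a few lines. This sketch assumes the usual rules for Y = a + bX (mean a + b·μ_X, standard deviation |b|·σ_X); `transform_mean_sd` is an illustrative helper name.

```python
import math

def transform_mean_sd(mu_x, sigma_x, a, b):
    """Mean and SD of Y = a + b*X, given the mean and SD of X."""
    return a + b * mu_x, abs(b) * sigma_x

# Celsius temperature uniform on [0, 40].
mu_c, sigma_c = 20.0, 40.0 / math.sqrt(12)
mu_f, sigma_f = transform_mean_sd(mu_c, sigma_c, a=32.0, b=1.8)
print(round(mu_f, 2), round(sigma_f, 2))
```

The absolute value matters when b is negative: the spread is scaled by the size of b, never made negative.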
Combining Independent Variables
Same rules as discrete:
If X and Y independent:
- μ_{X+Y} = μ_X + μ_Y
- σ²_{X+Y} = σ²_X + σ²_Y
- μ_{X-Y} = μ_X - μ_Y
- σ²_{X-Y} = σ²_X + σ²_Y (variances add!)
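A simulation sketch of why variances add for both sums and differences. The two uniform ranges, the seed, and the sample size are arbitrary assumptions for illustration, and the simulated variances will be close to, not exactly equal to, the theoretical value.

```python
import random

def variance(values):
    """Population variance: average squared distance from the mean."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)

random.seed(1)  # fixed seed so the run is repeatable
n = 100_000
xs = [random.uniform(0, 10) for _ in range(n)]  # variance 10**2 / 12
ys = [random.uniform(0, 6) for _ in range(n)]   # variance 6**2 / 12, drawn independently

var_sum = variance([x + y for x, y in zip(xs, ys)])
var_diff = variance([x - y for x, y in zip(xs, ys)])
expected = 10 ** 2 / 12 + 6 ** 2 / 12

print(round(var_sum, 2), round(var_diff, 2), round(expected, 2))
```

Both simulated variances land near the same number, because subtracting Y adds its variability just as adding Y does.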
Common Continuous Distributions
Uniform: Constant PDF on interval
Normal (Gaussian): Bell curve
Exponential: Models time until event (memoryless)
t-distribution: Similar to normal, heavier tails (used in inference)
Chi-square: Used in hypothesis testing
Common Mistakes
❌ Thinking P(X = c) can be positive for continuous X (it is always 0)
❌ Interpreting f(x) as probability (it's density, not probability)
❌ Confusing P(X < c) with P(X ≤ c) (they're equal for continuous!)
❌ Adding standard deviations instead of variances
❌ Using discrete formulas for continuous variables
Practice Strategy
- Identify: Continuous or discrete?
- Determine distribution: Uniform? Normal? Other?
- Find parameters: Mean, SD, or a and b for uniform
- Calculate probability: Use area/geometry or calculator
- Check: Does answer make sense (between 0 and 1)?
Quick Reference
Continuous: Uncountable values, P(X = c) = 0
PDF: f(x) gives density (not probability)
Area under PDF gives probability
Uniform [a,b]:
- Mean: (a+b)/2
- SD: (b-a)/√12
- P in interval: length/total length
Key: P(a < X < b) = P(a ≤ X ≤ b) for continuous
Remember: For continuous variables, focus on intervals and areas under curves, not individual point probabilities!
📚 Practice Problems
Problem 1 (easy)
❓ Question:
What is the key difference between discrete and continuous random variables? Give an example of each.
💡 Solution
Step 1: Define discrete random variable
- Takes on COUNTABLE values (finite or countably infinite)
- Can list all possible values
- Probability can be assigned to individual values
Examples:
- Number of heads in 10 coin flips: {0, 1, 2, ..., 10}
- Number of students in a class: {0, 1, 2, 3, ...}
- Number of cars in parking lot: {0, 1, 2, 3, ...}
Step 2: Define continuous random variable
- Takes on UNCOUNTABLE values in an interval
- Infinite possible values in any range
- CANNOT assign positive probability to individual values
- Probability assigned to RANGES/INTERVALS
Examples:
- Height of a person: any value in (0, 8) feet
- Time until light bulb fails: any value in (0, ∞) hours
- Temperature: any value in (-∞, ∞) degrees
Step 3: Key difference #1 - Possible values
Discrete: Countable, can list them
Continuous: Uncountable, cannot list all
Step 4: Key difference #2 - Probabilities
Discrete: P(X = x) can be nonzero
- P(X = 3) might equal 0.2
Continuous: P(X = x) = 0 for any specific value!
- P(X = 3.0) = 0
- P(X = exactly 3.000...) = 0
Why? A single point has zero width, so the area under the density curve above it is 0
Step 5: How to find probabilities for continuous?
Use INTERVALS, not individual values:
- P(a < X < b) = area under probability density curve
- P(2.5 < X < 3.5) might equal 0.2
Step 6: Probability density function (PDF)
Discrete: Probability mass function (PMF)
- Bar graph
- Heights represent probabilities
- Sum of all bars = 1
Continuous: Probability density function (PDF)
- Smooth curve
- Area under curve represents probability
- Total area = 1
Step 7: Example comparison
Discrete: X = number shown on die roll
P(X = 3) = 1/6
Continuous: X = time (in seconds) until die stops rolling
X could be 1.23456... or 2.71828... seconds
P(X = exactly 2.0) = 0
P(2.0 < X < 2.5) = some positive probability
Answer:
DISCRETE: Countable values, can assign probability to individual values. Example: number of heads in coin flips.
CONTINUOUS: Uncountable values in an interval, P(X = specific value) = 0, must use intervals. Example: height, time, temperature.
Key: For continuous variables, probabilities are only meaningful for RANGES, not exact values.
Problem 2 (easy)
❓ Question:
For a continuous random variable, explain why P(X = 5) = 0, but we can still have a nonzero probability for P(4 < X < 6).
💡 Solution
Step 1: Understand the continuous probability model
Continuous variable: Infinitely many possible values
Between any two numbers, there are infinitely more numbers!
Between 4 and 6: 4.1, 4.01, 4.001, 4.0001, ...
Also 4.5, 4.55, 4.555, 4.5555, ...
Literally uncountably infinite values
Step 2: Why P(X = 5) = 0
Probability is "spread out" over infinitely many values
Each individual value gets infinitesimal probability
Think of probability like mass:
- Total mass = 1 (total probability)
- Spread over infinite points
- Each point gets mass 0
Mathematically: P(X = 5) = 0
Step 3: But this doesn't mean impossible!
P(X = 5) = 0 doesn't mean "can't happen"
It means "so unlikely it has probability 0"
This is different from the discrete case, where each possible value x has P(X = x) > 0
Step 4: Why P(4 < X < 6) can be nonzero
This is an INTERVAL with infinitely many points
Its probability is the area under the PDF over the interval, not a sum of point probabilities (adding up zeros would give 0)
Geometric interpretation:
- Probability = area under PDF curve
- P(X = 5) = area of vertical line at x = 5 = 0 (line has no width)
- P(4 < X < 6) = area under curve from 4 to 6 > 0 (region has width)
Step 5: Visual example with uniform distribution
Suppose X is uniformly distributed on [0, 10]
PDF: f(x) = 1/10 for 0 ≤ x ≤ 10 (constant height = 0.1, total area = 0.1 × 10 = 1)
P(X = 5) = area of line at x = 5 = (no width) × 0.1 = 0
P(4 < X < 6) = area of rectangle from 4 to 6 = (width 2) × (height 0.1) = 0.2
Step 6: Important consequences
For continuous random variables:
P(X < 5) = P(X ≤ 5) [including/excluding boundary doesn't matter]
P(X = 5) = 0, so adding it doesn't change anything
P(4 < X < 6) = P(4 ≤ X < 6) = P(4 < X ≤ 6) = P(4 ≤ X ≤ 6)
All the same!
This is NOT true for discrete variables.
Step 7: Analogy
Imagine throwing a dart at a number line from 0 to 10:
- P(hit exactly 5.000000...) = 0 (would need perfect precision)
- P(hit between 4 and 6) > 0 (reasonable target range)
Answer: P(X = 5) = 0 because probability is spread over infinitely many values, each getting infinitesimal (zero) probability. It's like asking for the area of a single point - a point has no width, so area = 0.
P(4 < X < 6) > 0 because it's an INTERVAL containing infinitely many points. The probability is the area under the probability density curve over this interval, which has width and therefore positive area.
For continuous variables: probabilities are only meaningful for INTERVALS, not individual values.
Problem 3 (medium)
❓ Question:
X is uniformly distributed on [2, 10]. Find P(X < 5), P(X > 7), and P(4 ≤ X ≤ 6). Also find the mean and standard deviation.
💡 Solution
Step 1: Set up uniform distribution
X ~ Uniform(a = 2, b = 10)
Range: [2, 10]
Width: 10 - 2 = 8
Step 2: Understand uniform PDF
For uniform distribution on [a,b]:
f(x) = 1/(b-a) for a ≤ x ≤ b
f(x) = 0 otherwise
For our distribution: f(x) = 1/8 for 2 ≤ x ≤ 10
Height = 1/8 = 0.125
Total area = (width 8)(height 1/8) = 1 ✓
Step 3: Find P(X < 5)
Probability = area under curve from 2 to 5
P(X < 5) = (width)(height) = (5 - 2)(1/8) = 3/8 = 0.375
Step 4: Find P(X > 7)
Area from 7 to 10
P(X > 7) = (10 - 7)(1/8) = 3/8 = 0.375
Step 5: Find P(4 ≤ X ≤ 6)
Area from 4 to 6
P(4 ≤ X ≤ 6) = (6 - 4)(1/8) = 2/8 = 1/4 = 0.25
Step 6: Calculate mean
For uniform distribution: μ = (a + b)/2
μ = (2 + 10)/2 = 12/2 = 6
Middle of the interval!
Step 7: Calculate variance and standard deviation
For uniform distribution: σ² = (b - a)²/12
σ² = (10 - 2)²/12 = 64/12 = 16/3 ≈ 5.33
σ = √(16/3) = 4/√3 = (4√3)/3 ≈ 2.31
Step 8: Verify probabilities sum correctly
Should be able to partition interval:
- P(X < 5) = 3/8
- P(5 ≤ X ≤ 7) = (7-5)/8 = 2/8
- P(X > 7) = 3/8
Total: 3/8 + 2/8 + 3/8 = 8/8 = 1 ✓
Answer:
- P(X < 5) = 3/8 = 0.375
- P(X > 7) = 3/8 = 0.375
- P(4 ≤ X ≤ 6) = 1/4 = 0.25
- Mean: μ = 6
- Standard Deviation: σ = (4√3)/3 ≈ 2.31
For uniform distribution, probabilities are proportional to interval lengths, and the mean is the midpoint of the range.
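The answers to Problem 3 can be double-checked with a short script. This is a sketch, with `prob` as an illustrative helper that clips the requested interval to [2, 10] before dividing by the total width.

```python
import math

a, b = 2.0, 10.0

def prob(lo, hi):
    """P(lo < X < hi) for X uniform on [a, b], clipping to the support."""
    return max(0.0, min(hi, b) - max(lo, a)) / (b - a)

p_below_5 = prob(-math.inf, 5)
p_above_7 = prob(7, math.inf)
p_4_to_6 = prob(4, 6)
mean = (a + b) / 2
sd = (b - a) / math.sqrt(12)

print(p_below_5, p_above_7, p_4_to_6, mean, round(sd, 2))  # 0.375 0.375 0.25 6.0 2.31
```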
Problem 4 (medium)
❓ Question:
The lifetime (in years) of a light bulb follows an exponential distribution with mean 5 years. What is the probability a bulb lasts more than 5 years? More than 10 years?
💡 Solution
Step 1: Set up exponential distribution
For exponential distribution with mean μ: λ = 1/μ (rate parameter)
Given: μ = 5 years
Therefore: λ = 1/5 = 0.2
Step 2: Recall exponential CDF
For exponential with rate λ: P(X ≤ x) = 1 - e^(-λx) for x ≥ 0
Therefore: P(X > x) = e^(-λx)
Step 3: Find P(X > 5)
P(X > 5) = e^(-λ·5) = e^(-0.2·5) = e^(-1) ≈ 0.368
Step 4: Interpret P(X > 5)
About 36.8% of bulbs last more than 5 years
Even though mean is 5 years!
This seems counterintuitive, but exponential is right-skewed:
- Many bulbs fail early
- Few bulbs last much longer
- Mean is pulled up by the long-lasting outliers
Step 5: Find P(X > 10)
P(X > 10) = e^(-λ·10) = e^(-0.2·10) = e^(-2) ≈ 0.135
About 13.5% of bulbs last more than 10 years
Step 6: Interesting property
Notice: P(X > 10) = e^(-2) = (e^(-1))² = [P(X > 5)]²
This is the MEMORYLESS property!
P(X > 10 | X > 5) = P(X > 10)/P(X > 5) = P(X > 5)
Given a bulb has lasted 5 years, probability it lasts another 5 years is the same as a new bulb lasting 5 years!
Step 7: Calculate median for comparison
Median: value m where P(X ≤ m) = 0.5
1 - e^(-λm) = 0.5
e^(-λm) = 0.5
-λm = ln(0.5)
m = -ln(0.5)/λ = ln(2)/0.2 ≈ 0.693/0.2 ≈ 3.47 years
Median (3.47) < Mean (5) confirms right-skewed distribution
Step 8: Probability within one SD
For exponential: σ = μ = 5 (SD equals mean!)
P(μ - σ < X < μ + σ) = P(0 < X < 10) [can't be negative]
= 1 - e^(-2) ≈ 0.865
About 86.5% within one SD (different from normal's 68%!)
Answer:
- P(X > 5) = e^(-1) ≈ 0.368 or 36.8%
- P(X > 10) = e^(-2) ≈ 0.135 or 13.5%
Counterintuitively, fewer than half of the bulbs last longer than the mean lifetime (5 years) because the exponential distribution is heavily right-skewed.
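The exponential calculations in Problem 4 can be reproduced with `math.exp`. This is a sketch using the survival formula P(X > x) = e^(-λx) from the solution; the function name `survival` is illustrative.

```python
import math

mean_lifetime = 5.0          # years
rate = 1.0 / mean_lifetime   # lambda = 0.2

def survival(x):
    """P(X > x) for an exponential lifetime with the rate above."""
    return math.exp(-rate * x)

p_over_5 = survival(5)
p_over_10 = survival(10)
median = math.log(2) / rate
# Memoryless check: P(X > 10 | X > 5) should equal P(X > 5).
conditional = p_over_10 / p_over_5

print(round(p_over_5, 3), round(p_over_10, 3), round(median, 2), round(conditional, 3))
```

The conditional probability matches P(X > 5), which is the memoryless property in numbers.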
Problem 5 (hard)
❓ Question:
Height is normally distributed with μ = 68 inches and σ = 3 inches. Find P(65 < X < 71), P(X > 74), and the 90th percentile.
💡 Solution
Step 1: Set up normal distribution
X ~ N(μ = 68, σ = 3)
Step 2: Find P(65 < X < 71)
Convert to z-scores:
z₁ = (65 - 68)/3 = -3/3 = -1
z₂ = (71 - 68)/3 = 3/3 = 1
P(65 < X < 71) = P(-1 < Z < 1)
From empirical rule (68-95-99.7): About 68% of data within 1 SD of mean
P(-1 < Z < 1) ≈ 0.68
More precisely (from table): 0.6827
Step 3: Find P(X > 74)
Convert to z-score: z = (74 - 68)/3 = 6/3 = 2
P(X > 74) = P(Z > 2)
From empirical rule:
- About 95% within 2 SD
- So about 5% outside 2 SD
- Half of that (2.5%) is above
P(Z > 2) ≈ 0.025
More precisely (from table): 0.0228
Step 4: Find 90th percentile
Want value x where P(X < x) = 0.90
First, find z-score for 90th percentile: P(Z < z) = 0.90
From table: z ≈ 1.28
Step 5: Convert z-score back to x
x = μ + zσ
x = 68 + 1.28(3)
x = 68 + 3.84
x = 71.84 inches
Step 6: Verify 90th percentile
90% of people shorter than 71.84 inches
10% of people taller than 71.84 inches
This is about 71.84 - 68 = 3.84 inches above mean = 3.84/3 = 1.28 standard deviations above mean ✓
Step 7: Summary using empirical rule
- Within 1 SD (65-71): ~68%
- Within 2 SD (62-74): ~95%
- Within 3 SD (59-77): ~99.7%
- Above mean+2SD (>74): ~2.5%
- 90th percentile: mean + 1.28 SD ≈ 71.84
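Step 8: Visualize
Bell curve centered at μ = 68, with tick marks at 62 (-2σ), 65 (-1σ), 68 (μ), 71 (+1σ), 74 (+2σ), and 77 (+3σ)
Approximate areas: 2.5% below 62, 13.5% from 62 to 65, 68% from 65 to 71, 13.5% from 71 to 74, 2.5% above 74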
Answer:
- P(65 < X < 71) ≈ 0.68 or 68%
- P(X > 74) ≈ 0.025 or 2.5%
- 90th percentile ≈ 71.84 inches
The first interval captures about 68% because it's within one standard deviation of the mean. Heights above 74 inches (2 SD above mean) are rare at about 2.5%.
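For more precise values than the empirical rule gives, the same three quantities can be computed in software. This sketch uses Python's standard-library `statistics.NormalDist` (Python 3.8+) in place of a calculator's normalcdf; the variable names are illustrative.

```python
from statistics import NormalDist

height = NormalDist(mu=68, sigma=3)  # inches

p_65_to_71 = height.cdf(71) - height.cdf(65)
p_above_74 = 1 - height.cdf(74)
percentile_90 = height.inv_cdf(0.90)

print(round(p_65_to_71, 4), round(p_above_74, 4), round(percentile_90, 2))
```

The results agree with the table values quoted in the solution (about 0.6827, 0.0228, and 71.84 inches).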