Discrete Random Variables
Probability distributions for discrete variables
What is a Random Variable?
Random Variable: Variable whose value is determined by outcome of random process
Notation: Usually capital letters (X, Y, Z)
Discrete Random Variable: Takes on countable set of values (often integers)
Examples:
- X = number of heads in 3 coin flips (X can be 0, 1, 2, or 3)
- Y = number of students absent (Y can be 0, 1, 2, ...)
- Z = sum when rolling two dice (Z can be 2, 3, ..., 12)
Probability Distribution
Probability Distribution: Lists all possible values and their probabilities
Requirements:
- Each probability between 0 and 1: 0 ≤ P(X = x) ≤ 1
- Probabilities sum to 1: ΣP(X = x) = 1
Example: Flip coin 2 times, X = number of heads
| X | P(X = x) |
|---|----------|
| 0 | 0.25 |
| 1 | 0.50 |
| 2 | 0.25 |
Sum: 0.25 + 0.50 + 0.25 = 1 ✓
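The two requirements above can be checked mechanically. The following is a minimal Python sketch, assuming the distribution is stored as a dictionary mapping each value to its probability; the function name `is_valid_distribution` is an illustrative choice, not part of the original notes.

```python
import math

def is_valid_distribution(dist):
    """Return True if every probability is in [0, 1] and they sum to 1."""
    probs = list(dist.values())
    in_range = all(0.0 <= p <= 1.0 for p in probs)
    # Compare with a tolerance because floating-point sums are rarely exact.
    sums_to_one = math.isclose(sum(probs), 1.0, abs_tol=1e-9)
    return in_range and sums_to_one

# Number of heads in two fair coin flips, from the example above.
two_flips = {0: 0.25, 1: 0.50, 2: 0.25}

print(is_valid_distribution(two_flips))         # True
print(is_valid_distribution({0: 0.5, 1: 0.6}))  # False: probabilities sum to 1.1
```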
Mean of Discrete Random Variable
Mean (Expected Value): μ_X or E(X)
Formula: μ_X = E(X) = Σ x · P(X = x)
Interpretation: Long-run average if the process is repeated many times
Example: X = number of heads in 2 flips
E(X) = 0(0.25) + 1(0.50) + 2(0.25) = 1
Interpretation: On average, expect 1 head in 2 flips
Note: Mean doesn't have to be possible value! (E.g., average family has 2.3 children)
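As a quick check of the mean for the two-flip example, here is a short Python sketch. It assumes the distribution is stored as a value-to-probability dictionary, and the helper name `expected_value` is hypothetical.

```python
def expected_value(dist):
    """Weighted average: sum of each value times its probability."""
    return sum(x * p for x, p in dist.items())

# Number of heads in two fair coin flips.
two_flips = {0: 0.25, 1: 0.50, 2: 0.25}

mu = expected_value(two_flips)
print(mu)  # 1.0
```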
Variance and Standard Deviation
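Variance: σ²_X or Var(X)
Formula: σ²_X = Σ (x - μ_X)² · P(X = x)
Alternative (easier calculation): σ²_X = E(X²) - [E(X)]², where E(X²) = Σ x² · P(X = x)
Standard Deviation: σ_X = √(σ²_X)
Example: X = heads in 2 flips (μ_X = 1)
σ²_X = (0 - 1)²(0.25) + (1 - 1)²(0.50) + (2 - 1)²(0.25) = 0.25 + 0 + 0.25 = 0.5
σ_X = √0.5 ≈ 0.707
Alternative calculation:
E(X²) = 0²(0.25) + 1²(0.50) + 2²(0.25) = 1.5
σ²_X = 1.5 - 1² = 0.5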
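Both routes to the variance can be compared in code. This is a sketch for the two-flip example (μ_X = 1), assuming a value-to-probability dictionary; the function names are illustrative only.

```python
import math

def mean(dist):
    return sum(x * p for x, p in dist.items())

def variance_definition(dist):
    """Var(X) as the probability-weighted squared deviation from the mean."""
    mu = mean(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())

def variance_shortcut(dist):
    """Var(X) = E(X^2) - [E(X)]^2."""
    e_x2 = sum(x ** 2 * p for x, p in dist.items())
    return e_x2 - mean(dist) ** 2

two_flips = {0: 0.25, 1: 0.50, 2: 0.25}

var_def = variance_definition(two_flips)
var_short = variance_shortcut(two_flips)
sd = math.sqrt(var_def)

print(var_def, var_short)  # 0.5 0.5
print(round(sd, 3))        # 0.707
```

The two functions should agree up to floating-point rounding; the shortcut is usually less work by hand, while the definition makes the "average squared distance from the mean" reading explicit.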
Linear Transformations
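If Y = a + bX:
- Mean: μ_Y = a + bμ_X
- Variance: σ²_Y = b²σ²_X
- Standard deviation: σ_Y = |b|σ_X
Note: Adding a constant shifts the mean but doesn't change the spread. Multiplying by a constant affects both.
Example: X = quiz score (0-10), μ_X = 7, σ_X = 2. Convert to percentage: Y = 10X
μ_Y = 10(7) = 70, σ_Y = |10|(2) = 20
Example: Temperature conversion F = 32 + 1.8C. If μ_C = 20°C, σ_C = 5°C:
μ_F = 32 + 1.8(20) = 68°F, σ_F = 1.8(5) = 9°F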
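The linear-transformation rules for Y = a + bX can be wrapped in a small helper. This is a sketch only; `transform_mean_sd` is a hypothetical name, and the inputs are the quiz-score and temperature figures from the examples above.

```python
def transform_mean_sd(mu_x, sigma_x, a, b):
    """Mean and standard deviation of Y = a + bX."""
    mu_y = a + b * mu_x
    sigma_y = abs(b) * sigma_x  # the shift a does not affect spread
    return mu_y, sigma_y

# Quiz score to percentage: Y = 10X
quiz = transform_mean_sd(7, 2, a=0, b=10)
print(quiz)  # (70, 20)

# Celsius to Fahrenheit: F = 32 + 1.8C (result is approximately 68 and 9)
temp = transform_mean_sd(20, 5, a=32, b=1.8)
print(temp)
```

Taking `abs(b)` matters when b is negative: the mean flips sign with b, but a standard deviation can never be negative.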
Combining Independent Random Variables
If X and Y are independent:
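Sum: Z = X + Y
μ_Z = μ_X + μ_Y, σ²_Z = σ²_X + σ²_Y
Difference: W = X - Y (variances always add!)
μ_W = μ_X - μ_Y, σ²_W = σ²_X + σ²_Y
Example: X = score on test 1 (μ = 80, σ = 10), Y = score on test 2 (μ = 75, σ = 12), Total = X + Y
μ_Total = 80 + 75 = 155
σ²_Total = 10² + 12² = 100 + 144 = 244
σ_Total = √244 ≈ 15.62 (not 10 + 12 = 22)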
Key: Standard deviations don't add; variances do!
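The rule for combining independent variables can be sketched in Python using the two test scores from the example. The function name `combine_independent` and its `sign` parameter are illustrative assumptions; the variance step is only valid when X and Y are independent.

```python
import math

def combine_independent(mu_x, sigma_x, mu_y, sigma_y, sign=1):
    """Mean, variance and SD of X + Y (sign=1) or X - Y (sign=-1).

    Assumes X and Y are independent, so the variances add in both cases.
    """
    mu = mu_x + sign * mu_y
    var = sigma_x ** 2 + sigma_y ** 2
    return mu, var, math.sqrt(var)

total = combine_independent(80, 10, 75, 12)           # Total = X + Y
diff = combine_independent(80, 10, 75, 12, sign=-1)   # X - Y

print(total[0], total[1], round(total[2], 2))  # 155 244 15.62
print(diff[0], diff[1], round(diff[2], 2))     # 5 244 15.62
```

Note that the sum and the difference have the same spread, and that 15.62 is smaller than the 22 you would get by wrongly adding the standard deviations.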
Expected Value Applications
Fair game: E(winnings) = 0
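Example: Pay $1 to play and roll a die. Net gain is $5 if you roll a 6; otherwise you win $0 and lose your $1.
E(net winnings) = 5(1/6) + (-1)(5/6) = 0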
Fair game!
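Expected profit/loss: E(X) = Σ x · P(X = x), where X is the net gain
Example: Lottery ticket costs $2, prize is $1,000,000, probability of winning is 1/1,000,000
E(net gain) = 1,000,000(1/1,000,000) - 2 = 1 - 2 = -1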
Expect to lose $1 per ticket on average
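The lottery figure can be reproduced with exact fractions. In this sketch the $2 ticket price is an assumption, chosen because it is the price consistent with the stated average loss of $1 for a $1,000,000 prize at odds of 1 in 1,000,000; the helper `expected_net` is likewise illustrative.

```python
from fractions import Fraction

def expected_net(outcomes):
    """Expected net gain from a list of (net_gain, probability) pairs."""
    return sum(gain * prob for gain, prob in outcomes)

ticket_price = 2                   # assumed price, see lead-in
prize = 1_000_000
p_win = Fraction(1, 1_000_000)

lottery = [
    (prize - ticket_price, p_win),  # win: prize minus what you paid
    (-ticket_price, 1 - p_win),     # lose: out the ticket price
]

ev = expected_net(lottery)
print(ev)  # -1
```

A game is fair when this function returns 0, favorable when it is positive, and unfavorable when it is negative.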
Probability Histogram
Visual representation of probability distribution
- X-axis: Values of X
- Y-axis: Probabilities
- Height of bar = P(X = x)
- Bars don't touch (discrete)
Properties:
- Area of bar = probability
- Total area = 1
Common Notation
P(X = 3): Probability X equals 3
P(X ≤ 3): Probability X is at most 3 (cumulative)
P(X < 3): Probability X is less than 3
P(2 ≤ X ≤ 5): Probability X is between 2 and 5 inclusive
For discrete variables: P(X ≤ 3) includes P(X = 3)
Cumulative Distribution Function (CDF)
CDF: P(X ≤ x)
Sum probabilities up to and including x
Example: X has distribution in earlier example
P(X ≤ 1) = P(X = 0) + P(X = 1) = 0.25 + 0.50 = 0.75
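A cumulative table is just a running total of the probabilities in increasing order of x. A minimal sketch, assuming a value-to-probability dictionary and using the two-flip distribution (`cdf_table` is a hypothetical helper name):

```python
from itertools import accumulate

def cdf_table(dist):
    """Map each value x to P(X <= x) by accumulating probabilities in order."""
    xs = sorted(dist)
    cumulative = accumulate(dist[x] for x in xs)
    return dict(zip(xs, cumulative))

two_flips = {0: 0.25, 1: 0.50, 2: 0.25}

cdf = cdf_table(two_flips)
print(cdf)  # {0: 0.25, 1: 0.75, 2: 1.0}
```

The last entry of any valid CDF table should be 1, which doubles as a check that the probabilities sum to 1.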
Common Discrete Distributions
Binomial: Fixed trials, success/failure, constant probability
Geometric: Trials until first success
Poisson: Count of events in interval
(Each has its own topic with specific formulas!)
Common Mistakes
❌ Forgetting probabilities must sum to 1
❌ Confusing E(X) with most likely value
❌ Adding standard deviations instead of variances
❌ Forgetting absolute value for σ_Y when Y = a + bX
❌ Adding variances when X and Y are not independent
Practice Strategy
- List: All possible values
- Find: Probability for each value
- Verify: Probabilities sum to 1
- Calculate: μ and σ using formulas
- Interpret: What do mean and SD tell us?
Quick Reference
Mean: μ_X = Σ x · P(X = x)
Variance: σ²_X = Σ (x - μ_X)² · P(X = x) = E(X²) - [E(X)]²
Linear Transform: Y = a + bX gives μ_Y = a + bμ_X, σ_Y = |b|σ_X
Sum/Difference: μ adds/subtracts, variances always add
Remember: Mean is long-run average. Standard deviation measures variability. For sums/differences of independent variables, variances add!
📚 Practice Problems
Problem 1 (easy)
❓ Question:
Let X be the number of heads when flipping a fair coin 3 times. Create the probability distribution table for X and verify it is a valid probability distribution.
💡 Solution
Step 1: List all possible outcomes
Sample space for 3 flips: HHH, HHT, HTH, HTT, THH, THT, TTH, TTT
Total: 8 outcomes
Step 2: Count heads in each outcome
- HHH → 3 heads
- HHT → 2 heads
- HTH → 2 heads
- HTT → 1 head
- THH → 2 heads
- THT → 1 head
- TTH → 1 head
- TTT → 0 heads
Step 3: Count the outcomes for each value of X
- X = 0: 1 outcome (TTT)
- X = 1: 3 outcomes (HTT, THT, TTH)
- X = 2: 3 outcomes (HHT, HTH, THH)
- X = 3: 1 outcome (HHH)
Step 4: Calculate probabilities
- P(X = 0) = 1/8 = 0.125
- P(X = 1) = 3/8 = 0.375
- P(X = 2) = 3/8 = 0.375
- P(X = 3) = 1/8 = 0.125
Step 5: Create probability distribution table
| X | P(X = x) |
|---|----------|
| 0 | 0.125 |
| 1 | 0.375 |
| 2 | 0.375 |
| 3 | 0.125 |
Step 6: Verify conditions for valid distribution Condition 1: All probabilities between 0 and 1? 0.125, 0.375, 0.375, 0.125 are all in [0,1] ✓
Condition 2: Sum of probabilities = 1? 0.125 + 0.375 + 0.375 + 0.125 = 1.000 ✓
This IS a valid probability distribution!
Answer: Probability distribution:
X: 0, 1, 2, 3
P(X): 1/8, 3/8, 3/8, 1/8
This is valid because all probabilities are between 0 and 1, and they sum to 1.
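The enumeration in this solution can be reproduced by listing the sample space programmatically. The sketch below assumes a fair coin, so that all 8 sequences are equally likely, and uses exact fractions; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All sequences of 3 flips: ('H','H','H'), ('H','H','T'), ...
outcomes = list(product("HT", repeat=3))

# Count how many sequences give each number of heads.
counts = Counter(seq.count("H") for seq in outcomes)

# Equally likely outcomes: probability = favorable count / total count.
dist = {x: Fraction(n, len(outcomes)) for x, n in sorted(counts.items())}

for x, p in dist.items():
    print(x, p)
# 0 1/8
# 1 3/8
# 2 3/8
# 3 1/8

print(sum(dist.values()) == 1)  # True
```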
Problem 2 (easy)
❓ Question:
Given the probability distribution:
X: 1, 2, 3, 4
P(X): 0.1, 0.3, 0.4, 0.2
Find the expected value E(X) and interpret it.
💡 Solution
Step 1: Recall expected value formula
E(X) = μ = Σ[x · P(x)]
(the sum of each value times its probability)
Step 2: Calculate E(X)
E(X) = (1)(0.1) + (2)(0.3) + (3)(0.4) + (4)(0.2)
= 0.1 + 0.6 + 1.2 + 0.8
= 2.7
Step 3: Interpret the expected value
E(X) = 2.7 is the long-run average value
This means:
- If we repeat this random process many times
- The average value of X will approach 2.7
- It's the "center" of the distribution
Step 4: Important notes
- E(X) = 2.7 is NOT a possible value of X
- X can only be 1, 2, 3, or 4
- But the average over many trials is 2.7
- It's like saying "average family has 2.3 children"
Step 5: Verify calculation
Check probabilities sum to 1: 0.1 + 0.3 + 0.4 + 0.2 = 1.0 ✓
Check E(X) is within range: min(X) = 1, max(X) = 4, and 1 ≤ 2.7 ≤ 4 ✓
Step 6: Visual interpretation
The expected value is the "balance point"
P(X)
0.4│        █
0.3│     █  █
0.2│     █  █  █
0.1│  █  █  █  █
   └─────────────
      1  2  3  4
Balance point at 2.7 (closer to 3 due to higher probability there)
Answer: E(X) = 2.7
This means if we repeatedly observe this random variable, the long-run average value will be 2.7. It's the weighted average of all possible values, weighted by their probabilities.
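The "long-run average" reading of E(X) can be illustrated with a simulation. This is a sketch, not a proof: the sample size of 100,000 and the seed are arbitrary assumptions, and the sample mean will only be close to 2.7, not exactly equal to it.

```python
import random

values = [1, 2, 3, 4]
probs = [0.1, 0.3, 0.4, 0.2]

# A seeded generator makes the run repeatable.
rng = random.Random(0)

# Draw many independent observations of X with the given probabilities.
draws = rng.choices(values, weights=probs, k=100_000)

sample_mean = sum(draws) / len(draws)
print(round(sample_mean, 2))  # should land near the theoretical 2.7
```

Rerunning with a smaller `k` shows more scatter around 2.7, which is why the interpretation stresses many repetitions.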
Problem 3 (medium)
❓ Question:
A game costs $5 to play. You roll a fair die: if you roll a 6, you win $20; if you roll a 4 or 5, you win $10; otherwise you win nothing. What is the expected value of your net winnings? Should you play this game?
💡 Solution
Step 1: Define the random variable
Let X = net winnings (winnings minus cost)
Step 2: Identify all outcomes and net winnings
- Roll 6: Win $20 → Net = $20 - $5 = $15
- Roll 4 or 5: Win $10 → Net = $10 - $5 = $5
- Roll 1, 2, or 3: Win $0 → Net = $0 - $5 = -$5
Step 3: Find probabilities
- P(X = 15) = P(roll 6) = 1/6
- P(X = 5) = P(roll 4 or 5) = 2/6 = 1/3
- P(X = -5) = P(roll 1, 2, or 3) = 3/6 = 1/2
Step 4: Verify probability distribution
1/6 + 1/3 + 1/2 = 1/6 + 2/6 + 3/6 = 6/6 = 1 ✓
Step 5: Calculate expected value
E(X) = (15)(1/6) + (5)(1/3) + (-5)(1/2)
= 15/6 + 10/6 - 15/6
= 10/6 = 5/3 ≈ $1.67
Step 6: Interpret the expected value E(X) = $1.67 per game
This means:
- On average, you GAIN $1.67 per game
- In the long run, you expect to profit
- The game is in your favor!
Step 7: Should you play? YES! Expected value is POSITIVE ($1.67)
- You expect to win money on average
- This is a favorable game
However, consider:
- Short-term: You could still lose (50% chance of losing $5)
- Need many plays for average to materialize
- Variance matters too (not just expected value)
Step 8: Calculate long-run expectation
If you play 100 times: Expected total net winnings = 100 × $1.67 ≈ $167
Answer: E(X) = $1.67 per game
You SHOULD play this game because the expected value is positive. On average, you expect to win $1.67 per game, although you could win $15, win $5, or lose $5 on any single play.
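The expected value, along with the spread that the solution mentions as a caveat, can be computed from the net-winnings distribution in Step 3. This sketch uses exact fractions; the standard deviation is an extra calculation not carried out in the solution above.

```python
import math
from fractions import Fraction

# Net winnings and their probabilities, from Step 3.
net = {15: Fraction(1, 6), 5: Fraction(1, 3), -5: Fraction(1, 2)}

ev = sum(x * p for x, p in net.items())
var = sum((x - ev) ** 2 * p for x, p in net.items())
sd = math.sqrt(var)

print(ev)            # 5/3
print(var)           # 500/9
print(round(sd, 2))  # 7.45
```

A standard deviation of about $7.45 against an expected gain of about $1.67 shows why a single play can easily end in a loss even though the game is favorable in the long run.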
Problem 4 (medium)
❓ Question:
Given a probability distribution with E(X) = 4 and Var(X) = 3, find:
a) E(2X + 5)
b) Var(2X + 5)
c) SD(2X + 5)
💡 Solution
Step 1: Recall transformation rules for expected value E(aX + b) = a·E(X) + b
Where:
- a is multiplied through
- b is added at the end
Step 2: Calculate E(2X + 5) E(2X + 5) = 2·E(X) + 5 = 2(4) + 5 = 8 + 5 = 13
Step 3: Recall transformation rules for variance Var(aX + b) = a²·Var(X)
Important:
- Multiply by a² (not just a)
- Adding constant b doesn't affect variance!
- Variance measures spread - shifting doesn't change spread
Step 4: Calculate Var(2X + 5) Var(2X + 5) = 2²·Var(X) = 4·Var(X) = 4(3) = 12
Step 5: Calculate SD(2X + 5) SD(2X + 5) = √Var(2X + 5) = √12 = 2√3 ≈ 3.46
Alternative: SD(aX + b) = |a|·SD(X) SD(2X + 5) = 2·SD(X) = 2·√3 = 2√3 ✓
Step 6: Summary of transformation rules For Y = aX + b:
E(Y) = a·E(X) + b Var(Y) = a²·Var(X) SD(Y) = |a|·SD(X)
Key insights:
- Adding constant: shifts mean, doesn't affect variance
- Multiplying: scales mean by a, scales variance by a²
- Standard deviation scales by |a|
Step 7: Verify with original values E(X) = 4, Var(X) = 3, SD(X) = √3
Transform by 2X + 5:
- Mean shifts from 4 to 13 (doubled then added 5)
- Variance scales from 3 to 12 (multiplied by 2² = 4)
- SD scales from √3 to 2√3 (multiplied by 2)
Answer:
a) E(2X + 5) = 13
b) Var(2X + 5) = 12
c) SD(2X + 5) = 2√3 ≈ 3.46
Key insight: Adding a constant affects the mean but not the variance or SD. Multiplying by a constant affects both, but variance is multiplied by a² while SD is multiplied by |a|.
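The transformation rules can also be checked by brute force: transform every value, then recompute the mean and variance from scratch. Problem 4 gives only E(X) and Var(X), so this sketch borrows the distribution from Problem 2 as an arbitrary stand-in; any valid distribution should give the same agreement.

```python
from fractions import Fraction

def mean(dist):
    return sum(x * p for x, p in dist.items())

def variance(dist):
    mu = mean(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())

# Stand-in distribution (from Problem 2), stored as exact fractions.
dist_x = {
    1: Fraction(1, 10),
    2: Fraction(3, 10),
    3: Fraction(4, 10),
    4: Fraction(2, 10),
}

a, b = 2, 5
# Distribution of Y = aX + b: same probabilities, transformed values.
dist_y = {a * x + b: p for x, p in dist_x.items()}

print(mean(dist_y) == a * mean(dist_x) + b)           # True
print(variance(dist_y) == a ** 2 * variance(dist_x))  # True
```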
Problem 5 (hard)
❓ Question:
You and a friend independently roll a fair die. Let X be the result of your roll and Y be the result of your friend's roll. Find E(X + Y), Var(X + Y), and E(XY). Are the answers different than for a single die?
💡 Solution
Step 1: Find E(X) and E(Y) for single die For fair die: X can be 1, 2, 3, 4, 5, 6 Each with probability 1/6
E(X) = (1 + 2 + 3 + 4 + 5 + 6)/6 = 21/6 = 3.5
Since Y is also a die roll: E(Y) = 3.5
Step 2: Find E(X + Y)
For any two random variables (independence is not required for this rule): E(X + Y) = E(X) + E(Y)
E(X + Y) = 3.5 + 3.5 = 7
This is the expected sum of two dice!
Step 3: Find Var(X)
First calculate E(X²):
E(X²) = (1² + 2² + 3² + 4² + 5² + 6²)/6 = (1 + 4 + 9 + 16 + 25 + 36)/6 = 91/6
Var(X) = E(X²) - [E(X)]²
= 91/6 - (3.5)²
= 91/6 - 49/4
= 182/12 - 147/12
= 35/12 ≈ 2.917
Step 4: Find Var(X + Y) For INDEPENDENT random variables: Var(X + Y) = Var(X) + Var(Y)
Since X and Y are independent and identical: Var(Y) = 35/12
Var(X + Y) = 35/12 + 35/12 = 70/12 = 35/6 ≈ 5.833
Note: If NOT independent, would need Cov(X,Y) term
Step 5: Find E(XY) For INDEPENDENT random variables: E(XY) = E(X) · E(Y)
E(XY) = 3.5 × 3.5 = 12.25
Step 6: Compare to single die Single die:
- E(X) = 3.5
- Var(X) = 35/12 ≈ 2.917
Sum of two dice:
- E(X + Y) = 7 (double the mean!)
- Var(X + Y) = 35/6 ≈ 5.833 (double the variance!)
- SD(X + Y) = √(35/6) ≈ 2.415 (NOT double!)
Product of two dice:
- E(XY) = 12.25
Step 7: Key formulas used
✓ E(X + Y) = E(X) + E(Y) (always true)
✓ Var(X + Y) = Var(X) + Var(Y) (holds when X and Y are independent)
✓ E(XY) = E(X) · E(Y) (holds when X and Y are independent)
Answer:
E(X + Y) = 7 (expected sum is 7)
Var(X + Y) = 35/6 ≈ 5.833
E(XY) = 12.25 (expected product is 12.25)
Yes, these are different! The sum has double the expected value and double the variance of a single die. The product's expected value is the product of the individual expected values (because independent).
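These results can be confirmed without using any of the shortcut rules, by enumerating all 36 equally likely (x, y) pairs. The sketch below assumes two fair, independent dice and uses exact fractions; the helper `expect` is an illustrative name.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)
pairs = list(product(faces, repeat=2))  # all 36 (x, y) outcomes
p = Fraction(1, len(pairs))             # each pair has probability 1/36

def expect(f):
    """Expected value of f(X, Y) over the joint distribution."""
    return sum(f(x, y) * p for x, y in pairs)

e_sum = expect(lambda x, y: x + y)
var_sum = expect(lambda x, y: (x + y) ** 2) - e_sum ** 2
e_prod = expect(lambda x, y: x * y)

print(e_sum)    # 7
print(var_sum)  # 35/6
print(e_prod)   # 49/4
```

Here 49/4 equals 12.25, matching E(X) · E(Y) = 3.5 × 3.5, which is what independence predicts.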