Discrete Random Variables

Probability distributions for discrete variables

What is a Random Variable?

Random Variable: Variable whose value is determined by outcome of random process

Notation: Usually capital letters (X, Y, Z)

Discrete Random Variable: Takes on countable set of values (often integers)

Examples:

  • X = number of heads in 3 coin flips (X can be 0, 1, 2, or 3)
  • Y = number of students absent (Y can be 0, 1, 2, ...)
  • Z = sum when rolling two dice (Z can be 2, 3, ..., 12)

Probability Distribution

Probability Distribution: Lists all possible values and their probabilities

Requirements:

  1. Each probability between 0 and 1: 0 ≤ P(X = x) ≤ 1
  2. Probabilities sum to 1: ΣP(X = x) = 1

Example: Flip coin 2 times, X = number of heads

| X | P(X = x) |
|---|----------|
| 0 | 0.25 |
| 1 | 0.50 |
| 2 | 0.25 |

Sum: 0.25 + 0.50 + 0.25 = 1 ✓
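The two requirements can be checked mechanically. The sketch below assumes a distribution is stored as a Python dict mapping each value to its probability; the function name `is_valid_distribution` and the tolerance are illustrative choices, not part of the notes.

```python
import math

def is_valid_distribution(dist, tol=1e-9):
    """Return True if every probability is in [0, 1] and they sum to 1."""
    probs = list(dist.values())
    in_range = all(0 <= p <= 1 for p in probs)
    # Compare with a tolerance because floats rarely sum to exactly 1.
    sums_to_one = math.isclose(sum(probs), 1.0, abs_tol=tol)
    return in_range and sums_to_one

# X = number of heads in 2 coin flips (the table above)
two_flips = {0: 0.25, 1: 0.50, 2: 0.25}

print(is_valid_distribution(two_flips))         # True
print(is_valid_distribution({0: 0.5, 1: 0.6}))  # False: sums to 1.1
```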

Mean of Discrete Random Variable

Mean (Expected Value): $\mu_X$ or $E(X)$

Formula:

$$\mu_X = E(X) = \sum x \cdot P(X = x)$$

Interpretation: Long-run average if process repeated many times

Example: X = number of heads in 2 flips

$$\mu_X = 0(0.25) + 1(0.50) + 2(0.25) = 0 + 0.50 + 0.50 = 1$$

Interpretation: On average, expect 1 head in 2 flips

Note: Mean doesn't have to be possible value! (E.g., average family has 2.3 children)
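To make the "long-run average" reading concrete, here is a small sketch that computes the mean from the formula and compares it with a simulated average. The helper names, the fixed seed, and the number of trials are assumptions chosen for illustration.

```python
import random

def expected_value(dist):
    """Mean of a discrete distribution given as {value: probability}."""
    return sum(x * p for x, p in dist.items())

def simulate_heads(n_trials, seed=0):
    """Average number of heads in 2 fair flips over n_trials repetitions."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_trials):
        # Each comparison is True (1) for heads, False (0) for tails.
        total += (rng.random() < 0.5) + (rng.random() < 0.5)
    return total / n_trials

two_flips = {0: 0.25, 1: 0.50, 2: 0.25}
print(expected_value(two_flips))  # 1.0
print(simulate_heads(100_000))    # close to 1; exact value depends on the seed
```

A single run of 2 flips can give 0, 1, or 2 heads, but the average over many runs settles near the expected value.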

Variance and Standard Deviation

Variance: $\sigma_X^2$ or Var(X)

Formula:

$$\sigma_X^2 = \sum (x - \mu_X)^2 \cdot P(X = x)$$

Alternative (easier calculation):

$$\sigma_X^2 = \sum x^2 \cdot P(X = x) - \mu_X^2$$

Standard Deviation: $\sigma_X = \sqrt{\sigma_X^2}$

Example: X = heads in 2 flips (μ_X = 1)

$$\sigma_X^2 = (0-1)^2(0.25) + (1-1)^2(0.50) + (2-1)^2(0.25) = 1(0.25) + 0(0.50) + 1(0.25) = 0.50$$

$$\sigma_X = \sqrt{0.50} \approx 0.707$$

Alternative calculation:

$$\sigma_X^2 = [0^2(0.25) + 1^2(0.50) + 2^2(0.25)] - 1^2 = [0 + 0.50 + 1] - 1 = 0.50$$
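Both variance formulas can be checked against each other in a few lines. This is a minimal sketch for the 2-flip distribution; the function names are made up for illustration.

```python
import math

def mean(dist):
    return sum(x * p for x, p in dist.items())

def variance_definition(dist):
    """Sum of (x - mu)^2 * P(X = x)."""
    mu = mean(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())

def variance_shortcut(dist):
    """Sum of x^2 * P(X = x), minus mu^2."""
    return sum(x ** 2 * p for x, p in dist.items()) - mean(dist) ** 2

two_flips = {0: 0.25, 1: 0.50, 2: 0.25}
var_def = variance_definition(two_flips)
var_short = variance_shortcut(two_flips)
sd = math.sqrt(var_def)

print(var_def, var_short)  # both 0.5
print(round(sd, 3))        # 0.707
```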

Linear Transformations

If Y = a + bX:

$$\mu_Y = a + b\mu_X$$

$$\sigma_Y = |b|\sigma_X$$

Note: Adding constant shifts mean but doesn't change spread. Multiplying affects both.

Example: X = quiz score (0-10), μ_X = 7, σ_X = 2
Convert to percentage: Y = 10X

$$\mu_Y = 10(7) = 70\%$$

$$\sigma_Y = 10(2) = 20\%$$

Example: Temperature conversion F = 32 + 1.8C
If μ_C = 20°C, σ_C = 5°C:

$$\mu_F = 32 + 1.8(20) = 68^\circ\text{F}$$

$$\sigma_F = 1.8(5) = 9^\circ\text{F}$$
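The rules for Y = a + bX can be verified by transforming every value of a distribution and recomputing the mean and SD directly. The sketch below reuses the 2-flip distribution with a = 3 and b = -2. These constants are arbitrary illustrative choices, picked so that a negative b shows why the absolute value is needed.

```python
import math

def mean(dist):
    return sum(x * p for x, p in dist.items())

def sd(dist):
    mu = mean(dist)
    return math.sqrt(sum((x - mu) ** 2 * p for x, p in dist.items()))

def transform(dist, a, b):
    """Distribution of Y = a + bX; assumes b != 0 so values stay distinct."""
    return {a + b * x: p for x, p in dist.items()}

x_dist = {0: 0.25, 1: 0.50, 2: 0.25}
a, b = 3, -2  # illustrative constants, not from the notes
y_dist = transform(x_dist, a, b)

# Direct computation on Y versus the shortcut rules
print(mean(y_dist), a + b * mean(x_dist))
print(sd(y_dist), abs(b) * sd(x_dist))
```

Using b instead of |b| in the last line would give a negative standard deviation, which is impossible.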

Combining Independent Random Variables

If X and Y are independent:

Sum: Z = X + Y

$$\mu_Z = \mu_X + \mu_Y$$

$$\sigma_Z^2 = \sigma_X^2 + \sigma_Y^2$$

Difference: W = X - Y

$$\mu_W = \mu_X - \mu_Y$$

$$\sigma_W^2 = \sigma_X^2 + \sigma_Y^2$$

(variances always add!)

Example (treating the two scores as independent):

  • X = score on test 1 (μ = 80, σ = 10)
  • Y = score on test 2 (μ = 75, σ = 12)
  • Total = X + Y

$$\mu_{Total} = 80 + 75 = 155$$

$$\sigma_{Total} = \sqrt{10^2 + 12^2} = \sqrt{244} \approx 15.6$$

Key: Standard deviations don't add; variances do!
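One way to see that variances add for both sums and differences is to build the full distribution of X + Y and X - Y from two independent variables. The sketch below pairs X = heads in 2 flips with Y = one fair die roll. That pairing and the helper `combine` are assumptions made for illustration. Exact fractions avoid rounding noise.

```python
from collections import defaultdict
from fractions import Fraction

def mean(dist):
    return sum(x * p for x, p in dist.items())

def variance(dist):
    mu = mean(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())

def combine(dist_x, dist_y, op):
    """Distribution of op(X, Y) for independent X and Y."""
    out = defaultdict(Fraction)
    for x, px in dist_x.items():
        for y, py in dist_y.items():
            out[op(x, y)] += px * py  # independence: P(x and y) = P(x) * P(y)
    return dict(out)

heads = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}
die = {face: Fraction(1, 6) for face in range(1, 7)}

total = combine(heads, die, lambda x, y: x + y)
diff = combine(heads, die, lambda x, y: x - y)

print(mean(total), mean(diff))          # means add / subtract
print(variance(total), variance(diff))  # same variance for both
```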

Expected Value Applications

Fair game: E(winnings) = 0

Example: Pay $1 to roll a die. Roll a 6: net gain of $5. Any other roll: lose the $1.

$$E(\text{net}) = 5(1/6) + (-1)(5/6) = 5/6 - 5/6 = 0$$

Fair game!

Expected profit/loss:

Example: Lottery ticket costs $2, prize $1,000,000, probability 1/1,000,000

$$E(\text{net}) = 999{,}998(1/1{,}000{,}000) + (-2)(999{,}999/1{,}000{,}000) \approx 1 - 2 = -\$1$$

Expect to lose $1 per ticket on average
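Both examples follow the same pattern: list each net outcome with its probability, then take the weighted sum. A rough sketch with exact fractions, where the list-of-pairs layout is just one convenient representation:

```python
from fractions import Fraction

def expected_value(outcomes):
    """outcomes: list of (net_winnings, probability) pairs."""
    return sum(net * p for net, p in outcomes)

# Die game: net +5 on a 6, net -1 otherwise
die_game = [(5, Fraction(1, 6)), (-1, Fraction(5, 6))]

# Lottery: net 999,998 on a win, net -2 otherwise
lottery = [
    (999_998, Fraction(1, 1_000_000)),
    (-2, Fraction(999_999, 1_000_000)),
]

print(expected_value(die_game))  # 0  (fair game)
print(expected_value(lottery))   # -1 (lose $1 per ticket on average)
```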

Probability Histogram

Visual representation of probability distribution

  • X-axis: Values of X
  • Y-axis: Probabilities
  • Height of bar = P(X = x)
  • Bars don't touch (discrete)

Properties:

  • Area of bar = probability (when each bar has width 1)
  • Total area = 1

Common Notation

P(X = 3): Probability X equals 3
P(X ≤ 3): Probability X is at most 3 (cumulative)
P(X < 3): Probability X is less than 3
P(2 ≤ X ≤ 5): Probability X is between 2 and 5 inclusive

For discrete variables: P(X ≤ 3) includes P(X = 3)

Cumulative Distribution Function (CDF)

CDF: P(X ≤ x)

Sum probabilities up to and including x

Example: X = number of heads in 2 flips (distribution from the earlier example)

P(X ≤ 1) = P(X = 0) + P(X = 1) = 0.25 + 0.50 = 0.75
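A CDF table is a running total of the probabilities in increasing order of x. The sketch below builds one with `itertools.accumulate`; the function name `cdf_table` is hypothetical.

```python
from itertools import accumulate

def cdf_table(dist):
    """Map each value x to P(X <= x), taking values in increasing order."""
    xs = sorted(dist)
    cumulative = accumulate(dist[x] for x in xs)
    return dict(zip(xs, cumulative))

two_flips = {0: 0.25, 1: 0.50, 2: 0.25}
cdf = cdf_table(two_flips)

print(cdf[1])           # 0.75 = P(X <= 1)
print(cdf[2])           # 1.0, the last entry of any CDF table
print(cdf[1] - cdf[0])  # 0.5 = P(X = 1), recovered from the CDF
```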

Common Discrete Distributions

Binomial: Fixed trials, success/failure, constant probability
Geometric: Trials until first success
Poisson: Count of events in interval

(Each has its own topic with specific formulas!)

Common Mistakes

❌ Forgetting probabilities must sum to 1
❌ Confusing E(X) with most likely value
❌ Adding standard deviations instead of variances
❌ Forgetting absolute value for σ_Y when Y = a + bX
❌ Adding variances for X ± Y when X and Y are not independent

Practice Strategy

  1. List: All possible values
  2. Find: Probability for each value
  3. Verify: Probabilities sum to 1
  4. Calculate: μ and σ using formulas
  5. Interpret: What do mean and SD tell us?

Quick Reference

Mean: $\mu_X = \sum x \cdot P(X = x)$

Variance: $\sigma_X^2 = \sum (x - \mu_X)^2 \cdot P(X = x)$

Linear Transform: Y = a + bX gives μ_Y = a + bμ_X, σ_Y = |b|σ_X

Sum/Difference: μ adds/subtracts, variances always add

Remember: Mean is long-run average. Standard deviation measures variability. For sums/differences of independent variables, variances add!

📚 Practice Problems

Problem 1 (easy)

Question:

Let X be the number of heads when flipping a fair coin 3 times. Create the probability distribution table for X and verify it is a valid probability distribution.

Solution

Step 1: List all possible outcomes

Sample space for 3 flips: HHH, HHT, HTH, HTT, THH, THT, TTH, TTT
Total: 8 outcomes

Step 2: Count heads in each outcome

  • HHH → 3 heads
  • HHT → 2 heads
  • HTH → 2 heads
  • HTT → 1 head
  • THH → 2 heads
  • THT → 1 head
  • TTH → 1 head
  • TTT → 0 heads

Step 3: Create frequency distribution

  • X = 0: 1 outcome (TTT)
  • X = 1: 3 outcomes (HTT, THT, TTH)
  • X = 2: 3 outcomes (HHT, HTH, THH)
  • X = 3: 1 outcome (HHH)

Step 4: Calculate probabilities

  • P(X = 0) = 1/8 = 0.125
  • P(X = 1) = 3/8 = 0.375
  • P(X = 2) = 3/8 = 0.375
  • P(X = 3) = 1/8 = 0.125

Step 5: Create probability distribution table

| X | P(X) |
|---|-------|
| 0 | 0.125 |
| 1 | 0.375 |
| 2 | 0.375 |
| 3 | 0.125 |

Step 6: Verify conditions for valid distribution

Condition 1: All probabilities between 0 and 1? 0.125, 0.375, 0.375, 0.125 are all in [0,1] ✓

Condition 2: Sum of probabilities = 1? 0.125 + 0.375 + 0.375 + 0.125 = 1.000 ✓

This IS a valid probability distribution!

Answer: Probability Distribution:

| X | 0 | 1 | 2 | 3 |
|---|---|---|---|---|
| P(X) | 1/8 | 3/8 | 3/8 | 1/8 |

This is valid because all probabilities are between 0 and 1, and they sum to 1.
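The listing and counting steps of this solution can also be done by enumeration. The following sketch generates all 8 outcomes with `itertools.product` and tallies the heads; variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All sequences of 3 flips: ('H','H','H'), ('H','H','T'), ...
outcomes = list(product("HT", repeat=3))

# Number of outcomes for each head count
counts = Counter(outcome.count("H") for outcome in outcomes)

# Equally likely outcomes: probability = count / total
dist = {x: Fraction(c, len(outcomes)) for x, c in sorted(counts.items())}

for x, p in dist.items():
    print(x, p)              # 0 1/8, 1 3/8, 2 3/8, 3 1/8
print(sum(dist.values()))    # 1
```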

Problem 2 (easy)

Question:

Given the probability distribution:

| X | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| P(X) | 0.1 | 0.3 | 0.4 | 0.2 |

Find the expected value E(X) and interpret it.

Solution

Step 1: Recall expected value formula

E(X) = μ = Σ[x · P(x)]
Sum of each value times its probability

Step 2: Calculate E(X)

E(X) = (1)(0.1) + (2)(0.3) + (3)(0.4) + (4)(0.2)
     = 0.1 + 0.6 + 1.2 + 0.8
     = 2.7

Step 3: Interpret the expected value

E(X) = 2.7 is the long-run average value

This means:

  • If we repeat this random process many times
  • The average value of X will approach 2.7
  • It's the "center" of the distribution

Step 4: Important notes

  • E(X) = 2.7 is NOT a possible value of X
  • X can only be 1, 2, 3, or 4
  • But the average over many trials is 2.7
  • It's like saying "average family has 2.3 children"

Step 5: Verify calculation

Check probabilities sum to 1: 0.1 + 0.3 + 0.4 + 0.2 = 1.0 ✓

Check E(X) is within range: min(X) = 1, max(X) = 4, and 1 ≤ 2.7 ≤ 4 ✓

Step 6: Visual interpretation

The expected value is the "balance point"

    P(X)
    0.4│       █
    0.3│    █  █
    0.2│    █  █  █
    0.1│ █  █  █  █
       └────────────
         1  2  3  4

Balance point at 2.7 (closer to 3 due to higher probability there)

Answer: E(X) = 2.7

This means if we repeatedly observe this random variable, the long-run average value will be 2.7. It's the weighted average of all possible values, weighted by their probabilities.
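The "balance point" description has a precise meaning: the probability-weighted distances to the left of the mean equal those to the right. A short sketch that checks this for the distribution in this problem, using exact fractions:

```python
from fractions import Fraction

dist = {
    1: Fraction(1, 10),
    2: Fraction(3, 10),
    3: Fraction(4, 10),
    4: Fraction(2, 10),
}

mu = sum(x * p for x, p in dist.items())

# Weighted "pull" on each side of the mean
left = sum((mu - x) * p for x, p in dist.items() if x < mu)
right = sum((x - mu) * p for x, p in dist.items() if x > mu)

print(mu)           # 27/10
print(left, right)  # 19/50 19/50, so the two sides balance
```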

Problem 3 (medium)

Question:

A game costs $5 to play. You roll a die: if you roll a 6, you win $20; if you roll a 4 or 5, you win $10; otherwise you win nothing. What is the expected value of your net winnings? Should you play this game?

Solution

Step 1: Define the random variable

Let X = net winnings (winnings minus cost)

Step 2: Identify all outcomes and net winnings

  • Roll 6: Win $20, paid $5 → Net = $20 - $5 = $15
  • Roll 4 or 5: Win $10, paid $5 → Net = $10 - $5 = $5
  • Roll 1, 2, or 3: Win $0, paid $5 → Net = $0 - $5 = -$5

Step 3: Find probabilities

  • P(X = 15) = P(roll 6) = 1/6
  • P(X = 5) = P(roll 4 or 5) = 2/6 = 1/3
  • P(X = -5) = P(roll 1, 2, or 3) = 3/6 = 1/2

Step 4: Verify probability distribution

1/6 + 1/3 + 1/2 = 1/6 + 2/6 + 3/6 = 6/6 = 1 ✓

Step 5: Calculate expected value

E(X) = (15)(1/6) + (5)(1/3) + (-5)(1/2)
     = 15/6 + 5/3 - 5/2
     = 15/6 + 10/6 - 15/6
     = 10/6 = 5/3 ≈ $1.67

Step 6: Interpret the expected value

E(X) ≈ $1.67 per game

This means:

  • On average, you GAIN $1.67 per game
  • In the long run, you expect to profit
  • The game is in your favor!

Step 7: Should you play?

YES! Expected value is POSITIVE ($1.67)

  • You expect to win money on average
  • This is a favorable game

However, consider:

  • Short-term: You could still lose (50% chance of losing $5)
  • Need many plays for average to materialize
  • Variance matters too (not just expected value)

Step 8: Calculate long-run expectation

If you play 100 times: Expected total net winnings = 100 × $1.67 = $167

Answer: E(X) = $1.67 per game

You SHOULD play this game because the expected value is positive. On average, you expect to win $1.67 per game. However, individual games have high variance - you could win $15, win $5, or lose $5 on any single play.
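The remark about variance can be quantified. The sketch below derives the net-winnings distribution straight from the payout rule and computes its mean, variance, and SD. The `payout` helper and the `COST` constant are names invented for this example.

```python
import math
from collections import defaultdict
from fractions import Fraction

COST = 5  # price of one play, in dollars

def payout(face):
    """Prize in dollars for a given die face, per the problem statement."""
    if face == 6:
        return 20
    if face in (4, 5):
        return 10
    return 0

# Net winnings distribution: each face has probability 1/6
dist = defaultdict(Fraction)
for face in range(1, 7):
    dist[payout(face) - COST] += Fraction(1, 6)

mu = sum(x * p for x, p in dist.items())
var = sum((x - mu) ** 2 * p for x, p in dist.items())
sd = math.sqrt(var)

print(mu)            # 5/3
print(var)           # 500/9
print(round(sd, 2))  # 7.45
```

An SD of about $7.45 against a mean of about $1.67 shows how much a single play can deviate from the long-run average.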

Problem 4 (medium)

Question:

Given a probability distribution with E(X) = 4 and Var(X) = 3, find:

  a) E(2X + 5)
  b) Var(2X + 5)
  c) SD(2X + 5)

Solution

Step 1: Recall transformation rules for expected value

E(aX + b) = a·E(X) + b

Where:

  • a is multiplied through
  • b is added at the end

Step 2: Calculate E(2X + 5)

E(2X + 5) = 2·E(X) + 5 = 2(4) + 5 = 8 + 5 = 13

Step 3: Recall transformation rules for variance

Var(aX + b) = a²·Var(X)

Important:

  • Multiply by a² (not just a)
  • Adding constant b doesn't affect variance!
  • Variance measures spread - shifting doesn't change spread

Step 4: Calculate Var(2X + 5)

Var(2X + 5) = 2²·Var(X) = 4·Var(X) = 4(3) = 12

Step 5: Calculate SD(2X + 5)

SD(2X + 5) = √Var(2X + 5) = √12 = 2√3 ≈ 3.46

Alternative: SD(aX + b) = |a|·SD(X)
SD(2X + 5) = 2·SD(X) = 2·√3 = 2√3 ✓

Step 6: Summary of transformation rules

For Y = aX + b:

  • E(Y) = a·E(X) + b
  • Var(Y) = a²·Var(X)
  • SD(Y) = |a|·SD(X)

Key insights:

  • Adding constant: shifts mean, doesn't affect variance
  • Multiplying: scales mean by a, scales variance by a²
  • Standard deviation scales by |a|

Step 7: Verify with original values

E(X) = 4, Var(X) = 3, SD(X) = √3

Transform by 2X + 5:

  • Mean shifts from 4 to 13 (doubled then added 5)
  • Variance scales from 3 to 12 (multiplied by 2² = 4)
  • SD scales from √3 to 2√3 (multiplied by 2)

Answer:

  a) E(2X + 5) = 13
  b) Var(2X + 5) = 12
  c) SD(2X + 5) = 2√3 ≈ 3.46

Key insight: Adding a constant affects the mean but not the variance or SD. Multiplying by a constant affects both, but variance is multiplied by a² while SD is multiplied by |a|.
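For readers who want to see why the constant drops out and the multiplier is squared, here is a short derivation sketch from the definition of variance. It uses the mean rule E(aX + b) = a·μ_X + b stated in this solution.

```latex
\begin{aligned}
\mathrm{Var}(aX + b)
  &= \sum \bigl[(ax + b) - (a\mu_X + b)\bigr]^2 \cdot P(X = x) \\
  &= \sum \bigl[a(x - \mu_X)\bigr]^2 \cdot P(X = x) \\
  &= a^2 \sum (x - \mu_X)^2 \cdot P(X = x) \\
  &= a^2 \,\mathrm{Var}(X)
\end{aligned}
```

The b terms cancel in the first line, which is why shifting does not change spread. With a = 2 and Var(X) = 3 this gives 4 · 3 = 12, matching part (b).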

Problem 5 (hard)

Question:

You and a friend independently roll a fair die. Let X be the result of your roll and Y be the result of your friend's roll. Find E(X + Y), Var(X + Y), and E(XY). Are the answers different than for a single die?

Solution

Step 1: Find E(X) and E(Y) for single die

For fair die: X can be 1, 2, 3, 4, 5, 6, each with probability 1/6

E(X) = (1 + 2 + 3 + 4 + 5 + 6)/6 = 21/6 = 3.5

Since Y is also a die roll: E(Y) = 3.5

Step 2: Find E(X + Y)

For any two random variables (independence is not required for this rule): E(X + Y) = E(X) + E(Y)

E(X + Y) = 3.5 + 3.5 = 7

This is the expected sum of two dice!

Step 3: Find Var(X)

First calculate E(X²):
E(X²) = (1² + 2² + 3² + 4² + 5² + 6²)/6 = (1 + 4 + 9 + 16 + 25 + 36)/6 = 91/6

Var(X) = E(X²) - [E(X)]²
       = 91/6 - (3.5)²
       = 91/6 - 49/4
       = 182/12 - 147/12
       = 35/12 ≈ 2.917

Step 4: Find Var(X + Y)

For INDEPENDENT random variables: Var(X + Y) = Var(X) + Var(Y)

Since X and Y are independent and identically distributed: Var(Y) = 35/12

Var(X + Y) = 35/12 + 35/12 = 70/12 = 35/6 ≈ 5.833

Note: If NOT independent, would need Cov(X,Y) term

Step 5: Find E(XY)

For INDEPENDENT random variables: E(XY) = E(X) · E(Y)

E(XY) = 3.5 × 3.5 = 12.25

Step 6: Compare to single die

Single die:

  • E(X) = 3.5
  • Var(X) = 35/12 ≈ 2.917

Sum of two dice:

  • E(X + Y) = 7 (double the mean!)
  • Var(X + Y) = 35/6 ≈ 5.833 (double the variance!)
  • SD(X + Y) = √(35/6) ≈ 2.415 (NOT double!)

Product of two dice:

  • E(XY) = 12.25

Step 7: Key formulas used

  • E(X + Y) = E(X) + E(Y) ✓ (holds for any X and Y)
  • Var(X + Y) = Var(X) + Var(Y) ✓ (requires independence)
  • E(XY) = E(X) · E(Y) ✓ (requires independence)

Answer:

  • E(X + Y) = 7 (expected sum is 7)
  • Var(X + Y) = 35/6 ≈ 5.833
  • E(XY) = 12.25 (expected product is 12.25)

Yes, these are different! The sum has double the expected value and double the variance of a single die. The product's expected value is the product of the individual expected values (because independent).
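These three answers can be double-checked without any shortcut rules by enumerating all 36 equally likely (X, Y) pairs. A minimal sketch with exact fractions; variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)
pairs = list(product(faces, repeat=2))  # all 36 (x, y) outcomes
p = Fraction(1, len(pairs))             # each pair has probability 1/36

e_sum = sum((x + y) * p for x, y in pairs)
e_sum_sq = sum((x + y) ** 2 * p for x, y in pairs)
var_sum = e_sum_sq - e_sum ** 2         # shortcut formula E(Z^2) - [E(Z)]^2
e_prod = sum(x * y * p for x, y in pairs)

print(e_sum)    # 7
print(var_sum)  # 35/6
print(e_prod)   # 49/4, i.e. 12.25
```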