Discrete Random Variables

Master discrete random variables, PMFs, CDFs, and their properties with Python implementations.

Introduction

Random variables transform outcomes into numbers, enabling us to apply mathematical analysis to probability. They're the bridge between abstract sample spaces and practical calculations.

Learning Objectives:

  • Understand random variables as functions
  • Work with probability mass functions (PMFs)
  • Calculate probabilities using cumulative distribution functions (CDFs)
  • Implement random variables in Python

Random Variables

A random variable is a function $X: \Omega \rightarrow \mathbb{R}$ that maps each outcome in the sample space to a real number.

Notation: Uppercase letters ($X$, $Y$) denote random variables; lowercase letters ($x$, $y$) denote specific values they take.

Types:

  • Discrete: Takes values in a countable set (e.g., a finite set or the integers)
  • Continuous: Takes values in an uncountable set, such as an interval of real numbers

Example: Die Roll

Sample space: $\Omega = \{⚀, ⚁, ⚂, ⚃, ⚄, ⚅\}$

Define $X$ = "number showing":

  • $X(⚀) = 1$
  • $X(⚁) = 2$
  • ...
  • $X(⚅) = 6$

Now we can ask: $P(X = 4)$, $P(X > 3)$, $E[X]$, etc.
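
The die example can be sketched in Python by treating the random variable literally as a function on the sample space. This is a minimal illustration, assuming a fair die (each outcome has probability 1/6); the names `omega`, `P_outcome`, `X`, and `prob` are hypothetical helpers chosen for this sketch, not part of any library.

```python
from fractions import Fraction

# Sample space: the six die faces, labelled by name (hypothetical labels).
omega = ["one", "two", "three", "four", "five", "six"]

# Assumption: a fair die, so every outcome has probability 1/6.
P_outcome = {w: Fraction(1, 6) for w in omega}

# The random variable X is just a mapping from outcomes to real numbers.
X = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5, "six": 6}

def prob(event):
    """P(event), where event is a predicate on the value X(w)."""
    return sum(P_outcome[w] for w in omega if event(X[w]))

p_eq_4 = prob(lambda x: x == 4)                      # P(X = 4)
p_gt_3 = prob(lambda x: x > 3)                       # P(X > 3)
expected = sum(X[w] * P_outcome[w] for w in omega)   # E[X]

print(p_eq_4, p_gt_3, expected)  # 1/6 1/2 7/2
```

Using `Fraction` keeps the arithmetic exact, so the results can be compared directly with the hand calculation.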

Probability Mass Function

For a discrete random variable $X$, the PMF is:

$$p_X(x) = P(X = x)$$

Properties:

  1. $p_X(x) \geq 0$ for all $x$
  2. $\sum_{x} p_X(x) = 1$ (sum over all possible values)
python
import numpy as np

# Die roll: PMF
values = [1, 2, 3, 4, 5, 6]
pmf = [1/6] * 6

print("PMF for fair die:")
for x, p in zip(values, pmf):
    print(f"  P(X = {x}) = {p:.4f}")

print(f"\nSum of PMF: {sum(pmf):.4f} (must equal 1)")

# Verify with simulation
rolls = np.random.randint(1, 7, size=10000)
for x in values:
    empirical_prob = np.mean(rolls == x)
    theoretical_prob = 1/6
    print(f"X={x}: empirical={empirical_prob:.4f}, theoretical={theoretical_prob:.4f}")
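
A PMF does not have to be uniform. As a hedged sketch, the block below builds the PMF of the sum of two dice by enumerating the sample space, assuming both dice are fair and independent, and then checks the two PMF properties. The names `outcomes`, `counts`, and `pmf_sum` are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space: all 36 ordered pairs of faces, assumed equally likely.
outcomes = list(product(range(1, 7), repeat=2))

# The random variable S maps each outcome (d1, d2) to the sum d1 + d2.
counts = Counter(d1 + d2 for d1, d2 in outcomes)

# PMF: p_S(s) = (number of outcomes with sum s) / 36.
pmf_sum = {s: Fraction(c, len(outcomes)) for s, c in sorted(counts.items())}

for s, p in pmf_sum.items():
    print(f"P(S = {s}) = {p}")

# Check the two PMF properties exactly (no floating-point rounding).
assert all(p >= 0 for p in pmf_sum.values())
assert sum(pmf_sum.values()) == 1
```

The most likely value is 7, with probability 6/36 = 1/6, because six of the 36 outcomes map to it.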

Cumulative Distribution Function

The CDF gives the probability that $X$ is at most $x$:

$$F_X(x) = P(X \leq x) = \sum_{t \leq x} p_X(t)$$

Properties:

  1. $F_X$ is non-decreasing
  2. $\lim_{x \to -\infty} F_X(x) = 0$
  3. $\lim_{x \to \infty} F_X(x) = 1$
  4. $P(a < X \leq b) = F_X(b) - F_X(a)$
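
Property 4 excludes the left endpoint $a$. For an integer-valued random variable such as the die roll, a closed interval can be handled by shifting the lower bound down by one, because $P(X < a) = P(X \leq a - 1)$. As a worked example for the fair die:

$$P(2 \leq X \leq 5) = P(1 < X \leq 5) = F_X(5) - F_X(1) = \frac{5}{6} - \frac{1}{6} = \frac{2}{3}$$
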
python
import numpy as np

# Die roll CDF
def die_cdf(x):
    """CDF for fair 6-sided die"""
    if x < 1:
        return 0
    elif x >= 6:
        return 1
    else:
        return int(np.floor(x)) / 6

# Compute CDF at various points
test_points = [0, 1.5, 3, 3.7, 6, 10]
print("CDF values:")
for x in test_points:
    cdf_val = die_cdf(x)
    print(f"  F_X({x}) = P(X ≤ {x}) = {cdf_val:.4f}")

# Calculate interval probability
a, b = 2, 5
prob_interval = die_cdf(b) - die_cdf(a-1)  # P(2 ≤ X ≤ 5)
print(f"\nP({a} ≤ X ≤ {b}) = F_X({b}) - F_X({a-1}) = {prob_interval:.4f}")
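
The relationship between the PMF and the CDF can also be sketched numerically: the CDF is the running sum of the PMF, and the PMF is recovered from the jumps of the CDF. The block below is a minimal illustration, assuming a fair die; `cdf_at` is a hypothetical helper name, and the lookup relies on NumPy's `searchsorted`.

```python
import numpy as np

values = np.arange(1, 7)    # support of the die roll
pmf = np.full(6, 1 / 6)     # assumption: fair die

# CDF at each support point: running sum of the PMF.
cdf = np.cumsum(pmf)

def cdf_at(x):
    """F_X(x) for any real x, looked up from the tabulated CDF."""
    # k = number of support points <= x; k == 0 means x is below the support.
    k = np.searchsorted(values, x, side="right")
    return 0.0 if k == 0 else float(cdf[k - 1])

# Property 4: P(a < X <= b) = F_X(b) - F_X(a)
a, b = 2, 5
via_cdf = cdf_at(b) - cdf_at(a)
via_pmf = float(pmf[(values > a) & (values <= b)].sum())
print(f"P({a} < X <= {b}): via CDF = {via_cdf:.4f}, via PMF = {via_pmf:.4f}")

# The PMF is recovered from the jumps of the CDF.
recovered = np.diff(cdf, prepend=0.0)
print(np.allclose(recovered, pmf))
```

Both routes give 1/2 here, since the values 3, 4, and 5 each carry probability 1/6.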

Key Takeaways

  1. Random variable: Function from outcomes to numbers
  2. PMF: $p_X(x) = P(X = x)$ for discrete RVs
  3. CDF: $F_X(x) = P(X \leq x)$, works for any RV
  4. PMF sums to 1: $\sum_x p_X(x) = 1$
  5. CDF properties: Non-decreasing, limits to 0 and 1

Next Lesson: Common discrete distributions (Bernoulli, Binomial, Poisson)!

| Concept | Discrete Formula | Example |
|---|---|---|
| PMF | $p_X(x) = P(X=x)$ | Die: $p_X(3) = 1/6$ |
| CDF | $F_X(x) = \sum_{t \leq x} p_X(t)$ | Die: $F_X(3) = 1/2$ |
| Interval | $P(a < X \leq b) = F_X(b) - F_X(a)$ | $P(2 < X \leq 5) = F_X(5) - F_X(2)$ |