Chi-Square Tests

Perform chi-square tests for goodness of fit, homogeneity, and independence.

Overview

Chi-square ($\chi^2$) tests are used for categorical data. There are three types:

| Test | Purpose |
|------|---------|
| Goodness of Fit | Does a distribution match a claimed distribution? |
| Homogeneity | Do different populations have the same distribution? |
| Independence | Are two categorical variables independent? |

Chi-Square Test Statistic

$$\chi^2 = \sum \frac{(\text{Observed} - \text{Expected})^2}{\text{Expected}}$$

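The sum runs over every category or cell. The statistic is calculated the same way for all three tests; only the expected counts and the degrees of freedom differ.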
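As a minimal sketch, the formula above can be written in a few lines of Python. The function name `chi_square_statistic` and the counts are made up for illustration; they do not come from a real data set.

```python
def chi_square_statistic(observed, expected):
    """Sum (O - E)^2 / E over all categories or cells."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts for three categories (illustration only).
observed = [10, 20, 30]
expected = [20, 20, 20]
statistic = chi_square_statistic(observed, expected)
print(statistic)
```

Each term is divided by its own expected count, so the same gap between observed and expected matters more in a category where few observations were expected.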

1. Goodness of Fit Test

Purpose: Test whether a categorical variable follows a specified distribution.

Hypotheses:

  • $H_0$: The data follows the specified distribution
  • $H_a$: The data does not follow the specified distribution

Degrees of freedom: $df = k - 1$ (where $k$ = number of categories)

Expected counts: $E_i = n \cdot p_i$ (sample size × hypothesized proportion)
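A short sketch of the goodness of fit setup, assuming a hypothetical die rolled 60 times and a claim that the die is fair. The helper name `goodness_of_fit_setup` and the observed counts are invented for this example.

```python
def goodness_of_fit_setup(observed, proportions):
    """Return (expected counts, degrees of freedom) for a goodness of fit test."""
    if len(observed) != len(proportions):
        raise ValueError("need one hypothesized proportion per category")
    if abs(sum(proportions) - 1.0) > 1e-9:
        raise ValueError("hypothesized proportions must sum to 1")
    n = sum(observed)                           # sample size
    expected = [n * p for p in proportions]     # E_i = n * p_i
    df = len(observed) - 1                      # df = k - 1
    return expected, df


# Hypothetical observed counts for faces 1 through 6 (illustration only).
observed = [8, 9, 12, 11, 6, 14]
proportions = [1 / 6] * 6                       # claimed distribution: fair die
expected, df = goodness_of_fit_setup(observed, proportions)
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(expected, df, chi_sq)
```

The expected counts come from the claimed distribution, not from the data, which is why only the sample size $n$ is taken from the observed counts.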

2. Test for Homogeneity

Purpose: Test whether the distribution of a categorical variable is the same across different populations.

Hypotheses:

  • $H_0$: The distributions are the same for all populations
  • $H_a$: The distributions are not all the same

Degrees of freedom: $df = (r-1)(c-1)$ (where $r$ = number of rows and $c$ = number of columns in the two-way table)

Expected counts: $E = \frac{\text{row total} \times \text{column total}}{\text{grand total}}$
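The expected count formula applies to every cell of a two-way table. The sketch below assumes the table is stored as a list of rows; the names `expected_counts` and `degrees_of_freedom` and the 2×2 counts are hypothetical.

```python
def expected_counts(table):
    """Expected count for each cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def degrees_of_freedom(table):
    """df = (rows - 1) * (columns - 1)."""
    return (len(table) - 1) * (len(table[0]) - 1)


# Hypothetical 2x2 table: rows are populations, columns are categories.
table = [[10, 20],
         [30, 40]]
expected = expected_counts(table)
df = degrees_of_freedom(table)
print(expected, df)
```

The same two helpers would serve a test for independence, since only the study design and the hypotheses differ between the two tests.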

3. Test for Independence

Purpose: Test whether two categorical variables are independent within a single population.

Hypotheses:

  • $H_0$: The two variables are independent
  • $H_a$: The two variables are not independent

Degrees of freedom: $df = (r-1)(c-1)$

Expected counts: Same formula as homogeneity

Conditions for All Chi-Square Tests

  1. Random: Data from a random sample or randomized experiment
  2. 10%: When sampling without replacement, $n < 0.10N$ (the sample is less than 10% of the population)
  3. Large Counts: All expected counts $\geq 5$
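The two numeric conditions can be checked mechanically, as in this rough sketch. The function name `check_conditions` and the numbers are assumptions for illustration; the Random condition depends on how the data were collected and cannot be verified from the counts.

```python
def check_conditions(n, population_size, expected, min_count=5):
    """Check the 10% and Large Counts conditions (Random must be judged separately)."""
    return {
        "ten_percent": n < 0.10 * population_size,
        "large_counts": all(e >= min_count for e in expected),
    }


# Hypothetical sample of 60 from a population of 1000, six expected counts of 10.
ok = check_conditions(60, 1000, [10.0] * 6)

# Hypothetical case where one expected count is too small.
too_small = check_conditions(60, 1000, [3.0, 57.0])
print(ok, too_small)
```

Note that the check uses expected counts, not observed counts: an observed count below 5 does not by itself violate the condition.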

Properties of the Chi-Square Distribution

  • Always $\geq 0$
  • Right-skewed (becomes less skewed as $df$ increases)
  • P-value is always from the right tail
  • Different shape for each $df$
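To make the right-tail P-value concrete, here is a sketch that computes the area to the right of a statistic using only Python's standard library. It relies on the closed-form expressions that exist for whole-number degrees of freedom; the function name `chi2_right_tail` is made up, and in practice a statistics library (for example `scipy.stats.chi2.sf`) or a calculator would normally be used instead.

```python
import math


def chi2_right_tail(x, df):
    """Area to the right of x under the chi-square curve with df degrees of freedom."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0                      # the distribution has no area below 0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum of (x/2)^i / i! for i = 0 .. df/2 - 1
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: erfc term plus a finite sum with half-integer powers
    total = math.erfc(math.sqrt(half))
    for i in range(1, (df - 1) // 2 + 1):
        total += math.exp(-half) * half ** (i - 0.5) / math.gamma(i + 0.5)
    return total


# The same statistic gives a different P-value for each df.
for df in (1, 2, 5):
    print(df, chi2_right_tail(4.2, df))
```

Because the curve changes shape with $df$, the same statistic yields a larger P-value as $df$ grows.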

Follow-Up Analysis

If you reject $H_0$, identify which cells contribute most to $\chi^2$ by examining $\frac{(O - E)^2}{E}$ for each cell. Large contributions indicate where the biggest discrepancies are.
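A small sketch of this follow-up step, assuming a hypothetical 2×2 table whose expected counts have already been computed. The name `cell_contributions` and all counts are illustrative only.

```python
def cell_contributions(observed, expected):
    """Return (O - E)^2 / E for each cell of a two-way table."""
    return [
        [(o - e) ** 2 / e for o, e in zip(obs_row, exp_row)]
        for obs_row, exp_row in zip(observed, expected)
    ]


# Hypothetical observed and expected counts (illustration only).
observed = [[10, 20],
            [30, 40]]
expected = [[12.0, 18.0],
            [28.0, 42.0]]

contributions = cell_contributions(observed, expected)

# Find the cell with the largest contribution as (value, (row, column)).
largest = max(
    (value, (i, j))
    for i, row in enumerate(contributions)
    for j, value in enumerate(row)
)
print(contributions, largest)
```

The contributions add up to the $\chi^2$ statistic itself, so each one can be read as that cell's share of the total. Comparing the sign of $O - E$ in the largest cells shows the direction of the discrepancy.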

AP Tip: Chi-square tests are ALWAYS right-tailed. There is no "left-tailed" or "two-tailed" chi-square test. Show expected counts and verify they are all ≥ 5.
