Lesson 9 — Chi-square

χ² marks the spot

When your data is categorical — counts and frequencies instead of measurements — chi-square is the test to use.

What’s different here?

📈 T-test / ANOVA: continuous variable (height in inches)
χ² Chi-square: categorical variable (catches frisbee: yes / no)

Chi-square compares frequencies / counts across groups, not means.

Two types of chi-square

Goodness of fit (one variable): Are there more golden retrievers than pugs at the park? (Observed vs expected)
Test of independence (two variables): Is frisbee-catching ability related to dog breed? (Are they associated?)
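A goodness-of-fit test can be sketched in a few lines. The counts below are made up for illustration (the lesson gives no park counts), and the null hypothesis of equally common breeds is likewise an assumption. With two categories there is one degree of freedom, so the p-value has a closed form through the complementary error function.

```python
import math

def goodness_of_fit(observed, expected):
    """Return the chi-square statistic and degrees of freedom."""
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1
    return chi2, df

# Hypothetical park counts (NOT from the lesson): 40 golden retrievers, 20 pugs.
observed = [40, 20]
# Assumed null hypothesis: both breeds are equally common, so expect 30 of each.
expected = [sum(observed) / len(observed)] * len(observed)

chi2, df = goodness_of_fit(observed, expected)
# Upper-tail p-value; this closed form is valid for df = 1 only.
p_value = math.erfc(math.sqrt(chi2 / 2))
print(f"chi2 = {chi2:.3f}, df = {df}, p = {p_value:.4f}")
```

For more than two categories the same statistic applies, but the p-value would need a general chi-square distribution function, for example from a statistics library.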

The frisbee example

                  Golden Retrievers   Pugs   St. Bernards
Catches frisbee          52            12        18
Drops frisbee             8            48        42

Golden retrievers catch frisbees at a much higher rate. But is this statistically significant? That’s what χ² tests!

The χ² statistic

Measures the difference between observed frequencies and what we’d expect if there were no association.
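Written out, this is the standard Pearson chi-square statistic. The symbols are notation introduced here, not taken from the lesson: O is an observed count, E is the count expected under no association, and N is the total number of observations.

```latex
\chi^2 = \sum_{\text{cells}} \frac{(O - E)^2}{E},
\qquad
E = \frac{(\text{row total}) \times (\text{column total})}{N}
```

Each cell contributes its squared gap between observed and expected, scaled by the expected count, so a large χ² means the table looks unlike what "no association" would produce.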

If p < 0.05 → reject H₀
This is evidence that frisbee-catching ability is associated with breed. The test tells you that a difference exists somewhere among the breeds, but not which breeds differ (you need post-hoc tests for that).
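As a rough sketch, the test of independence for the frisbee example can be computed by hand. This assumes the table reads 52, 12, 18 catches and 8, 48, 42 drops for golden retrievers, pugs, and St. Bernards. The function name is hypothetical, and the p-value shortcut used here is exact only because this table has two degrees of freedom.

```python
import math

def chi_square_independence(table):
    """Return (chi2, df, expected) for a table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected count under no association: row total * column total / N.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df, expected

# Rows: catches, drops. Columns: golden retrievers, pugs, St. Bernards.
# Assumed reading of the lesson's table.
observed = [[52, 12, 18], [8, 48, 42]]

chi2, df, expected = chi_square_independence(observed)
# For df = 2 the upper-tail p-value is exactly exp(-chi2 / 2).
p_value = math.exp(-chi2 / 2)
print(f"chi2 = {chi2:.2f}, df = {df}, p = {p_value:.3g}")
print("reject H0" if p_value < 0.05 else "fail to reject H0")
```

For tables of other sizes the statistic is computed the same way, but the p-value needs a general chi-square distribution function rather than this shortcut.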

Post-hoc for chi-square

Just like ANOVA, chi-square only tells you a difference exists. Post-hoc pairwise tests tell you which groups differ.

🔎 After a significant χ² result: compare golden retrievers vs pugs, golden retrievers vs St. Bernards, and pugs vs St. Bernards separately to find out who catches more frisbees.
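The pairwise comparisons above might look like the sketch below. It assumes the table reads 52, 12, 18 catches and 8, 48, 42 drops, runs a 2×2 chi-square (without continuity correction) on each pair of breeds, and applies a Bonferroni adjustment, which is one common choice that the lesson itself does not name.

```python
import math
from itertools import combinations

# Assumed reading of the lesson's table.
breeds = ["Golden Retrievers", "Pugs", "St. Bernards"]
catches = [52, 12, 18]
drops = [8, 48, 42]

def chi_square_2x2(a, b, c, d):
    """Chi-square statistic (no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

pairs = list(combinations(range(len(breeds)), 2))
# Bonferroni: divide the significance level by the number of comparisons.
alpha = 0.05 / len(pairs)

results = {}
for i, j in pairs:
    chi2 = chi_square_2x2(catches[i], catches[j], drops[i], drops[j])
    # Upper-tail p-value; this closed form is valid for df = 1 only.
    p = math.erfc(math.sqrt(chi2 / 2))
    results[(breeds[i], breeds[j])] = (chi2, p, p < alpha)
    verdict = "differ" if p < alpha else "no clear difference"
    print(f"{breeds[i]} vs {breeds[j]}: chi2 = {chi2:.2f}, p = {p:.3g} ({verdict})")
```

Under this reading, golden retrievers differ from both other breeds, while the pugs vs St. Bernards comparison does not reach the adjusted threshold.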

Real-world uses

Chi-square in the wild

🌿 Fertilized vs unfertilized plant color
🏠 Car types at competing dealerships
💊 Drug side-effect rates by group
📹 Social media engagement by post type

Any time you’re counting things in categories — chi-square is your test.
