Lesson 11 — Statistics in the Real World

With great power comes… messy data.

The real-world considerations that make or break a statistical analysis.

Data takes work

The statistical test is often the easy part. Getting clean, representative data is where most time goes.

📋 Design: Plan the experiment carefully

📊 Collect: Gather data systematically

✏ Clean: Remove outliers, fix errors

🔎 Analyze: Finally, run the test!
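As a rough illustration of the cleaning step, here is a minimal sketch that flags possible outliers using the 1.5 × IQR rule of thumb. The rule, the function name `flag_outliers_iqr`, and the readings are assumptions chosen for illustration, not something the lesson prescribes. The sketch only flags values; whether to remove them is a judgment call that should be documented.

```python
import statistics


def flag_outliers_iqr(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR].

    Hypothetical helper: k=1.5 is a common rule of thumb, not a law.
    """
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Made-up readings with one suspicious entry.
readings = [10, 12, 11, 13, 12, 11, 95]
flagged = flag_outliers_iqr(readings)
print("Flagged for review:", flagged)
```

A flagged value may be a data-entry error or a genuine extreme observation, so it is worth checking the source before dropping it.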

Real data rarely meets all assumptions perfectly. That’s okay — if you’re transparent about it.

Acceptable: The distribution is slightly non-normal but close enough. Use Pearson’s correlation and note the limitation.

Needs attention: The data are very non-normal or bimodal. Use a non-parametric test instead.

📈 Small sample: May miss real effects (low power)

📉 Large sample: More reliable results (high power)

💬 Power calculation: Plan before collecting data to know how many observations you need

A low-powered study might find no effect even when a real one exists!
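To make the idea of a power calculation concrete, here is a minimal sketch for one specific case: comparing the means of two equal-sized groups with a two-sided test. It uses the normal approximation n ≈ 2·((z₁₋α/₂ + z_power) / d)², where d is the standardized effect size (Cohen's d). The function name and the default values of α = 0.05 and 80% power are assumptions, and an exact t-based calculation gives a slightly larger answer.

```python
import math
from statistics import NormalDist


def sample_size_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate observations needed per group (normal approximation).

    Hypothetical helper for a two-sided, two-sample comparison of means
    with equal group sizes; effect_size is Cohen's d.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # critical value of the test
    z_power = standard_normal.inv_cdf(power)          # quantile for the target power
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


n_medium = sample_size_per_group(0.5)  # a "medium" effect
n_small = sample_size_per_group(0.2)   # a "small" effect
print(n_medium, n_small)
```

Smaller effects need far more observations to detect, which is why this planning belongs before data collection rather than after.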

When your data violates normality, use these distribution-free alternatives:
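One commonly used distribution-free alternative to Pearson's correlation is Spearman's rank correlation, which is Pearson's correlation computed on the ranks of the data. The sketch below is a minimal standard-library version with made-up data. The helper names are assumptions, and in practice a statistics library would normally be used instead.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up data: y always increases with x, but one value is extreme.
x = [1, 2, 3, 4, 5, 6]
y = [1, 4, 9, 16, 25, 1000]
r = pearson(x, y)
rho = spearman(x, y)
print(f"Pearson r = {r:.2f}, Spearman rho = {rho:.2f}")
```

Because ranks ignore how far apart the values are, the single extreme point drags Pearson's r down while Spearman's rho still reflects the perfectly monotonic relationship.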

A small p-value means the result would be unlikely if there were no real effect. It does not mean the effect is large or meaningful.

p < 0.001: Statistically significant, unlikely to be due to chance alone

d = 0.02: Effect size, tiny practical impact (e.g. a drug reduces blood sugar by 1 mg/dL)

Always report effect size alongside p-values for real-world decisions.
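The sketch below shows how a tiny effect can still produce a very small p-value once the sample is large enough. It uses a two-sample z-test on summary statistics. The standard deviation of 50 mg/dL is an assumption, chosen only so that a 1 mg/dL difference corresponds to d = 0.02. The group sizes and the function name are also illustrative.

```python
import math
from statistics import NormalDist


def effect_and_p_value(mean_diff, sd, n_per_group):
    """Return (Cohen's d, two-sided p-value) for two equal-sized groups.

    Hypothetical helper: assumes a known common standard deviation,
    so a z-test (normal approximation) is used.
    """
    d = mean_diff / sd                  # standardized effect size
    z = d * math.sqrt(n_per_group / 2)  # z statistic for equal group sizes
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return d, p


# Same 1 mg/dL difference, assumed SD of 50 mg/dL, two different sample sizes.
d_big, p_big = effect_and_p_value(1.0, 50.0, 100_000)
d_small, p_small = effect_and_p_value(1.0, 50.0, 100)
print(f"n=100000 per group: d={d_big:.2f}, p={p_big:.2g}")
print(f"n=100 per group:    d={d_small:.2f}, p={p_small:.2g}")
```

The effect size is identical in both runs. Only the p-value changes with sample size, which is why the two numbers answer different questions.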

The final word

“All models are wrong, but some are useful.”

— George Box, British statistician

One result doesn’t establish a pattern. Consider context, replication, and limitations — then unlock the hidden patterns behind everything!