Lesson 4 — Variance & Distributions

Seeing (data) is believing (data)

Why the mean alone can fool you — and how spread, shape, and histograms tell the full story.

The mean can fool you

A concert seller advertises an average ticket price of $20. But best seats are $100, worst are $5 — the mean hides the spread!

Mean alone: $20 (sounds affordable)
With spread: $5 to $100 (very different picture)

Variance

How much do data points differ from the mean? Variance measures the average squared distance from the mean.

1. Calculate the mean
2. Subtract the mean from each point & square it
3. Add them up & divide by n−1

Example: heights [2, 4, 6, 8, 10] → mean = 6 → squared differences sum to 40 → variance = 40 / 4 = 10
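
The three steps above can be sketched in a few lines of Python. This is a minimal illustration using the example heights; the variable names are our own, and the final check against the standard library's `statistics.variance` is only there to confirm the hand calculation.

```python
import statistics

heights = [2, 4, 6, 8, 10]

# Step 1: calculate the mean
mean = sum(heights) / len(heights)

# Step 2: subtract the mean from each point and square the result
squared_diffs = [(x - mean) ** 2 for x in heights]

# Step 3: add them up and divide by n - 1 (the sample variance)
variance = sum(squared_diffs) / (len(heights) - 1)

print(mean)      # 6.0
print(variance)  # 10.0

# The standard library's sample variance gives the same answer
assert variance == statistics.variance(heights)
```

Dividing by n−1 rather than n is the usual choice when the data are a sample from a larger population.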

Standard deviation

The square root of variance. Easier to interpret: the “typical distance” a value sits from the mean.

From our example: √10 ≈ 3.16 (SD tells us the typical spread)
Golden retrievers: mean 24″, SD 2″ (most dogs: 22″–26″)

Rhonda at 30″ is 3 SD above the mean. If heights are normally distributed, only about 0.1% of dogs are that tall!
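
A short Python sketch of both examples: the square root of the variance, and how many SDs Rhonda sits above the mean. The tail percentage assumes dog heights follow a normal distribution, which is our assumption for illustration; the variable names are hypothetical.

```python
import math
from statistics import NormalDist

# Standard deviation is the square root of the variance
variance = 10
sd = math.sqrt(variance)
print(round(sd, 2))  # 3.16

# Golden retriever example: mean 24 inches, SD 2 inches
mean_height, sd_height = 24, 2
rhonda = 30

# z-score: how many SDs a value sits from the mean
z = (rhonda - mean_height) / sd_height
print(z)  # 3.0

# Share of dogs at least as tall as Rhonda, IF heights are normal (assumption)
tail = 1 - NormalDist(mean_height, sd_height).cdf(rhonda)
print(f"{tail:.2%}")  # 0.13%
```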

The normal distribution

Many real-world variables follow a bell-shaped curve. Most values cluster near the mean; fewer appear at the extremes.

68%: within 1 SD of the mean
95%: within 2 SD of the mean
99.7%: within 3 SD of the mean

Height, test scores, plant growth — all tend to follow a normal distribution.
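
The 68 / 95 / 99.7 percentages can be checked with Python's built-in `statistics.NormalDist`. This is a small sketch; `share_within` is a helper name we made up for the illustration.

```python
from statistics import NormalDist

# A normal distribution with mean 0 and SD 1
standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Fraction of a normal distribution within k SDs of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, f"{share_within(k):.1%}")
# 1 68.3%
# 2 95.4%
# 3 99.7%
```

The same percentages hold for any mean and SD, because the rule is stated in units of SD.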

Not all distributions are normal

Data can be skewed or have multiple peaks. Shape matters for choosing the right test.

Symmetric: classic bell curve, Mean = Median
Right-skewed: long tail to the right, Mean > Median
Left-skewed: long tail to the left, Mean < Median
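
To see how a long tail pulls the mean away from the median, here is a small Python sketch. The three samples are hypothetical, made up purely for illustration.

```python
import statistics

# Hypothetical samples, invented for this illustration
symmetric = [2, 4, 6, 8, 10]
right_skewed = [1, 2, 2, 3, 3, 4, 20]      # one large value stretches the right tail
left_skewed = [1, 17, 18, 18, 19, 19, 20]  # one small value stretches the left tail

samples = [
    ("symmetric", symmetric),        # mean 6, median 6
    ("right-skewed", right_skewed),  # mean 5, median 3
    ("left-skewed", left_skewed),    # mean 16, median 18
]

for name, data in samples:
    print(name, statistics.mean(data), statistics.median(data))
```

The median barely moves when one extreme value is added, while the mean follows the tail.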

Visualization tool

Histograms

Bars show how many data points fall in each range. Histograms reveal the shape of your data — symmetric, skewed, bimodal — at a glance.

The McDonald’s Times Square example: Thursday traffic has two peaks (lunch + dinner), Saturday’s is closer to normal. Same restaurant, very different shapes!
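
A histogram is just counting values per range. The sketch below draws a text histogram in plain Python. The arrival hours are hypothetical numbers chosen to show two peaks, not real McDonald's data, and the 2-hour bin width is an arbitrary choice.

```python
from collections import Counter

# Hypothetical customer arrival hours (24-hour clock), invented for illustration
arrival_hours = [11, 12, 12, 12, 13, 13, 14, 16, 17, 18, 18, 18, 19, 19, 20]

bin_width = 2  # each bar covers a 2-hour range

# Map each value to the start of its bin, then count values per bin
counts = Counter((hour // bin_width) * bin_width for hour in arrival_hours)

# Print one bar per bin, including empty bins between the lowest and highest
for start in range(min(counts), max(counts) + bin_width, bin_width):
    bar = "#" * counts[start]
    print(f"{start:02d}-{start + bin_width:02d} | {bar}")
# 10-12 | #
# 12-14 | #####
# 14-16 | #
# 16-18 | ##
# 18-20 | #####
# 20-22 | #
```

The two tall bars (lunch and dinner) are the bimodal shape described above. Changing `bin_width` changes how clearly the peaks show.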
