Lesson 4 — Variance & Distributions

Seeing (data) is
believing (data)

Why the mean alone can fool you — and how spread, shape, and histograms tell the full story.

The mean can fool you

A concert seller advertises an average ticket price of $20. But the best seats cost $100 and the worst cost $5: the mean hides the spread!

Mean alone

$20

Sounds affordable

With spread

$5 — $100

Very different picture

Variance

How much do data points differ from the mean? Variance measures the average squared distance from the mean. For a sample, the sum of squared distances is divided by n−1 rather than n.

1. Calculate the mean

2. Subtract the mean from each point and square the result

3. Add the squares up and divide by n−1

Example: heights [2,4,6,8,10] → mean=6 → variance = 10
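
The three steps can be sketched in Python to check the worked example. This is a minimal illustration using only the standard library; the variable names are chosen here and are not part of the lesson.

```python
from statistics import mean, variance

heights = [2, 4, 6, 8, 10]

# Step 1: calculate the mean.
m = mean(heights)

# Step 2: subtract the mean from each point and square the result.
squared_deviations = [(x - m) ** 2 for x in heights]

# Step 3: add the squares up and divide by n - 1 (sample variance).
sample_variance = sum(squared_deviations) / (len(heights) - 1)

print(m)                   # 6
print(squared_deviations)  # [16, 4, 0, 4, 16]
print(sample_variance)     # 10.0

# Cross-check: statistics.variance also uses the n - 1 denominator.
assert sample_variance == variance(heights)
```

Dividing by n instead of n−1 would give 8 for the same data, which is the population variance (statistics.pvariance).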

Standard deviation

The square root of variance. Easier to interpret: the “typical distance” a value sits from the mean.

From our example

√10 ≈ 3.16
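
As a quick check of this number, here is a short Python sketch that takes the square root of the variance from the example and compares it with the standard library's sample standard deviation. The variable names are illustrative.

```python
from math import sqrt
from statistics import stdev

heights = [2, 4, 6, 8, 10]
sample_variance = 10  # value from the worked variance example

# Standard deviation is the square root of the variance.
sd = sqrt(sample_variance)
print(round(sd, 2))  # 3.16

# statistics.stdev computes the sample SD (n - 1 denominator) directly.
assert abs(sd - stdev(heights)) < 1e-9
```

Because the SD is in the same units as the data, 3.16 can be read directly as a distance from the mean of 6.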

SD tells us typical spread

Golden retrievers

Mean 24″, SD 2″

Most dogs (within 1 SD of the mean): 22″–26″

Rhonda at 30″ is 3 SD above the mean. If heights are roughly normal, only about 0.1% of dogs are that tall or taller!
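
The Rhonda figure can be reproduced with a short Python sketch. It assumes golden retriever heights follow a normal distribution with the mean and SD given above, which is a modelling assumption and not something the data on this page confirms.

```python
from statistics import NormalDist

mean_height = 24.0  # inches, from the example
sd_height = 2.0     # inches, from the example
rhonda = 30.0       # inches

# How many standard deviations above the mean is Rhonda?
z = (rhonda - mean_height) / sd_height
print(z)  # 3.0

# Assumption: heights are normally distributed.
heights = NormalDist(mu=mean_height, sigma=sd_height)

# Share of dogs at least as tall as Rhonda (upper tail).
share_taller = 1 - heights.cdf(rhonda)
print(f"{share_taller:.2%}")  # 0.13%
```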

The normal distribution

Many real-world variables follow a bell-shaped curve. Most values cluster near the mean; fewer appear at the extremes.

68%: within 1 SD of the mean

95%: within 2 SD of the mean

99.7%: within 3 SD of the mean
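
These three percentages are rounded values that can be recovered from the normal curve itself. The sketch below uses Python's statistics.NormalDist; the helper name share_within is made up for this illustration.

```python
from statistics import NormalDist

# The standard normal curve: mean 0, SD 1.
standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Share of a normal distribution lying within k SD of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(k):.1%}")
# within 1 SD: 68.3%
# within 2 SD: 95.4%
# within 3 SD: 99.7%
```

The same shares hold for any normal distribution, whatever its mean and SD, because the rule is stated in units of SD.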

Height, test scores, plant growth — all tend to follow a normal distribution.

Not all distributions are normal

Data can be skewed or have multiple peaks. Shape matters for choosing the right test.

◕ Symmetric: classic bell curve
Mean = Median

◢ Right-skewed: long tail to the right
Mean > Median

◣ Left-skewed: long tail to the left
Mean < Median
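
The link between skew and the mean/median gap can be seen with three tiny made-up data sets. The numbers below are invented purely for illustration and come from no real study.

```python
from statistics import mean, median

# Hypothetical data sets, chosen so the tails are easy to see.
datasets = {
    "symmetric": [1, 2, 3, 4, 5],
    "right-skewed": [1, 2, 3, 4, 20],    # one large value pulls the mean up
    "left-skewed": [1, 17, 18, 19, 20],  # one small value pulls the mean down
}

summary = {name: (mean(data), median(data)) for name, data in datasets.items()}

for name, (m, med) in summary.items():
    print(name, m, med)
# symmetric 3 3
# right-skewed 6 3
# left-skewed 15 18
```

The mean moves toward the long tail while the median stays near the bulk of the data, which is why the two disagree in skewed distributions.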

Visualization tool

Histograms

Bars show how many data points fall in each range. Histograms reveal the shape of your data — symmetric, skewed, bimodal — at a glance.
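
To make the idea of counting points per range concrete, here is a minimal text histogram in Python. The function text_histogram and the visit hours are hypothetical, invented for this sketch; it assumes integer values and equal-width bins.

```python
from collections import Counter

def text_histogram(values, bin_width):
    """Count integer values per bin and draw one row of '#' per bin."""
    # Map each value to the lower edge of its bin, then count per bin.
    counts = Counter((v // bin_width) * bin_width for v in values)
    lines = []
    for start in range(min(counts), max(counts) + bin_width, bin_width):
        end = start + bin_width - 1
        # Empty bins count as 0, so gaps in the data stay visible.
        lines.append(f"{start:>2}-{end:<2} | {'#' * counts[start]}")
    return counts, lines

# Hypothetical hours of the day at which customers arrived (made-up data).
visit_hours = [11, 12, 12, 12, 13, 13, 14, 17, 18, 18, 18, 19, 19]

counts, lines = text_histogram(visit_hours, bin_width=2)
print("\n".join(lines))
```

With this made-up data the tallest rows are the 12-13 and 18-19 bins, so the printout shows two peaks: a bimodal shape.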

The McDonald’s Times Square example: Thursday traffic has two peaks (lunch and dinner), while Saturday’s is closer to normal. Same restaurant, very different shapes!