Key Takeaways
- Standard deviation measures how spread out data is from the mean
- A low standard deviation means data points are close to the mean
- A high standard deviation means data points are spread out
- Use sample SD (N-1) when analyzing a subset of a larger population
- Use population SD (N) when you have data for the entire population
What Is Standard Deviation?
Standard deviation is a statistical measure that quantifies the amount of variation or dispersion in a set of data values. A low standard deviation indicates that data points tend to be close to the mean, while a high standard deviation indicates that data points are spread out over a wider range of values.
Standard deviation is one of the most commonly used measures of spread in statistics, used in everything from quality control in manufacturing to risk assessment in finance.
The Standard Deviation Formula
Population SD: sigma = sqrt(sum((x - mu)^2) / N)
Sample SD: s = sqrt(sum((x - x-bar)^2) / (N-1))
How to Calculate Standard Deviation (Step-by-Step)
Calculate the Mean
Add all data values together and divide by the count. For example: (5+10+15+20+25)/5 = 15
Find Each Deviation
Subtract the mean from each data point: 5-15=-10, 10-15=-5, 15-15=0, 20-15=5, 25-15=10
Square Each Deviation
Square each deviation: 100, 25, 0, 25, 100
Calculate Variance
Find the average of squared deviations: (100+25+0+25+100)/5 = 50 (population variance)
Take the Square Root
Standard deviation = sqrt(50) = 7.07
Population vs Sample Standard Deviation
The key difference between population and sample standard deviation is in the denominator of the variance calculation:
- Population SD: Divide by N (total count). Use when you have data for the entire population.
- Sample SD: Divide by N-1 (Bessel's correction). Use when analyzing a sample from a larger population. This correction accounts for the bias in estimating population variance from a sample.
Interpreting Standard Deviation
For data that follows a normal distribution (bell curve), the standard deviation helps predict where values fall:
- 68% of values fall within 1 standard deviation of the mean
- 95% of values fall within 2 standard deviations of the mean
- 99.7% of values fall within 3 standard deviations of the mean
Frequently Asked Questions
Use population standard deviation when you have data for the entire group you're interested in (e.g., test scores of all students in a class). Use sample standard deviation when you're analyzing a subset to make inferences about a larger population (e.g., surveying 100 customers to understand all customers).
A standard deviation of 0 means all values in your dataset are identical. There is no variation or spread in the data - every single data point equals the mean.
No, standard deviation cannot be negative. Since we square the deviations (which makes them all positive) and then take the square root, the result is always zero or positive. The minimum value is 0 (when all values are the same).
Variance is the square of standard deviation, or conversely, standard deviation is the square root of variance. Variance = SD^2, and SD = sqrt(Variance). Variance is useful for mathematical calculations, while standard deviation is easier to interpret because it's in the same units as the original data.