I'm currently in a probability class and am having a hard time wrapping my head around the standard deviation. In many ways it feels "made up" to me; let me explain:

$$\sqrt{a^2 + b^2} \neq a + b$$

So unless it's used relative to other standard deviations, I feel like it wouldn't inherently mean anything. By that I mean, measurements like meters/inches have some grounding in "reality." Knowing something has a length of 3 m is useful information even if I don't have any other lengths in meters to compare it to. But when I see that a distribution has a SD of 2, that information only translates to my understanding of how "spread" the distribution is when I compare it to other distributions with different SDs. I think, "Oh distribution 1 is pretty spread out and it has an SD of ___ so this other SD must mean the corresponding distribution 2 is less/more spread out." Is this just how it is? Are some units of measurement just less intuitive and derive meaning from comparison?

To me, it would make more sense to take the mean of the absolute values of all the deviations from the mean, but my professor told me that SD just happens to be more useful. I don't doubt this, and I am aware of the many properties that make SD more useful, but my question is why? What causes SD to have those properties?

For example, what causes most of the points in a normal distribution to land within 2 SD of the mean? It can't be just a coincidence or happenstance; there must be a reason.

Any help would be appreciated. Thanks in advance!

  • Are you familiar with the Central Limit Theorem? – aschepler Apr 05 '24 at 22:20
  • Probably the most important property is that the variance of the sum of some independent random variables is the sum of the variances. By the way, standard deviation can have units just like anything else. It has the same units as the data do. – Ian Apr 05 '24 at 22:23
  • The Standard Deviation is the expected Euclidean distance to the mean. It is more useful because we are more used to, and familiar with, Euclidean distances than city-block distances (i.e., sums of absolute values). – William M. Apr 05 '24 at 22:27
  • If you want something additive (in the uncorrelated case) go with variance, but it's not of the same dimension as the mean, so the standard deviation, which is, lets us quantify how far from the mean we are. You may also be interested in answers to this question. I wrote one, which interprets SD as a length, making variance a squared length. – J.G. Apr 05 '24 at 22:44
  • My impression is that standard deviation is preferred to mean absolute deviation (as you describe), for instance, exclusively because of its nice mathematical properties. Mean absolute deviation will also measure how "spread out" a distribution is, and most of the points in a normal distribution will fall within a certain number of mean absolute deviations of the mean. I've asked a few statisticians, but none of them has been able to help me get a good intuitive grasp on the difference between standard deviation and mean absolute deviation in terms of their real-world meanings. – TomKern Apr 05 '24 at 22:48
  • As for the 68-95-99.7 rule for Normal distributions, you could get something analogous for them using the absolute deviation, to give your proposed alternative its technical name. (Unfortunately, neither it nor its square is additive for uncorrelated variables.) But the SD's properties aren't just more useful, they're also more tractable. – J.G. Apr 05 '24 at 22:49
  • What exactly does $\sqrt{a^2 + b^2} \neq a + b$ have to do with any of this and why does it make standard deviation seem "made up" to you? It sounds like you are only seeing bits and pieces of what standard deviation is about, kind of like looking at the pieces of a puzzle without having the picture on the box. – David K Apr 05 '24 at 23:21
  • @WilliamM. The standard deviation is a sort of analogue of the Euclidean distance (between a random variable and the identically zero random variable). But it's not the expected Euclidean distance to the mean. In 1D that would be the mean absolute deviation again. – Ian Apr 06 '24 at 14:02
  • @Ian The std deviation of a random variable is $\sqrt{E((X-\mu_X)^2)}$ which is therefore the square root of the expected Euclidean distance squared. When you take a sample of $X,$ then the sample standard deviation is $1/\sqrt{n}$ times the Euclidean distance to the sample mean. – William M. Apr 07 '24 at 18:12
  • @WilliamM. You changed it from what I was responding to before. It's not the expected Euclidean distance; it's the square root of the expected squared Euclidean distance. That seemingly weird order of operations (square inside the expectation, take the square root outside) is in some sense the whole point of the question. – Ian Apr 07 '24 at 18:55
  • @Ian I thought the point of the question was understanding why SD is used. The answer is because it mimics Euclidean distance, and it is, up to a constant, the average of such distances. – William M. Apr 08 '24 at 15:04
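
The additivity points raised in the comments can be checked with a small Python sketch. This is an illustration added here, not code from any commenter. It assumes two independent fair coin flips taking the values -1 and +1 as the example, and the helper names (`variance`, `mean_abs_dev`, `sum_of_independent`) are made up for this purpose. It suggests why SDs combine as $\sqrt{a^2 + b^2}$ rather than $a + b$: the variance adds for independent variables, whereas the mean absolute deviation and its square do not in this example. The last lines use `statistics.NormalDist` from the standard library for the "within 2 SD" figure, together with the standard fact that a normal distribution's mean absolute deviation equals $\sigma\sqrt{2/\pi}$.

```python
import math
from itertools import product
from statistics import NormalDist


def mean(dist):
    """Expected value of a finite distribution given as {value: probability}."""
    return sum(v * p for v, p in dist.items())


def variance(dist):
    """Expected squared deviation from the mean."""
    m = mean(dist)
    return sum((v - m) ** 2 * p for v, p in dist.items())


def mean_abs_dev(dist):
    """Expected absolute deviation from the mean."""
    m = mean(dist)
    return sum(abs(v - m) * p for v, p in dist.items())


def sum_of_independent(dx, dy):
    """Distribution of X + Y when X and Y are independent."""
    out = {}
    for (x, px), (y, py) in product(dx.items(), dy.items()):
        out[x + y] = out.get(x + y, 0.0) + px * py
    return out


# Assumed example: a fair coin flip scored as -1 or +1.
coin = {-1: 0.5, 1: 0.5}
total = sum_of_independent(coin, coin)  # values -2, 0, 2

var_coin, var_total = variance(coin), variance(total)
mad_coin, mad_total = mean_abs_dev(coin), mean_abs_dev(total)

# Variance adds: 1 + 1 = 2, so the SD of the sum is sqrt(1^2 + 1^2), not 1 + 1.
print("variance:", var_coin, "+", var_coin, "vs", var_total)
print("SD of sum:", math.sqrt(var_total))

# Mean absolute deviation: 1 for each flip, but also 1 for the sum,
# so neither MAD nor MAD^2 adds in this example.
print("MAD:", mad_coin, "+", mad_coin, "vs", mad_total)

# Normal distribution: probability of landing within 2 SD of the mean.
std_normal = NormalDist()  # mean 0, SD 1
within_2sd = std_normal.cdf(2) - std_normal.cdf(-2)
print("P(|Z| <= 2):", round(within_2sd, 4))  # about 0.9545

# For a normal distribution MAD = SD * sqrt(2/pi), so "2 SD" is the same
# interval as roughly 2.5 mean absolute deviations.
mad_per_sd = math.sqrt(2 / math.pi)
print("2 SD in units of MAD:", round(2 / mad_per_sd, 3))
```

For a normal distribution, then, an analogous rule stated in mean absolute deviations describes exactly the same intervals with rescaled multipliers. What the mean absolute deviation lacks is the simple addition rule for independent sums.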
