
First, this is not homework; it is actually for work. It's been a couple of years since I've done stats, and I need some help! I've googled this problem but was unable to find any resources that answer my question.

I have 25 values:

11.5
11.6
11.9
12.2
12.4
12.4
12.5
12.5
12.5
12.8
12.8
12.9
13.1
13.3
13.5
13.5
13.7
13.7
13.8
13.9
13.9
14
14.3
14.5
15

From here, I calculate the mean and from that, the variance and then the standard deviation:

The variance formula and my variance calculations:

$$ \sigma^{2} =\frac{\sum_{i=1}^{n}(x_{i}-\mu )^{2}}{n}=\frac{\sum_{i=1}^{25}(x_{i}-13.128)^{2}}{25}=0.7996159999999999 $$

Of course, standard deviation is simply the square root of variance:

$$ \sigma =\sqrt{0.7996159999999999}=0.8942125027083886 $$

Here's where I feel like I'm messing up:

One standard deviation less than the mean:

$$ -\sigma + \mu = -0.8942125027083886 + 13.128 = 12.2337874972916114 $$

Two standard deviations less than the mean:

$$ -2\sigma + \mu = -2 \cdot 0.8942125027083886 + 13.128 = 11.3395749945832228 $$

This value, 11.3395749945832228, falls below the smallest value in the array, 11.5.
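For anyone who wants to check the arithmetic, here is a minimal Python sketch of the calculation above. It assumes the population formulas (dividing by $n$, as in the variance formula used here); the variable names are only illustrative.

```python
import math

# The 25 values from the question.
values = [
    11.5, 11.6, 11.9, 12.2, 12.4, 12.4, 12.5, 12.5, 12.5, 12.8,
    12.8, 12.9, 13.1, 13.3, 13.5, 13.5, 13.7, 13.7, 13.8, 13.9,
    13.9, 14.0, 14.3, 14.5, 15.0,
]

n = len(values)
mean = sum(values) / n

# Population variance: divide by n, matching the formula in the question.
variance = sum((x - mean) ** 2 for x in values) / n
sigma = math.sqrt(variance)

# The point two standard deviations below the mean.
lower_two_sigma = mean - 2 * sigma

# Data points that fall below that cutoff (there are none for this data set).
below = [x for x in values if x < lower_two_sigma]

print(mean, variance, sigma)
print(lower_two_sigma, min(values), below)
```

The cutoff is a point on the number line, not a data value, so nothing forces it to sit inside the observed range.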


How is this possible? Where am I messing up my calculations? Thank you for any and all help! I really appreciate it.

  • Your result seems entirely correct. It only shows that the random variable you are testing could occasionally fall below the value you mention. You can even calculate the probability of this happening as a definite integral of the Gaussian curve with mean $\mu$ and standard deviation $\sigma$ – Marc Bogaerts Sep 09 '14 at 15:32

1 Answer


EDIT: if there is some expensive decision riding on this, consider trying to get funding for an hour of consulting from a graduate student or professor in statistics there. These issues are always about interpretation, and need the hand of a master. The last time I saw something like this, it was a medical doctor doing a study on lung cancer or the like, but she was in an academic department, and her university statistics department had a very clear setup and fees for consulting to other university departments, all on their website.

ORIGINAL: Let there be four data points, $$ -1,-1,1,1. $$ The mean is $0.$ The sum of squares (after subtracting the mean $0$) is $4,$ and the number of data points is $4,$ so the variance and the standard deviation are both $1.$ Going two standard deviations from the mean, to $\pm 2,$ overshoots every data point.

The only guaranteed thing is Chebyshev's inequality, which can be used either by assuming a reasonable governing probability distribution or by taking the set of data points as defining the distribution, which is what you are doing.
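As a rough illustration of the second reading (the data set itself taken as the distribution), the sketch below compares the fraction of points at least $k$ standard deviations from the mean with Chebyshev's bound $1/k^2.$ The helper name `fraction_outside` is made up for this example, and the population standard deviation (dividing by $n$) is assumed.

```python
import math

def fraction_outside(data, k):
    """Fraction of points at least k population standard deviations from the mean."""
    n = len(data)
    mean = sum(data) / n
    sigma = math.sqrt(sum((x - mean) ** 2 for x in data) / n)
    far = [x for x in data if abs(x - mean) >= k * sigma]
    return len(far) / n

# The 25 values from the question.
values = [
    11.5, 11.6, 11.9, 12.2, 12.4, 12.4, 12.5, 12.5, 12.5, 12.8,
    12.8, 12.9, 13.1, 13.3, 13.5, 13.5, 13.7, 13.7, 13.8, 13.9,
    13.9, 14.0, 14.3, 14.5, 15.0,
]

for k in (1.5, 2, 3):
    observed = fraction_outside(values, k)
    bound = 1 / k ** 2  # Chebyshev: at most 1/k^2 of the mass lies this far out
    print(k, observed, bound, observed <= bound)

# The four-point example -1, -1, 1, 1 attains the bound at k = 1:
# every point sits exactly one standard deviation from the mean.
print(fraction_outside([-1, -1, 1, 1], 1))
```

For the question's data only the largest value, 15, is at least two standard deviations from the mean, which is well within the 25% that the bound allows.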

Very similar to my first example: take some large number, say $100,$ place one data point at $0,$ then place $100$ data points at $1$ and $100$ data points at $-1.$ The mean is $0$ and the standard deviation comes out just under $1,$ so all but one data point lie outside a single standard deviation, while everything lies inside $1.0025$ standard deviations. Indeed, $\sigma^2 = 200/201 \approx 0.995, \; \; \sigma \approx 0.9975,$ and the reciprocal $1/\sigma \approx 1.002496883$ is the number of standard deviations needed to reach exactly $1,$ so everything is inside $1.0025$ standard deviations.
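The 201-point example can be checked numerically with a short sketch like the following, again assuming the population standard deviation; the variable names are illustrative only.

```python
import math

# One point at 0, one hundred at 1, one hundred at -1.
data = [0] + [1] * 100 + [-1] * 100

n = len(data)
mean = sum(data) / n
sigma = math.sqrt(sum((x - mean) ** 2 for x in data) / n)  # sqrt(200/201)

# Points within one standard deviation of the mean: only the single 0.
inside_one_sigma = [x for x in data if abs(x - mean) <= sigma]

# Largest distance from the mean, measured in standard deviations.
max_deviations = max(abs(x - mean) for x in data) / sigma

print(n, mean, sigma)
print(len(inside_one_sigma), max_deviations)
```

So 200 of the 201 points lie outside one standard deviation, yet none lies farther than about 1.0025 standard deviations from the mean.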

Will Jagy