
I found this link about incremental standard deviation, which recomputes the standard deviation each time a new element is added to the data set.

Is there a similar method when adding a new value and removing the first value from the dataset?

For example,

[45, 26, 78, 45, 34, 56] - initial data set 

[26, 78, 45, 34, 56, 74 *(new data)*] - first value 45 was removed
Krissy

1 Answer


The formula at the link is a special case of the sample variance decomposition formula given in O'Neill (2014) (Result 1). Your problem involves applying this formula twice.

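Write $\bar{x}_{a:b}$ and $s_{a:b}^2$ for the sample mean and sample variance of the values $x_a, \dots, x_b$. Your goal is to write $s_{2:n+1}^2$ in terms of $s_{1:n}^2$ and any other necessary sample quantities. To do this we can use the sample variance decomposition twice, each time adding one point to the shared values $x_2, \dots, x_n$: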

$$s_{2:n+1}^2 = \frac{n-2}{n-1} s_{2:n}^2 + \frac{1}{n} (\bar{x}_{2:n} - x_{n+1})^2,$$

$$s_{1:n}^2 = \frac{n-2}{n-1} s_{2:n}^2 + \frac{1}{n} (\bar{x}_{2:n} - x_{1})^2.$$
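As a quick sanity check, the first decomposition can be tried on the numbers from the question. This is only a sketch: the helper name `add_one` is made up for illustration, and the retained values $x_2, \dots, x_n$ are taken to be `[26, 78, 45, 34, 56]` with $n = 6$.

```python
from statistics import mean, variance

def add_one(var_prev, mean_prev, m, x_new):
    """Sample variance of m + 1 points, given the sample variance and
    mean of the first m points (m >= 2). Hypothetical helper name."""
    n = m + 1
    return (n - 2) / (n - 1) * var_prev + (x_new - mean_prev) ** 2 / n

core = [26, 78, 45, 34, 56]   # x_2, ..., x_n from the question (n = 6)
updated = add_one(variance(core), mean(core), len(core), 74)
direct = variance(core + [74])  # recomputed from scratch for comparison
print(updated, direct)          # the two values should agree up to rounding
```

The same call with `45` in place of `74` should reproduce the variance of the original data set, which is the second decomposition.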

Hence, we have:

$$\begin{equation} \begin{aligned} s_{2:n+1}^2 &= \frac{n-2}{n-1} s_{2:n}^2 + \frac{1}{n} (\bar{x}_{2:n} - x_{n+1})^2 \\[6pt] &= s_{1:n}^2 - \frac{1}{n} (\bar{x}_{2:n} - x_{1})^2 + \frac{1}{n} (\bar{x}_{2:n} - x_{n+1})^2 \\[6pt] &= s_{1:n}^2 - \frac{1}{n} \Big[ (\bar{x}_{2:n} - x_{1})^2 - (\bar{x}_{2:n} - x_{n+1})^2 \Big] \\[6pt] &= s_{1:n}^2 - \frac{1}{n} \Big[ (\bar{x}_{2:n}^2 -2 \bar{x}_{2:n} x_{1} + x_{1}^2) - (\bar{x}_{2:n}^2 -2 \bar{x}_{2:n} x_{n+1} + x_{n+1}^2) \Big] \\[6pt] &= s_{1:n}^2 - \frac{1}{n} \Big[ -2 \bar{x}_{2:n} x_{1} + x_{1}^2 + 2 \bar{x}_{2:n} x_{n+1} - x_{n+1}^2 \Big] \\[6pt] &= s_{1:n}^2 - \frac{1}{n} \Big[ 2 \bar{x}_{2:n} (x_{n+1} - x_{1}) + (x_{1}^2 - x_{n+1}^2) \Big]. \\[6pt] \end{aligned} \end{equation}$$
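The last line of the derivation can be turned into a small update routine. The sketch below is one possible way to do it, not a reference implementation: the function name `slide_variance` and its argument names are hypothetical, and it assumes you keep the current mean alongside the variance so that $\bar{x}_{2:n}$ can be recovered from the identity $n \bar{x}_{1:n} = x_1 + (n-1) \bar{x}_{2:n}$.

```python
from statistics import mean, variance

def slide_variance(var_old, mean_old, n, x_out, x_in):
    """Return (variance, mean) of an n-point window after dropping x_out
    (the oldest value) and appending x_in. Requires n >= 2.
    Hypothetical helper, named here for illustration only."""
    # Mean of the n - 1 values that stay in the window.
    mean_core = (n * mean_old - x_out) / (n - 1)
    # Bracketed term from the final line of the derivation.
    bracket = 2 * mean_core * (x_in - x_out) + (x_out ** 2 - x_in ** 2)
    var_new = var_old - bracket / n
    mean_new = mean_old + (x_in - x_out) / n
    return var_new, mean_new

old = [45, 26, 78, 45, 34, 56]   # initial data set from the question
new = old[1:] + [74]             # 45 removed, 74 added
var_new, mean_new = slide_variance(variance(old), mean(old), len(old), old[0], 74)
print(var_new ** 0.5, variance(new) ** 0.5)  # standard deviations should agree up to rounding
```

Because each step subtracts nearly equal quantities, rounding error may accumulate over a very long stream, so recomputing the variance from scratch every so often is a reasonable precaution.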

Ben