
For use in a Kalman filter I need measured data from my dynamic system. As this measurement is prone to errors (a measurement error with standard deviation $\sigma_\mathrm{M}$ as well as noise in the quantity to measure), I can repeat it quickly a few times and calculate the average and the variance. Although this looks very simple, it gets more complicated in a real-world environment because of those errors:
Assume I took three measurements and, by bad luck, all showed exactly the same value. The average would then be that value and the sample variance would be zero. I know this is wrong, because my measurement device is only as good as $\sigma_\mathrm{M}$, so the variance I feed to the Kalman filter cannot be smaller than $\sigma_\mathrm{M}^2$.
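A minimal numeric sketch of the problem (the sample values and $\sigma_\mathrm{M}$ are made up for illustration):

```python
import numpy as np

# Three repeated measurements that happen to show the same value.
samples = np.array([23.0, 23.0, 23.0])

mean = samples.mean()
var = samples.var(ddof=1)   # sample variance is exactly 0 here

# A variance of 0 is clearly too optimistic: the device itself has a
# known error sigma_M, so the uncertainty fed to the Kalman filter
# should arguably never fall below sigma_M**2.
sigma_M = 0.5               # assumed device error (standard deviation)
var_for_filter = max(var, sigma_M**2)

print(mean, var, var_for_filter)
```

Simply clipping the variance at $\sigma_\mathrm{M}^2$ like this is of course the crude fix, not the principled one the questions below ask for.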

Here I've got two questions now:

  1. Knowing my measurement system very well, that its error is $\sigma_\mathrm{M}$, and that this error follows a normal distribution, how do I calculate the average and variance from my samples?

  2. Assume my measurement system works perfectly (no measurement error itself), but its result is only coarsely quantised: the real value to be measured lies, say, in the range 20...40 (e.g. 23.456), but the displayed number is always rounded to the nearest integer (23 in this case). How do I calculate the average and variance from my samples in this case?

I reckon a Bayesian method would be the answer here, but I have trouble sorting it out myself.
What I also find astonishing is that such a basic and fundamental question for applied measurement seems to have no answer I could find on Google.

Chris

1 Answer


There is a recursive formula for the sample variance, which can be found here. You can initialize it to $\sigma^2_M$ if you already know that that is the correct value or close to it. As you collect more observations, the effect of the initial value will be attenuated.
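One common recursive scheme is Welford's online algorithm. A sketch of how it could be seeded with a prior, where the seeding scheme, the pseudo-observation count, and all numbers are my own illustration rather than taken from the linked page:

```python
class RunningVariance:
    """Welford's online mean/variance, optionally seeded with a prior."""

    def __init__(self, prior_mean=0.0, prior_var=0.0, prior_weight=0):
        # Treat the prior as if it came from `prior_weight` pseudo-observations.
        self.n = prior_weight
        self.mean = prior_mean
        self.M2 = prior_var * max(prior_weight - 1, 0)

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.M2 += delta * (x - self.mean)

    def variance(self):
        return self.M2 / (self.n - 1) if self.n > 1 else 0.0

# Seed with sigma_M^2 = 0.25 as if from 5 earlier observations (assumed values).
rv = RunningVariance(prior_mean=23.0, prior_var=0.25, prior_weight=5)
for x in [23.0, 23.0, 23.0]:
    rv.update(x)

print(rv.mean, rv.variance())
```

Three identical new samples now pull the estimate toward zero without ever reaching it; as more real observations arrive, the influence of the seeded prior fades, as described above.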

If there's no measurement error, I'm not sure why you need a variance computation, but if the displayed value is within 0.5 of the true value, you can use that as a crude estimate of the standard error (like $\sigma_M = 0.25$). The Kalman filter is all based on normality assumptions anyway, so in a real system you will always be working with an approximation.
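For what it's worth, a rounding error bounded by 0.5 corresponds to a uniform error on $[-0.5, 0.5]$, whose exact standard deviation is $1/\sqrt{12} \approx 0.289$, close to the crude 0.25 above. A quick simulation (range and sample size are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)

# True values uniform in 20..40; the display rounds to the nearest integer.
true_vals = rng.uniform(20, 40, size=100_000)
displayed = np.round(true_vals)

err = displayed - true_vals
print(err.std())          # empirically close to 1/sqrt(12) ~ 0.2887
print(1 / np.sqrt(12))
```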

sven svenson
  • What you are describing is a "hands-on" approach - not bad when you need a quick solution, but not necessarily correct. Mathematics, especially a Bayesian approach, should be able to do much better than, e.g., initializing an algorithm with a number that was never intended to be used there. – Chris Sep 11 '20 at 08:18
  • For question #2: due to the quantisation I have no information about where inside the interval the real value is located, so I must assume a uniform probability distribution between the lower and upper limit. In the integer case (rounding to the nearest whole number) the variance of a single measurement would thus be 1/12. With many measurements the resulting probability distribution will be a beta distribution, but due to the central limit theorem a normal distribution shouldn't be too far off. – Chris Sep 11 '20 at 08:30
  • To my mind, Bayesian approaches are fundamentally based on the idea of initializing with a number that is meant to be updated later. Furthermore, the derivation of the Kalman filtering equations assumes normality everywhere, so in a real application you will be making approximations no matter what. That said, I think it should be possible to derive a version where the variance is unknown and you also learn it (your state then follows a multivariate t-distribution), but in practice it does the same thing of updating a prior belief on the variance. – sven svenson Sep 11 '20 at 12:54