Consider two real-valued data signals $x(t)$ and $y(t)$, which are independent and uncorrelated. Let's say that each sample of $x(t)$ is drawn from a white-noise distribution with mean $\mu_x$ and standard deviation $\sigma_x$, and likewise each sample of $y(t)$ with mean $\mu_y$ and standard deviation $\sigma_y$.

I'm interested in the cross-correlation of $x$ and $y$ over some finite duration of time $T$, so let's define the zero-lag cross-correlation as, $$\rho_{xy} \equiv \frac{1}{T} \int_{-T/2}^{+T/2} x(t) \, y(t) \, dt$$

(1) I'm pretty sure that $\lim_{T\rightarrow \infty} \left[ \rho_{xy} \right] = \mu_x \, \mu_y$.
(2) My intuition is that if these are continuously defined and perfectly white noise, then $\rho_{xy} = \mu_x \, \mu_y$ exactly, even for finite $T$, because there is an arbitrary amount of structure at arbitrarily small time intervals. In other words, the standard deviation $\sigma_\rho = 0$. Is that correct?

I'm actually more interested in discretely sampled signals, $x_i \equiv x(t_i)$ [and the same for $y$]. Let's say there are $N$ samples in the interval $T$. In this case, define the (normalized) cross-correlation as, $$c_{xy} \equiv \frac{1}{N} \sum_{i=1}^{N} x_i \, y_i$$

(3) I'm pretty sure that $\lim_{N\rightarrow \infty} \left[ c_{xy} \right] = \mu_x \, \mu_y$. [This also seems to bolster the idea of (2).]
(4) It should be the case that for any finite $N$, the ensemble average $\langle c_{xy} \rangle = \mu_x \, \mu_y$, but any 'realization' of $c_{xy}$ will deviate from this average.

Now, the main question: for a finite number of samples, what is the standard deviation $\sigma_c$ of the discretely sampled cross-correlation? Because there are only a finite number of samples, the positive and negative correlations can't perfectly cancel out, so we should be left with some residual. How is that calculated?
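For concreteness, here is a small Monte Carlo sketch of what I mean (I'm assuming Gaussian white noise here, and the parameter values are arbitrary choices for illustration), estimating the spread of $c_{xy}$ over many realizations:

```python
import numpy as np

# Hypothetical parameters, chosen only for illustration
mu_x, sigma_x = 1.0, 0.5
mu_y, sigma_y = -2.0, 1.5
n_trials = 10000  # independent realizations per value of N

rng = np.random.default_rng(0)
for N in (10, 100, 1000):
    x = rng.normal(mu_x, sigma_x, size=(n_trials, N))
    y = rng.normal(mu_y, sigma_y, size=(n_trials, N))
    c = (x * y).mean(axis=1)  # normalized zero-lag cross-correlation per realization
    print(f"N={N:5d}  <c_xy>={c.mean():+.4f}  sigma_c={c.std():.4f}")
```

In runs like this the ensemble mean sits at $\mu_x \mu_y = -2$ while the spread shrinks roughly like $1/\sqrt{N}$; it's that $\sigma_c$ I'd like in closed form.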

[While my situation has the distributions for $x(t)$ and $y(t)$ described above, I would also be interested in a relatively straightforward generalization to arbitrary distributions $p_x, p_y$.]

Thanks!

1 Answer

Assuming everything is independent, with $X_i$ having expectation $\mu_X$ and variance $\sigma^2_X$, and $Y_i$ having expectation $\mu_Y$ and variance $\sigma^2_Y$: the product $X_iY_i$ does not have a normal distribution even if $X_i$ and $Y_i$ do,

but you can say the expectation of $X_iY_i$ is $\mu_X\mu_Y$ and its variance is $\mu^2_X \sigma^2_Y +\mu^2_Y \sigma^2_X + \sigma^2_X \sigma^2_Y$,

so the sample mean $\frac1n\sum X_iY_i$ has expectation $\mu_X\mu_Y$ and variance $\frac1n\left(\mu^2_X \sigma^2_Y +\mu^2_Y \sigma^2_X + \sigma^2_X \sigma^2_Y\right)$ with standard error of the sample mean the square root of that variance.
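As a quick sanity check, here is a short simulation sketch (Gaussian inputs, with parameter values that are arbitrary choices for illustration) comparing this closed-form standard error against the empirical spread of $\frac1n\sum X_iY_i$:

```python
import numpy as np

mu_x, sigma_x = 1.0, 0.5   # arbitrary illustrative values
mu_y, sigma_y = -2.0, 1.5
n, n_trials = 1000, 10000

rng = np.random.default_rng(1)
x = rng.normal(mu_x, sigma_x, size=(n_trials, n))
y = rng.normal(mu_y, sigma_y, size=(n_trials, n))
sample_means = (x * y).mean(axis=1)

# Closed-form variance of the sample mean, as given above
var_mean = (mu_x**2 * sigma_y**2 + mu_y**2 * sigma_x**2
            + sigma_x**2 * sigma_y**2) / n

print("predicted standard error:", np.sqrt(var_mean))
print("empirical standard error:", sample_means.std())
```

The two numbers should agree to within Monte Carlo error, and the empirical spread shrinks as $1/\sqrt{n}$, as the formula predicts.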

Henry
  • Thanks! Is it easy to understand where this expression for the variance comes from? – DilithiumMatrix Oct 23 '24 at 15:18
  • @DilithiumMatrix It comes from $\textrm{Var}(X_iY_i) $ $= E[(X_iY_i)^2] - (E[X_iY_i])^2 $ $= E[X_i^2]E[Y_i^2] - (E[X_i])^2(E[Y_i])^2 $ $= (\sigma^2_X+\mu_X^2)(\sigma^2_Y+\mu_Y^2) - \mu_X^2\mu_Y^2$, with the second equality using independence – Henry Oct 23 '24 at 15:33
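Expanding that product and cancelling the $\mu_X^2\mu_Y^2$ terms then recovers the variance quoted in the answer: $$(\sigma^2_X+\mu_X^2)(\sigma^2_Y+\mu_Y^2) - \mu_X^2\mu_Y^2 = \mu_X^2\sigma^2_Y + \mu_Y^2\sigma^2_X + \sigma^2_X\sigma^2_Y.$$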