Understanding the definition of the covariance operator

Question

Let $\mathbb H$ be an arbitrary separable Hilbert space. The covariance operator $C:\mathbb H\to\mathbb H$ between two $\mathbb H$-valued zero mean random elements $X$ and $Y$ with $\operatorname E\|X\|^2<\infty$ and $\operatorname E\|Y\|^2<\infty$ is defined by $$ C(h)=\operatorname E[\langle X,h\rangle Y] $$ for each $h\in\mathbb H$. Why is the covariance operator defined in this way?

It seems that we can arrive at this definition by generalising the covariance matrix. Suppose for the moment that $\mathbb H=\mathbb R^n$. Then the (cross-)covariance matrix matching the definition above is $\operatorname E[YX^T]$, which is a bounded linear operator from $\mathbb R^n$ to $\mathbb R^n$. Indeed, $$ \operatorname E[YX^T](h) =\operatorname E[YX^Th] =\operatorname E[Y\langle X,h\rangle] =\operatorname E[\langle X,h\rangle Y] $$ for each $h\in\mathbb R^n$. The rightmost expression depends only on the inner product $\langle\cdot,\cdot\rangle$, so perhaps we can take it as the definition of the covariance operator for an arbitrary $\mathbb H$. Is this the right intuition behind the definition of the covariance operator?
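The finite-dimensional identity above can be checked numerically. The following is a minimal sketch, assuming NumPy and replacing expectations by sample averages; the dimension, sample size, and the way `X` and `Y` are generated are arbitrary illustrative choices, not part of the question.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 3, 10_000  # dimension and sample size (arbitrary choices)

# Hypothetical samples: each row of X and Y is one draw of the random vectors.
# Both have population mean zero, so no centering is applied.
X = rng.standard_normal((m, n))
Y = X @ rng.standard_normal((n, n)) + 0.1 * rng.standard_normal((m, n))
h = rng.standard_normal(n)

# Sample version of E[Y X^T], applied to h.
C_hat = Y.T @ X / m
lhs = C_hat @ h

# Sample version of E[<X, h> Y]: weight each draw of Y by the scalar <X, h>.
rhs = ((X @ h)[:, None] * Y).mean(axis=0)

print(np.allclose(lhs, rhs))
```

The two sides agree up to floating-point error because the identity is purely algebraic: it holds for sample averages just as it does for expectations.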

Any help is much appreciated!

score 9 · Answer 1 · answered Mar 07 '16 at 15:57

Your line of thought is one possible intuition. Another is the following: a matrix represents not only a linear operator but also a bilinear form. We define the covariance form of $X,Y \colon \Omega \to H$ by $$ c(h,k) = \mathbf E [\langle X,h\rangle\langle Y,k\rangle], $$ that is, $$ c(h,k) = \mathrm{cov}\bigl(\langle X,h\rangle, \langle Y,k\rangle\bigr), $$ so we reduce it to the covariance of real, mean-zero random variables. This is a bounded bilinear form. By the Riesz representation theorem, $c$ corresponds to a unique bounded linear operator $C \colon H \to H$ with $$ \langle Ch,k\rangle = c(h,k) \quad\text{for all } h,k\in H. $$ Then we have $$ \langle Ch, k\rangle = \mathbf E [\langle X,h\rangle\langle Y,k\rangle] = \langle \mathbf E [\langle X,h\rangle Y], k \rangle $$ for every $k$, so $$ Ch = \mathbf E [\langle X,h\rangle Y]. $$
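In $\mathbb R^n$ the correspondence between the form $c$ and the operator $C$ can be illustrated numerically. This is a rough sketch, assuming NumPy, with expectations replaced by sample averages and with $C$ represented by the matrix $\mathbf E[YX^T]$; the sample data and the vectors `h`, `k` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 4, 5_000  # dimension and sample size (arbitrary choices)

# Hypothetical draws with population mean zero (rows are samples).
X = rng.standard_normal((m, n))
Y = X + rng.standard_normal((m, n))
h = rng.standard_normal(n)
k = rng.standard_normal(n)

# Sample analogue of the covariance form c(h, k) = E[<X, h> <Y, k>].
c_hk = np.mean((X @ h) * (Y @ k))

# Sample analogue of the operator C, here the matrix E[Y X^T].
C_hat = Y.T @ X / m
Ch = C_hat @ h

# <Ch, k> should reproduce c(h, k).
print(np.isclose(Ch @ k, c_hk))
```

In finite dimensions the Riesz step is just the statement that every bilinear form is $k^T C h$ for exactly one matrix $C$.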

martini

*sesquilinear form, if $H$ is a complex Hilbert space. – Roland Mar 07 '16 at 16:05
@Roland And $c(h,k)$ should be defined as $\operatorname E[\langle h,X\rangle\langle Y,k\rangle]=\operatorname E[\overline{\langle X,h\rangle}\langle Y,k\rangle]$ if $\mathbb H$ is a complex Hilbert space, right? – Cm7F7Bb Mar 08 '16 at 10:08
@martini Thank you for your answer (+1)! I'd like to clarify two points. (1) By applying the Cauchy-Schwarz inequality twice, $$ |c(h,k)| \le \operatorname E[|\langle X,h\rangle||\langle Y,k\rangle|] \le \operatorname E[\|X\|\|Y\|]\|h\|\|k\| \le (\operatorname E\|X\|^2)^{1/2}(\operatorname E\|Y\|^2)^{1/2}\|h\|\|k\|. $$ Is this what you mean by bounded? (2) The Riesz representation theorem states that any element of $H^*$ can be uniquely represented as $\varphi_x(y)=\langle y,x\rangle$ for some $x\in H$. Could you provide some details on how it follows that there exists a unique linear operator $C:H\to H$ such that $\langle Ch,k\rangle=c(h,k)$? – Cm7F7Bb Mar 09 '16 at 14:45
This is an answer to my previous questions. We can use Theorem 12.8 on p. 310 of Walter Rudin's Functional Analysis (2nd edition, 1991) to establish the existence and uniqueness of the operator $C$. – Cm7F7Bb Jun 20 '17 at 12:29
We have that $$ \langle Ch,k\rangle=\operatorname E[\langle X,h\rangle\langle Y,k\rangle]=\operatorname E\langle\langle X,h\rangle Y,k\rangle =\langle\operatorname E[\langle X,h\rangle Y],k\rangle. $$ How can we justify the last equality? How can we justify taking the expected value inside the inner product? – Cm7F7Bb Jun 20 '17 at 12:33
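
A sketch of one standard justification for that last step, assuming the expectation of an $\mathbb H$-valued random element is understood as a Bochner integral (an assumption, since the thread does not fix the notion of integral). Write $Z=\langle X,h\rangle Y$. By the Cauchy-Schwarz inequality, $$ \operatorname E\|Z\| \le \|h\|\operatorname E[\|X\|\|Y\|] \le \|h\|(\operatorname E\|X\|^2)^{1/2}(\operatorname E\|Y\|^2)^{1/2}<\infty, $$ so $Z$ is Bochner integrable. For fixed $k$, the map $T_k(z)=\langle z,k\rangle$ is a bounded linear functional on $\mathbb H$, and bounded linear maps commute with the Bochner integral, hence $$ \operatorname E\langle Z,k\rangle=\operatorname E[T_k(Z)]=T_k(\operatorname E Z)=\langle\operatorname E Z,k\rangle. $$ If the expectation is instead defined weakly (as a Pettis integral), this identity holds by definition.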
