Let $\mathbb H$ be an arbitrary separable Hilbert space. The covariance operator $C:\mathbb H\to\mathbb H$ between two $\mathbb H$-valued zero mean random elements $X$ and $Y$ with $\operatorname E\|X\|^2<\infty$ and $\operatorname E\|Y\|^2<\infty$ is defined by $$ C(h)=\operatorname E[\langle X,h\rangle Y] $$ for each $h\in\mathbb H$. Why is the covariance operator defined in this way?
It seems that we can arrive at this definition if we try to generalise the definition of the covariance matrix. Suppose for the moment that $\mathbb H=\mathbb R^n$. Then the covariance matrix is given by $\operatorname E[XY^T]$ which is a bounded linear operator from $\mathbb R^n$ to $\mathbb R^n$. Also, we have that $$ \operatorname E[YX^T](h) =\operatorname E[YX^Th] =\operatorname E[Y\langle X,h\rangle] =\operatorname E[\langle X,h\rangle Y] $$ for each $h\in\mathbb R^n$. The rightmost expression only depends on the inner product $\langle\cdot,\cdot\rangle$ and maybe we can say that this is the definition of the covariance operator for the arbitrary $\mathbb H$. Is this the right intuition behind the definition of the covariance operator?
Any help is much appreciated!