Suppose $(X,Y)\sim N(\mu,\Sigma)$ is a bivariate normal vector. For simplicity I'll assume $X$ and $Y$ are centered at the origin, i.e. $\mu=0$.

I'm looking for a visual or geometric way to understand the quantity $\mathbb E(Y\mid X=x)$.

I read this question and see that $$ \mathbb E(Y\mid X=x)=\frac{\text{Cov}(X,Y)}{\text{Var}(X)}x. $$

What I've done so far: my reasoning is very hand-wavy. I am trying to picture the eigenvectors of $\Sigma$. If $\text{Cov}(X,Y)>0$, then the distribution will look something like this.

Then $\mathbb E(Y\mid X=x)$ should lie underneath the line formed by eigenvector $e_1$ whenever $x>0$. Intuitively this is because the distribution has 'more mass' under the line. Conversely, $\mathbb E(Y\mid X=x)$ should lie above that line when $x<0$. And of course $\mathbb E(Y\mid X=0)$ should lie on the line itself. If $\text{Cov}(X,Y)<0$ the opposite should be true.
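
This picture can be checked numerically. Below is a minimal sketch, assuming an arbitrary example covariance matrix (the values are illustrative, not from the question) and a hypothetical helper `slopes`. It compares the slope of the line through the principal eigenvector $e_1$ with the coefficient $\text{Cov}(X,Y)/\text{Var}(X)$.

```python
import numpy as np

def slopes(cov):
    """Return (slope of principal eigenvector e_1, Cov(X,Y)/Var(X)).

    Assumes cov is a 2x2 covariance matrix ordered as (X, Y) and that
    e_1 is not vertical (its x-component is nonzero).
    """
    cov = np.asarray(cov, dtype=float)
    # eigh returns eigenvalues in ascending order, so the last column
    # of eigvecs is the eigenvector of the largest eigenvalue.
    eigvals, eigvecs = np.linalg.eigh(cov)
    e1 = eigvecs[:, -1]
    major_axis_slope = e1[1] / e1[0]          # sign of e1 cancels in the ratio
    regression_slope = cov[0, 1] / cov[0, 0]  # Cov(X,Y) / Var(X)
    return major_axis_slope, regression_slope

# Illustrative covariance matrix with Cov(X,Y) > 0 (an assumption).
example_cov = [[2.0, 1.0],
               [1.0, 1.0]]
major, reg = slopes(example_cov)
print(f"slope of e_1 line:     {major:.4f}")
print(f"Cov(X,Y)/Var(X) slope: {reg:.4f}")
```

For this example the regression slope is $0.5$ and the $e_1$ slope is about $0.618$, so the line $y=\frac{\text{Cov}(X,Y)}{\text{Var}(X)}x$ is flatter than the $e_1$ line and lies below it for $x>0$ and above it for $x<0$.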

This line of thought isn't really leading anywhere.

I'm having trouble understanding what the coefficient $\text{Cov}(X,Y)/\text{Var}(X)$ represents.

RobPratt
yyy
  • Try solving $\text{argmin}_{a \in \mathbb{R}}E((Y - aX)^2)$. – Mason May 11 '24 at 02:16
  • This quantity is minimized, I think, when $a=\mathbb E(XY)/\mathbb E(X^2)$. How do I proceed from here? I understand that the numerator minus $\mathbb E(X)\mathbb E(Y)$ equals $\text{Cov}(X,Y)$ and the denominator minus $\mathbb E(X)^2$ equals $\text{Var}(X)$. But how would I manipulate these variables and what is the intuition here? – yyy May 12 '24 at 23:30
  • Now solve $\text{argmin}_{a, b \in \mathbb{R}}E((Y - (b + aX))^2)$. The general fact is $E(Y \mid X) = f(X)$, where $f = \text{argmin}_{h}E((Y - h(X))^2)$. – Mason May 13 '24 at 03:06
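
For reference, the minimization suggested in these comments can be worked out as follows. This is a sketch that only assumes the second moments are finite and $\text{Var}(X)>0$. Setting the partial derivatives of $E\big((Y-b-aX)^2\big)$ with respect to $b$ and $a$ to zero gives
\begin{align}
E(Y) - b - a\,E(X) &= 0,\\
E(XY) - b\,E(X) - a\,E(X^2) &= 0.
\end{align}
Substituting $b = E(Y) - a\,E(X)$ from the first equation into the second yields
\begin{align}
a &= \frac{E(XY) - E(X)E(Y)}{E(X^2) - E(X)^2} = \frac{\text{Cov}(X,Y)}{\text{Var}(X)}, & b &= E(Y) - a\,E(X).
\end{align}
In the centered case $E(X)=E(Y)=0$ this reduces to $b=0$ and $a = E(XY)/E(X^2)$, which is the same ratio. Since the conditional mean of a jointly normal pair is linear in $x$, the best linear predictor found here coincides with $E(Y\mid X=x)$, so the coefficient is the slope of the least-squares line of $Y$ on $X$.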

1 Answer

Visualize a bivariate normal pdf $f_{X,Y}(x,y)$ as a surface above the $x$-$y$ plane. The volume of the solid trapped between this surface and the $x$-$y$ plane is $$\int_{-\infty}^\infty\int_{-\infty}^\infty f_{X,Y}(x,y) \, \mathrm dx \, \mathrm dy = 1.$$ An important property of this solid is that the shape of every cross-section of the solid by a plane that is perpendicular to the $x$-$y$ plane is proportional to a univariate normal pdf. (Think of the solid as a piece of bologna. Then, no matter how you slice it, it is still bologna!) In particular, when this perpendicular plane is defined by $x=5$, say, then the cross-section is just the curve $f_{X,Y}(5,y)$. We know that $$\int_{-\infty}^\infty f_{X,Y}(5,y) \,\mathrm dy = f_X(5),$$ and so it must be that $\dfrac{f_{X,Y}(5,y)}{f_X(5)}$ is a pdf. Indeed, we can recognize $\dfrac{f_{X,Y}(5,y)}{f_X(5)}$ as $f_{Y\mid X=5}(y\mid X=5)$, the conditional pdf of $Y$ given that $X=5$. If we plug-and-chug, we get that
\begin{align}f_{Y\mid X=5}(y\mid X=5) &= \dfrac{f_{X,Y}(5,y)}{f_X(5)}\\ &= \frac{1}{2\pi\sigma_X\sigma_Y\sqrt{1-\rho^2}} \exp\left[-\frac{1}{2(1-\rho^2)}\left(\frac{5^2}{\sigma_X^2} -2\rho \frac{5}{\sigma_X}\frac{y}{\sigma_Y} + \frac{y^2}{\sigma_Y^2}\right)\right] \Bigg/ \\ & \qquad \left(\frac{1}{\sqrt{2\pi}\sigma_X} \exp\left[-\frac{5^2}{2\sigma_X^2}\right]\right)\\ &= \frac{1}{\sqrt{2\pi}\sqrt{\sigma_Y^2(1-\rho^2)}}\exp\left[-\frac{1}{2\sigma_Y^2(1-\rho^2)}\left(y^2 - 2y\frac{5\rho\sigma_Y}{\sigma_X} + \left(\frac{5\rho\sigma_Y}{\sigma_X}\right)^2 \right)\right] \end{align} which can be recognized as a normal pdf with mean $E[Y\mid X=5] =\dfrac{5\rho\sigma_Y}{\sigma_X} = \dfrac{5\rho\sigma_X\sigma_Y}{\sigma_X^2} = \dfrac{5\operatorname{cov}(X,Y)}{\operatorname{var}(X)}$ and variance $\sigma_Y^2(1-\rho^2)$. More generally, we have that $E[Y\mid X=x] =\dfrac{x\rho\sigma_Y}{\sigma_X} = \dfrac{\operatorname{cov}(X,Y)}{\operatorname{var}(X)}x.$
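
The slicing argument can also be checked by simulation. The following is a rough sketch, not a proof. It assumes zero means, an illustrative covariance matrix, and a hypothetical helper `conditional_mean_check` that approximates the slice $X=x_0$ by a thin strip $|X-x_0|<$ `half_width`. It compares the sample mean and variance of $Y$ inside the strip with $\frac{\operatorname{cov}(X,Y)}{\operatorname{var}(X)}x_0$ and $\sigma_Y^2(1-\rho^2)$.

```python
import numpy as np

def conditional_mean_check(cov, x0, half_width=0.05, n=500_000, seed=0):
    """Compare empirical and theoretical moments of Y given X close to x0.

    Assumes a zero-mean bivariate normal with 2x2 covariance `cov`
    ordered as (X, Y). The strip |X - x0| < half_width stands in for
    the exact slice X = x0, so the match is only approximate.
    """
    cov = np.asarray(cov, dtype=float)
    rng = np.random.default_rng(seed)
    samples = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov, size=n)
    x, y = samples[:, 0], samples[:, 1]

    near = np.abs(x - x0) < half_width   # points inside the thin strip
    empirical_mean = y[near].mean()
    empirical_var = y[near].var()

    # cov(X,Y)/var(X) * x0 and sigma_Y^2 (1 - rho^2)
    theory_mean = cov[0, 1] / cov[0, 0] * x0
    theory_var = cov[1, 1] - cov[0, 1] ** 2 / cov[0, 0]
    return empirical_mean, theory_mean, empirical_var, theory_var

# Illustrative values (assumptions, not from the answer).
emp_m, th_m, emp_v, th_v = conditional_mean_check([[2.0, 1.0], [1.0, 1.0]], x0=1.0)
print(f"mean of Y in strip: {emp_m:.3f}   theory: {th_m:.3f}")
print(f"var  of Y in strip: {emp_v:.3f}   theory: {th_v:.3f}")
```

Shrinking `half_width` brings the strip closer to the exact slice but leaves fewer samples in it, so the sample size `n` has to grow accordingly.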

Dilip Sarwate