When reading a proof I came across the following step:
$$\operatorname{Var}(x^Ty) = x^T\operatorname{Var}(y)x,$$ where $x$ and $y$ are column vectors. How can you derive this?
Let us first understand what is meant by the "variance" of a column vector. Suppose $y$ is a random vector taking values in $\mathbb R^{n\times1},$ and let $\mu =\mathbb E[y].$ Then we define
$$ \operatorname{cov}(y) = \operatorname{E}\big((y-\mu)(y-\mu)^T\big) \in \mathbb R^{n\times n}. $$
Here we assumed that $y$ is random. For what we do next, we must assume $x$ is not random. Since $x^Ty$ is a scalar with mean $\operatorname{E}[x^Ty] = x^T\mu,$ we have
\begin{align}
& \operatorname{var}(x^T y) = \operatorname{E}\Big( \big(x^T(y-\mu)\big)\big(x^T(y-\mu)\big)^T \Big) \\[10pt]
= {} & \operatorname{E}\Big( x^T(y-\mu) (y-\mu)^T x\Big) \\[10pt]
= {} & x^T \operatorname{E}\Big((y-\mu)(y-\mu)^T\Big) x \qquad \text{because } x \text{ is not random,} \\[10pt]
= {} & x^T \operatorname{cov}(y) x.
\end{align}
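If you want a quick numerical sanity check of this identity, here is a minimal Monte Carlo sketch; the covariance matrix, mean, and fixed vector $x$ below are arbitrary illustrative choices, not anything from the question.

```python
import numpy as np

# Monte Carlo sanity check of Var(x^T y) = x^T Cov(y) x.
# Sigma, mu, and x are arbitrary choices for illustration only.
rng = np.random.default_rng(0)

n = 3
A = rng.normal(size=(n, n))
Sigma = A @ A.T            # a valid (positive semi-definite) covariance matrix
mu = rng.normal(size=n)
x = rng.normal(size=n)     # x is fixed (non-random)

# Draw many samples of y ~ N(mu, Sigma) and compare the two sides.
y = rng.multivariate_normal(mu, Sigma, size=200_000)
lhs = np.var(y @ x)        # empirical Var(x^T y)
rhs = x @ Sigma @ x        # x^T Cov(y) x

print(lhs, rhs)            # the two numbers should agree closely
```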
The full equation is:
$$\operatorname{Var}(\hat\beta_p) = \operatorname{Var}\!\left(\frac{z_p^Ty}{\langle z_p,z_p\rangle}\right) = \frac{z_p^T\operatorname{Var}(y)z_p}{\langle z_p,z_p\rangle^2}=\frac{z_p^T(\sigma^2I)z_p}{\langle z_p,z_p\rangle^2}=\frac{\sigma^2\langle z_p,z_p\rangle}{\langle z_p,z_p\rangle^2}=\frac{\sigma^2}{\langle z_p,z_p\rangle}$$
The context is fitting a linear model with least-squares loss via orthogonalization: $z_p$ is the orthogonal residual from regressing $x_p$ (the $p$-th column of the design matrix) on the previous residuals, and $\hat\beta_p$ is the coefficient from projecting $y$ onto $z_p$. The data samples are random, so I think $z_p$ is random too.
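For what it's worth, the identity from the answer applies here conditionally on the design matrix: given the predictors, $z_p$ is a fixed vector (it is computed from the $x_j$'s only, not from $y$), so you can take $x = z_p/\langle z_p,z_p\rangle$ in the formula above. Below is a minimal simulation sketch of the resulting variance $\sigma^2/\langle z_p,z_p\rangle$; the design matrix, coefficients, noise level, and the QR-based Gram–Schmidt step are illustrative assumptions, not taken from the original post.

```python
import numpy as np

# Monte Carlo check of Var(beta_hat_p) = sigma^2 / <z_p, z_p>, treating the
# design matrix X (and hence z_p) as fixed, i.e. conditioning on X.
# X, beta, and sigma are illustrative choices only.
rng = np.random.default_rng(1)

n, p = 200, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
beta = np.array([1.0, -2.0, 0.5])
sigma = 0.7

# Gram-Schmidt step: z_p is the residual of the last column of X after
# projecting out the span of the previous columns (via a reduced QR).
Q, _ = np.linalg.qr(X)
z_p = X[:, -1] - Q[:, :-1] @ (Q[:, :-1].T @ X[:, -1])

# Repeatedly draw y = X beta + eps and form beta_hat_p = <z_p, y> / <z_p, z_p>.
n_sim = 50_000
eps = rng.normal(scale=sigma, size=(n_sim, n))
Y = X @ beta + eps                 # each row is one simulated response vector
beta_hat_p = Y @ z_p / (z_p @ z_p)

print(np.var(beta_hat_p))          # empirical variance of beta_hat_p
print(sigma**2 / (z_p @ z_p))      # sigma^2 / <z_p, z_p>
```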