
The equation is Eq. 11.20 in Econometrics by Bruce E. Hansen (2021).

I cannot follow the derivation of the following equation. Which property of the determinant is applied in this derivation?

\begin{equation} \hat{\mathbf{G}}=\arg\min_{\mathbf{G}}\frac{\text{det}\left(\mathbf{G}^{\prime}\left[\tilde{\mathbf{X}}^{\prime}\tilde{\mathbf{X}}-\tilde{\mathbf{X}}^{\prime}\tilde{\mathbf{Y}}\left[\tilde{\mathbf{Y}}^{\prime}\tilde{\mathbf{Y}}\right]^{-1}\tilde{\mathbf{Y}}^{\prime}\tilde{\mathbf{X}}\right]\mathbf{G}\right)}{\text{det}\left(\mathbf{G}^{\prime}\tilde{\mathbf{X}}^{\prime}\tilde{\mathbf{X}}\mathbf{G}\right)} \end{equation}
\begin{equation} =\arg\max_{\mathbf{G}}\frac{\text{det}\left(\mathbf{G}^{\prime}\tilde{\mathbf{X}}^{\prime}\tilde{\mathbf{Y}}\left[\tilde{\mathbf{Y}}^{\prime}\tilde{\mathbf{Y}}\right]^{-1}\tilde{\mathbf{Y}}^{\prime}\tilde{\mathbf{X}}\mathbf{G}\right)}{\text{det}\left(\mathbf{G}^{\prime}\tilde{\mathbf{X}}^{\prime}\tilde{\mathbf{X}}\mathbf{G}\right)} \end{equation}
where $\mathbf{G}\in\mathbb{R}^{k\times r}$, $\tilde{\mathbf{X}}\in\mathbb{R}^{n\times k}$, $\tilde{\mathbf{Y}}\in\mathbb{R}^{n\times m}$, and $\text{det}\left(\mathbf{A}\right)$ denotes the determinant of the matrix $\mathbf{A}$.

Since $\det\left(\mathbf A+\mathbf B\right) \neq \det\left(\mathbf A\right) + \det\left(\mathbf B\right)$ in general, I believe the following derivation is wrong:
\begin{equation} \hat{\mathbf{G}}=\arg\min_{\mathbf{G}}\frac{\text{det}\left(\mathbf{G}^{\prime}\left[\tilde{\mathbf{X}}^{\prime}\tilde{\mathbf{X}}-\tilde{\mathbf{X}}^{\prime}\tilde{\mathbf{Y}}\left[\tilde{\mathbf{Y}}^{\prime}\tilde{\mathbf{Y}}\right]^{-1}\tilde{\mathbf{Y}}^{\prime}\tilde{\mathbf{X}}\right]\mathbf{G}\right)}{\text{det}\left(\mathbf{G}^{\prime}\tilde{\mathbf{X}}^{\prime}\tilde{\mathbf{X}}\mathbf{G}\right)} \end{equation}
\begin{equation} =\arg\min_{\mathbf{G}}\frac{\text{det}\left(\mathbf{G}^{\prime}\left[\tilde{\mathbf{X}}^{\prime}\tilde{\mathbf{X}}\right]\mathbf{G}-\mathbf{G}^{\prime}\left[\tilde{\mathbf{X}}^{\prime}\tilde{\mathbf{Y}}\left[\tilde{\mathbf{Y}}^{\prime}\tilde{\mathbf{Y}}\right]^{-1}\tilde{\mathbf{Y}}^{\prime}\tilde{\mathbf{X}}\right]\mathbf{G}\right)}{\text{det}\left(\mathbf{G}^{\prime}\tilde{\mathbf{X}}^{\prime}\tilde{\mathbf{X}}\mathbf{G}\right)} \end{equation}
\begin{equation} =\arg\min_{\mathbf{G}}\frac{\text{det}\left(\mathbf{G}^{\prime}\left[\tilde{\mathbf{X}}^{\prime}\tilde{\mathbf{X}}\right]\mathbf{G}\right)-\text{det}\left(\mathbf{G}^{\prime}\left[\tilde{\mathbf{X}}^{\prime}\tilde{\mathbf{Y}}\left[\tilde{\mathbf{Y}}^{\prime}\tilde{\mathbf{Y}}\right]^{-1}\tilde{\mathbf{Y}}^{\prime}\tilde{\mathbf{X}}\right]\mathbf{G}\right)}{\text{det}\left(\mathbf{G}^{\prime}\tilde{\mathbf{X}}^{\prime}\tilde{\mathbf{X}}\mathbf{G}\right)} \end{equation}
\begin{equation} =\arg\min_{\mathbf{G}}\left[1-\frac{\text{det}\left(\mathbf{G}^{\prime}\left[\tilde{\mathbf{X}}^{\prime}\tilde{\mathbf{Y}}\left[\tilde{\mathbf{Y}}^{\prime}\tilde{\mathbf{Y}}\right]^{-1}\tilde{\mathbf{Y}}^{\prime}\tilde{\mathbf{X}}\right]\mathbf{G}\right)}{\text{det}\left(\mathbf{G}^{\prime}\tilde{\mathbf{X}}^{\prime}\tilde{\mathbf{X}}\mathbf{G}\right)}\right] \end{equation}
\begin{equation} =\arg\max_{\mathbf{G}}\left[\frac{\text{det}\left(\mathbf{G}^{\prime}\left[\tilde{\mathbf{X}}^{\prime}\tilde{\mathbf{Y}}\left[\tilde{\mathbf{Y}}^{\prime}\tilde{\mathbf{Y}}\right]^{-1}\tilde{\mathbf{Y}}^{\prime}\tilde{\mathbf{X}}\right]\mathbf{G}\right)}{\text{det}\left(\mathbf{G}^{\prime}\tilde{\mathbf{X}}^{\prime}\tilde{\mathbf{X}}\mathbf{G}\right)}\right] \end{equation}
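A quick numerical check with random matrices (purely illustrative; the matrices below merely stand in for the sample moments $\tilde{\mathbf{X}}^{\prime}\tilde{\mathbf{X}}$ etc.) confirms that the determinant is not additive, and that the two ratios above are not pointwise complements of each other:

```python
import numpy as np

rng = np.random.default_rng(0)
n, k, m, r = 50, 6, 4, 2   # hypothetical dimensions

X = rng.standard_normal((n, k))   # stands in for X-tilde
Y = rng.standard_normal((n, m))   # stands in for Y-tilde
G = rng.standard_normal((k, r))

A = G.T @ (X.T @ X) @ G                                      # G' X'X G
B = G.T @ (X.T @ Y @ np.linalg.inv(Y.T @ Y) @ Y.T @ X) @ G   # G' X'Y (Y'Y)^{-1} Y'X G

# The determinant is not additive:
print(np.linalg.det(A - B), np.linalg.det(A) - np.linalg.det(B))

# Hence the two ratios are not pointwise complements of each other:
print(np.linalg.det(A - B) / np.linalg.det(A),
      1 - np.linalg.det(B) / np.linalg.det(A))
```

So if the arg min and the arg max really coincide, it must be for a different reason than splitting the determinant term by term.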

Can anyone explain the derivation? Many thanks!

The above equation arises in reduced-rank multivariate regression, $\tilde{Y}=\mathbf{B}^{\prime}\tilde{X}+e$, where $e$ is the random error. We require $\mathrm{rank}(\mathbf{B})=r$. To achieve this, write $\mathbf{B}=\mathbf{G}\mathbf{A}^{\prime}$, where $\mathbf{G}\in\mathbb{R}^{k\times r}$ and $\mathbf{A}\in\mathbb{R}^{m\times r}$. The model is estimated by MLE based on the data $\left\{ \left(\tilde{X}_{i},\tilde{Y}_{i}\right)\right\} _{i=1}^{n}$. In matrix form, \begin{equation} \tilde{\mathbf{Y}}=\left(\begin{array}{c} \tilde{Y}_{1}^{\prime}\\ \vdots\\ \tilde{Y}_{n}^{\prime} \end{array}\right),\tilde{\mathbf{X}}=\left(\begin{array}{c} \tilde{X}_{1}^{\prime}\\ \vdots\\ \tilde{X}_{n}^{\prime} \end{array}\right). \end{equation} There are many symbols involved, so I only show the part relevant to my question; it appears to be an issue about matrix determinants.
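For concreteness, here is a minimal simulation of this setup (a sketch only; the dimensions and noise scale below are arbitrary). With rows stacked as above, the model reads $\tilde{\mathbf{Y}}=\tilde{\mathbf{X}}\mathbf{B}+\mathbf{e}$ with the errors stacked the same way:

```python
import numpy as np

rng = np.random.default_rng(1)
n, k, m, r = 200, 6, 4, 2        # hypothetical sizes, rank(B) = r

G = rng.standard_normal((k, r))
A = rng.standard_normal((m, r))
B = G @ A.T                      # k x m coefficient matrix with rank r

X = rng.standard_normal((n, k))              # rows are X_i'
e = 0.1 * rng.standard_normal((n, m))        # random errors
Y = X @ B + e                                # rows are Y_i' = X_i' B + e_i'

print(np.linalg.matrix_rank(B))              # r
```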

1 Answer


Let $u=\tilde{X}G$, which has rank $r$, and let $\newcommand{\PY}{\mathcal{P}_\tilde{Y}} \PY=\tilde{Y}\left(\tilde{Y}'\tilde{Y}\right)^{-1}\tilde{Y}'$ denote the orthogonal projection matrix onto the column space of $\tilde{Y}$. Then the problem can be written as

$$ \begin{align} {}&\min_{u} \frac{\left\vert u'(I-\PY)u \right\vert}{|u'u|}\\ ={}&\min_u \left\vert(u'u)^{-1/2}\right\vert\cdot\left\vert u'(I-\PY)u\right\vert\cdot\left\vert(u'u)^{-1/2}\right\vert \\ ={}&\min_u \left\vert(u'u)^{-1/2}u'(I-\PY)u (u'u)^{-1/2}\right\vert\\ ={}&\min_v \left\vert v'(I-\PY)v\right\vert \\ ={}&\min_v \left\vert I_r-v'\PY v\right\vert \\ ={}&\min_v \prod_{j=1}^r \left[1-\lambda_j(v'\PY v)\right] \end{align}$$ where $v=u (u'u)^{-1/2}$ with $v'v=I_r$ and $\lambda_j(\cdot)$ denotes the $j$th largest eigenvalue.
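These equalities use only the multiplicativity of the determinant, $\det(AB)=\det(A)\det(B)$ (hence $\left\vert (u'u)^{-1}\right\vert=\left\vert(u'u)^{-1/2}\right\vert^2$), and $v'v=I_r$. A quick NumPy sanity check of the whole chain (a sketch with arbitrary dimensions and a random $u$):

```python
import numpy as np

rng = np.random.default_rng(2)
n, m, r = 40, 5, 3               # arbitrary dimensions

Y = rng.standard_normal((n, m))
P = Y @ np.linalg.inv(Y.T @ Y) @ Y.T          # orthogonal projection onto col(Y)
u = rng.standard_normal((n, r))               # stands in for X-tilde @ G

# (u'u)^{-1/2} via the eigendecomposition of the symmetric matrix u'u
w, Q = np.linalg.eigh(u.T @ u)
W = Q @ np.diag(w ** -0.5) @ Q.T
v = u @ W                                     # satisfies v'v = I_r

M = np.eye(n) - P
ratio = np.linalg.det(u.T @ M @ u) / np.linalg.det(u.T @ u)
step2 = np.linalg.det(W) * np.linalg.det(u.T @ M @ u) * np.linalg.det(W)
step3 = np.linalg.det(W @ u.T @ M @ u @ W)
step4 = np.linalg.det(np.eye(r) - v.T @ P @ v)
step5 = np.prod(1 - np.linalg.eigvalsh(v.T @ P @ v))

print(ratio, step2, step3, step4, step5)      # all equal up to rounding
```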

Suppose $\tilde{Y}$ has full column rank $m>r$. Then $\PY$ has $m>r$ eigenvalues equal to $1$ and the remaining $n-m$ eigenvalues equal to $0$. So the spectral decomposition of $\PY$ can be written as $$\PY =\begin{bmatrix} \underset{n\times r}{\underbrace{\Gamma_{1r}}} & \underset{n\times (m-r)}{\underbrace{\Gamma_{1(m-r)}}} & \underset{n\times (n-m)}{\underbrace{\Gamma_0}} \end{bmatrix} \begin{bmatrix} I_m & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} \Gamma_{1r}' \\ \Gamma_{1(m-r)}' \\ \Gamma_0' \end{bmatrix}. $$
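This eigenvalue structure is immediate because $\mathcal{P}_{\tilde Y}$ is an orthogonal projection onto the $m$-dimensional column space of $\tilde Y$; a quick numerical confirmation (arbitrary $n$ and $m$):

```python
import numpy as np

rng = np.random.default_rng(3)
n, m = 40, 5                     # arbitrary, Y has full column rank m almost surely

Y = rng.standard_normal((n, m))
P = Y @ np.linalg.inv(Y.T @ Y) @ Y.T          # projection onto col(Y)

eig = np.linalg.eigvalsh(P)
print(np.sum(np.isclose(eig, 1.0)))           # m eigenvalues equal to 1
print(np.sum(np.isclose(eig, 0.0)))           # n - m eigenvalues equal to 0
```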

By the Poincaré separation theorem, $$1\geq \lambda_1(v'\PY v) \geq\cdots \geq \lambda_r(v'\PY v) \geq 0. $$ Thus, the determinant of interest satisfies $$\left\vert I_r-v'\PY v\right\vert=\prod_{j=1}^r \left[1-\lambda_j(v'\PY v)\right] \geq 0. $$ To drive this determinant to its minimum $0$ subject to $v'v=I_r$, we can simply choose $v=\Gamma_{1r}$, so that $I_r-v'\PY v$ itself is the zero matrix.

For the same reason, $0\leq\left\vert v'\PY v\right\vert\leq1$. Thus, the maximum of $\left\vert v'\PY v\right\vert$ is achieved by the same choice $v=\Gamma_{1r}$, giving $\prod_{j=1}^r \lambda_j(v'\PY v)=|I_r|=1$. This allows us to work backwards, i.e.

$$\begin{align} \arg\min_v \left\vert I_r -v'\PY v\right\vert &=\arg\max_v \left\vert v'\PY v\right\vert \\ &=\arg\max_u \left\vert(u'u)^{-1/2}u'\PY u (u'u)^{-1/2}\right\vert\\ &=\arg\max_u \left\vert(u'u)^{-1/2}\right\vert\cdot\left\vert u' \PY u\right\vert\cdot\left\vert(u'u)^{-1/2}\right\vert \\ &=\arg\max_{u} \frac{\left\vert u' \PY u \right\vert}{|u'u|}. \end{align}$$
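The two claims above, the minimum $0$ of $\left\vert I_r-v'\mathcal{P}_{\tilde Y}v\right\vert$ and the maximum $1$ of $\left\vert v'\mathcal{P}_{\tilde Y}v\right\vert$, both at $v=\Gamma_{1r}$, can be checked directly: the columns of $\Gamma_{1r}$ are orthonormal eigenvectors of $\mathcal{P}_{\tilde Y}$ with eigenvalue $1$, so $v'\mathcal{P}_{\tilde Y}v=I_r$ exactly. A numerical sketch (random $\tilde Y$, arbitrary dimensions):

```python
import numpy as np

rng = np.random.default_rng(4)
n, m, r = 40, 5, 3

Y = rng.standard_normal((n, m))
P = Y @ np.linalg.inv(Y.T @ Y) @ Y.T

eigvals, eigvecs = np.linalg.eigh(P)                    # ascending eigenvalues
Gamma_1r = eigvecs[:, np.isclose(eigvals, 1.0)][:, :r]  # r eigenvectors with eigenvalue 1

v = Gamma_1r
print(np.linalg.det(np.eye(r) - v.T @ P @ v))   # ~0: the minimum is attained
print(np.linalg.det(v.T @ P @ v))               # ~1: the maximum is attained
```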

PS: These $\arg$ notations seem somewhat sloppy, since the minimizers (or maximizers) are not unique.
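For example, any $v$ containing at least one unit-eigenvalue eigenvector of $\mathcal{P}_{\tilde Y}$ already attains the minimum $0$ of $\left\vert I_r-v'\mathcal{P}_{\tilde Y}v\right\vert$, yet it need not maximize $\left\vert v'\mathcal{P}_{\tilde Y}v\right\vert$. A sketch mixing one eigenvector with eigenvalue $1$ and one with eigenvalue $0$ (same kind of random setup as above):

```python
import numpy as np

rng = np.random.default_rng(5)
n, m, r = 40, 5, 2

Y = rng.standard_normal((n, m))
P = Y @ np.linalg.inv(Y.T @ Y) @ Y.T

eigvals, eigvecs = np.linalg.eigh(P)
one_vec = eigvecs[:, np.isclose(eigvals, 1.0)][:, :1]   # eigenvalue 1
zero_vec = eigvecs[:, np.isclose(eigvals, 0.0)][:, :1]  # eigenvalue 0
v_mix = np.hstack([one_vec, zero_vec])                  # still satisfies v'v = I_2

print(np.linalg.det(np.eye(r) - v_mix.T @ P @ v_mix))   # 0: also a minimizer
print(np.linalg.det(v_mix.T @ P @ v_mix))               # 0: far from the maximum 1
```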

Zack Fisher