
Let $\mathbf{\Sigma}$ be a symmetric positive definite matrix and let $\mathbf{L}$ be its lower Cholesky factor. Furthermore, let $\mathbf{D}$ be a diagonal matrix.

What is $$ \begin{align} \frac{\partial \text{vech} (\mathbf{L} \mathbf{D} \mathbf{L}^\top)}{\partial \text{vech}(\mathbf{\Sigma})^\top} \quad ? \end{align} $$
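For concreteness, $\mathrm{vech}$ here denotes half-vectorization, i.e. stacking the columns of the lower-triangular part; for example, with $p = 2$, $$ \begin{align} \mathrm{vech} \begin{pmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{21} & \sigma_{22} \end{pmatrix} = \begin{pmatrix} \sigma_{11} \\ \sigma_{21} \\ \sigma_{22} \end{pmatrix}, \end{align} $$ so the Jacobian in question is a $\frac{p(p+1)}{2} \times \frac{p(p+1)}{2}$ matrix.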

This question and this question are probably relevant.


1 Answer


We can use the chain rule,

$$ \begin{align} \frac{\partial \text{vech} (\mathbf{L} \mathbf{D} \mathbf{L}^\top)}{\partial \text{vech}(\mathbf{\Sigma})^\top} &= \frac{\partial \text{vech} (\mathbf{L} \mathbf{D} \mathbf{L}^\top)}{\partial \text{vech}(\mathbf{L})^\top} \frac{\partial \text{vech} (\mathbf{L})}{\partial \text{vech}(\mathbf{\Sigma})^\top} \\ &= \mathbf{G}^{+} \left( \mathbf{L} \mathbf{D} \otimes \mathbf{I} \right) \mathbf{F}^{\top} \left( \mathbf{G}^{+} \left(\mathbf{L} \otimes \mathbf{I}\right) \mathbf{F}^{\top} \right)^{-1}. \end{align} $$

To see this, let $\mathbf{G}$ be the duplication matrix, $\mathbf{G}^{+}$ its Moore-Penrose inverse, $\mathbf{F}$ the canonical elimination matrix (which consists of 0's and 1's only), and $\mathbf{K}_{pp}$ the commutation matrix, where $p$ is the dimension of $\mathbf{\Sigma}$. Then $$ \begin{align} \mathrm{d} \mathrm{vech}\left( \mathbf{L} \mathbf{D} \mathbf{L}^{\top} \right) &= \mathbf{G}^{+} \, \mathrm{d} \mathrm{vec} \left( \mathbf{L} \mathbf{D} \mathbf{L}^{\top} \right) \\ &= \mathbf{G}^{+} \left( \mathrm{vec}\left( \mathrm{d} \mathbf{L} \, \mathbf{D} \mathbf{L}^{\top} \right) + \mathrm{vec}\left( \mathbf{L} \mathbf{D} \, \mathrm{d} \mathbf{L}^{\top} \right) \right) \\ &= \mathbf{G}^{+}\left(\mathbf{I} + \mathbf{K}_{pp}\right) \mathrm{vec}\left( \mathrm{d} \mathbf{L} \, \mathbf{D} \mathbf{L}^{\top} \right) \\ &= \mathbf{G}^{+} \left(\mathbf{I} + \mathbf{K}_{pp}\right) \left( \mathbf{L} \mathbf{D} \otimes \mathbf{I} \right) \mathrm{vec}\left( \mathrm{d} \mathbf{L} \right) \\ &= 2 \mathbf{G}^{+} \mathbf{G} \mathbf{G}^{+} \left( \mathbf{L} \mathbf{D} \otimes \mathbf{I} \right) \mathbf{F}^{\top} \, \mathrm{d} \mathrm{vech}\left( \mathbf{L} \right) \\ &= 2 \mathbf{G}^{+} \left( \mathbf{L} \mathbf{D} \otimes \mathbf{I} \right) \mathbf{F}^{\top} \, \mathrm{d} \mathrm{vech}\left( \mathbf{L} \right), \end{align} $$ where we used $\mathbf{I} + \mathbf{K}_{pp} = 2 \mathbf{G} \mathbf{G}^{+}$, $\mathbf{G}^{+} \mathbf{G} = \mathbf{I}$, and the fact that $\mathbf{L}$ is lower triangular, so that $\mathrm{vec}(\mathrm{d}\mathbf{L}) = \mathbf{F}^{\top} \, \mathrm{d} \mathrm{vech}(\mathbf{L})$. In the same way, $$ \begin{align} \mathrm{d} \mathrm{vech}\left( \mathbf{\Sigma} \right) &= \mathbf{G}^{+} \, \mathrm{d} \mathrm{vec}\left( \mathbf{L} \mathbf{L}^{\top} \right) \\ &= \mathbf{G}^{+} \mathrm{vec}\left( \mathrm{d} \mathbf{L} \, \mathbf{L}^{\top} \right) + \mathbf{G}^{+} \mathrm{vec}\left( \mathbf{L} \, \mathrm{d} \mathbf{L}^{\top} \right) \\ &= \mathbf{G}^{+} \left(\mathbf{I} + \mathbf{K}_{pp}\right) \mathrm{vec}\left( \mathrm{d} \mathbf{L} \, \mathbf{L}^{\top} \right) \\ &= 2 \mathbf{G}^{+} \mathbf{G} \mathbf{G}^{+} \mathrm{vec}\left( \mathrm{d} \mathbf{L} \, \mathbf{L}^{\top} \right) \\ &= 2 \mathbf{G}^{+} \left(\mathbf{L} \otimes \mathbf{I}\right) \mathrm{vec}\left( \mathrm{d} \mathbf{L} \right) \\ &= 2 \mathbf{G}^{+} \left(\mathbf{L} \otimes \mathbf{I}\right) \mathbf{F}^{\top} \, \mathrm{d} \mathrm{vech}\left( \mathbf{L} \right). \end{align} $$ Solving the second display for $\mathrm{d} \mathrm{vech}(\mathbf{L})$ and substituting it into the first, the factors of $2$ cancel and we obtain the Jacobian stated above.
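As a sanity check, here is a small numerical sketch of the result (this is an addition, not part of the original derivation; the helper names `vech`, `unvech_sym`, `f` are ad hoc). It builds the duplication matrix $\mathbf{G}$ and elimination matrix $\mathbf{F}$ explicitly for $p = 3$ using a column-major $\mathrm{vech}$ ordering, assembles the analytic Jacobian with NumPy, and compares it against a central finite-difference approximation.

```python
import numpy as np

p = 3
rng = np.random.default_rng(0)

# (i, j) index pairs with i >= j, in column-major vech order
pairs = [(i, j) for j in range(p) for i in range(j, p)]
m = len(pairs)                      # p(p+1)/2

def vech(A):
    return np.array([A[i, j] for (i, j) in pairs])

def unvech_sym(v):
    A = np.zeros((p, p))
    for k, (i, j) in enumerate(pairs):
        A[i, j] = A[j, i] = v[k]
    return A

# duplication matrix G (G @ vech(A) = vec(A) for symmetric A) and
# elimination matrix F (F @ vec(A) = vech(A)); vec is column-major
G = np.zeros((p * p, m))
F = np.zeros((m, p * p))
for k, (i, j) in enumerate(pairs):
    G[j * p + i, k] = 1.0
    G[i * p + j, k] = 1.0
    F[k, j * p + i] = 1.0
Gp = np.linalg.pinv(G)              # Moore-Penrose inverse G^+

# a random well-conditioned SPD Sigma, its Cholesky factor L, and a diagonal D
A = rng.standard_normal((p, p))
Sigma = A @ A.T + p * np.eye(p)
L = np.linalg.cholesky(Sigma)       # Sigma = L L'
D = np.diag(rng.uniform(0.5, 2.0, size=p))
I = np.eye(p)

# analytic Jacobian from the answer:
#   G^+ (L D x I) F'  [ G^+ (L x I) F' ]^{-1}
J_analytic = (Gp @ np.kron(L @ D, I) @ F.T
              @ np.linalg.inv(Gp @ np.kron(L, I) @ F.T))

# central finite-difference Jacobian of vech(L D L') w.r.t. vech(Sigma)
def f(v):
    Lv = np.linalg.cholesky(unvech_sym(v))
    return vech(Lv @ D @ Lv.T)

v0, eps = vech(Sigma), 1e-6
J_fd = np.column_stack([(f(v0 + eps * e) - f(v0 - eps * e)) / (2 * eps)
                        for e in np.eye(m)])

# should be small (finite-difference accuracy, roughly 1e-7 or below)
print(np.max(np.abs(J_analytic - J_fd)))
```

The deviation between the two Jacobians should be on the order of the finite-difference error, confirming the closed-form expression for a well-conditioned $\mathbf{\Sigma}$.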

  • +1; You can eliminate one of the matrix variables by replacing $F$ with $G^+$. – greg Jan 21 '22 at 23:49
  • @greg, actually you have to keep in mind that $\mathbf{L}$ is a lower triangular matrix, which means that $\mathrm{vec} (\mathbf{L}) = \mathbf{F}^\top \mathrm{vech} (\mathbf{L})$. So we cannot replace $\mathbf{F}^\top$ (see the small numerical check after these comments). – stollenm Jan 22 '22 at 11:38
  • Agreed. But $F^T$ isn't technically a duplication or elimination matrix. Instead it's a way to create a triangular matrix from a vector of the appropriate length. I think this answer (which you've basically re-derived) uses a really good notation for this distinction. – greg Jan 22 '22 at 17:44
  • @greg True, $\mathbf{F}^\top$ just maps the $\mathrm{vech}$ of a lower triangular matrix into its $\mathrm{vec}$. The answer you are referring to is itself a re-derivation of a result by Lütkepohl (1989), Lemma 1. – stollenm Jan 22 '22 at 19:51
  • @stollenm Do you really need this computation? I imagine you are computing analytically some gradient of a scalar-valued function $\phi$ involving the matrix $\mathbf{L} \mathbf{D} \mathbf{L}^T$ and you are interested in the gradient $\frac{\partial \phi}{\partial \mathbf{\Sigma}}$ at some point. There are tools other than vech to do this. If I am correct, can you give us the whole picture? – Steph Jan 27 '22 at 08:08
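To illustrate the point raised in the comments numerically, here is a standalone sketch (same ad-hoc column-major $\mathrm{vech}$ ordering as in the check above): $\mathbf{F}^\top$ reconstructs $\mathrm{vec}(\mathbf{L})$ from $\mathrm{vech}(\mathbf{L})$ for a lower-triangular $\mathbf{L}$, whereas $(\mathbf{G}^{+})^\top$ does not, since it halves and duplicates the off-diagonal entries.

```python
import numpy as np

p = 3
pairs = [(i, j) for j in range(p) for i in range(j, p)]
m = len(pairs)

G = np.zeros((p * p, m))   # duplication matrix
F = np.zeros((m, p * p))   # elimination matrix
for k, (i, j) in enumerate(pairs):
    G[j * p + i, k] = 1.0
    G[i * p + j, k] = 1.0
    F[k, j * p + i] = 1.0
Gp = np.linalg.pinv(G)

L = np.tril(np.random.default_rng(1).standard_normal((p, p)))
vecL = L.flatten(order="F")
vechL = np.array([L[i, j] for (i, j) in pairs])

print(np.allclose(F.T @ vechL, vecL))    # True: F' rebuilds vec(L) from vech(L)
print(np.allclose(Gp.T @ vechL, vecL))   # False: (G^+)' halves the off-diagonal entries
```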