
I wish to find the derivative of $L^{-1}xx^TL^{-1}$ with respect to the symmetric positive definite matrix $L$, where $x$ is a vector. How do I proceed?

2 Answers


$ \def\E{{\cal E}} \def\d{\delta} \def\L{L^{-1}} \def\LR#1{\left(#1\right)} \def\op#1{\operatorname{#1}} \def\vc#1{\op{vec}\LR{#1}} \def\trace#1{\op{Tr}\LR{#1}} \def\frob#1{\left\| #1 \right\|_F} \def\qiq{\quad\implies\quad} \def\k{\otimes} \def\p{\partial} \def\grad#1#2{\frac{\p #1}{\p #2}} \def\c#1{\color{red}{#1}} \def\CLR#1{\c{\LR{#1}}} $Given the symmetric matrix-valued function $$\eqalign{ F &= \L xx^T\L \;\doteq\; F^T \qquad \qquad \qquad \qquad \qquad \qquad }$$ calculate its differential with respect to $L$ (which is also symmetric) $$\eqalign{ dF &= \CLR{d\L}xx^T\L + \L xx^T\CLR{d\L} \\ &= \CLR{-\L\,dL\:\L}xx^T\L + \L xx^T\CLR{-\L\,dL\:\L} \\ &= -\LR{\L\,dL\:F + F\:dL\:\L} \\ }$$ Vectorize this expression and rearrange it into a matrix-valued gradient $$\eqalign{ d\ell &= \vc{dL} \\ df &= \vc{dF} \;=\; -\LR{F\k\L + \L\k F} d\ell \\ \grad{f}{\ell} &= -\LR{F\k\L + \L\k F} \\\\ }$$ Another approach is to introduce the fourth-order identity tensor $\E$ with components $$\eqalign{ \E_{ijkl} = \grad{L_{ij}}{L_{kl}} \;=\; \d_{ik}\,\d_{jl} }$$ and calculate a tensor-valued gradient $$\eqalign{ dF &= -\LR{\L\E F + F\E\L}:dL \\ \grad{F}{L} &= -\LR{\L\E F + F\E\L} \\\\ }$$ Yet another approach is to write the differential using index notation.
This yields the scalar components of the gradient $$\eqalign{ \grad{F_{ij}}{L_{kl}} &= -\LR{\L_{ik}F_{jl} + F_{ik}\L_{jl}} \\ }$$
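As a quick sanity check (not part of the original answer), here is a minimal numpy sketch that compares the matrix-valued gradient $-\LR{F\k\L + \L\k F}$ against a finite difference of $\vc{F}$; it assumes column-major `vec` to match the Kronecker identity, and all variable names are mine.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n))
L = A @ A.T + n * np.eye(n)          # symmetric positive definite L
x = rng.standard_normal((n, 1))

Linv = np.linalg.inv(L)
F = Linv @ x @ x.T @ Linv            # F = L^{-1} x x^T L^{-1}

# predicted gradient matrix:  d vec(F) / d vec(L) = -(F (x) L^{-1} + L^{-1} (x) F)
G = -(np.kron(F, Linv) + np.kron(Linv, F))

# central finite difference of vec(F) along a random symmetric direction dL
dL = rng.standard_normal((n, n))
dL = dL + dL.T
eps = 1e-6
Lp = np.linalg.inv(L + eps * dL)
Lm = np.linalg.inv(L - eps * dL)
df_fd = ((Lp @ x @ x.T @ Lp) - (Lm @ x @ x.T @ Lm)).reshape(-1, order="F") / (2 * eps)
df_pred = G @ dL.reshape(-1, order="F")   # column-major vec to match np.kron

print(np.max(np.abs(df_fd - df_pred)))    # tiny: only finite-difference truncation error
```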

greg

First, note that the derivative of $g(L)=L^{-1}$ is the linear function $Δ↦ -L^{-1}ΔL^{-1}$, which follows from

$$\begin{aligned} g(L+Δ) &= (L+Δ)^{-1} \\&= \big(L(I+L^{-1}Δ)\big)^{-1} \\&= (I+L^{-1}Δ)^{-1}L^{-1} \\&= \big(I - L^{-1}Δ + \mathcal{O}(‖Δ‖^2)\big) L^{-1} \qquad\text{(Neumann series)} \\&= g(L) - L^{-1} Δ L^{-1} + \mathcal{O}(‖Δ‖^2) \end{aligned}$$
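To see the quadratic remainder numerically, here is a small numpy sketch (my own illustration, not from the answer): the gap between $(L+tΔ)^{-1}$ and the first-order approximation $L^{-1} - L^{-1}(tΔ)L^{-1}$ should shrink roughly like $t^2$.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5
A = rng.standard_normal((n, n))
L = A @ A.T + n * np.eye(n)                    # symmetric positive definite
D = rng.standard_normal((n, n))
D = D + D.T                                    # symmetric perturbation Δ

Linv = np.linalg.inv(L)
for t in (1e-1, 1e-2, 1e-3):
    exact = np.linalg.inv(L + t * D)
    first_order = Linv - Linv @ (t * D) @ Linv
    # remainder is O(‖tΔ‖^2): the printed norm should drop by ~100x per row
    print(t, np.linalg.norm(exact - first_order))
```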

Consider the function $f(L) = L^{-1} xx^⊤ L^{-1}$, then

$$\begin{aligned} f(L+Δ) &= (L+Δ)^{-1}xx^⊤(L+Δ)^{-1} \\&= \big(L^{-1} - L^{-1}ΔL^{-1} + \mathcal{O}(‖Δ‖^2)\big)xx^⊤\big(L^{-1} - L^{-1}ΔL^{-1} + \mathcal{O}(‖Δ‖^2)\big) \\&= f(L) - L^{-1}ΔL^{-1}xx^⊤L^{-1} - L^{-1}xx^⊤L^{-1}ΔL^{-1} + \mathcal{O}(‖Δ‖^2) \end{aligned}$$

Hence, the derivative of $f$ at $L$ is the linear function $$\boxed{Δ ⟼ - L^{-1}ΔL^{-1}xx^⊤L^{-1} - L^{-1}xx^⊤L^{-1}ΔL^{-1}}$$
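A brief finite-difference check of this boxed map (a numpy sketch of my own; `f` and `df` are illustrative helper names, not part of the answer):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4
A = rng.standard_normal((n, n))
L = A @ A.T + n * np.eye(n)                    # symmetric positive definite
x = rng.standard_normal((n, 1))
D = rng.standard_normal((n, n))
D = D + D.T                                    # direction Δ

Linv = np.linalg.inv(L)

def f(M):
    Minv = np.linalg.inv(M)
    return Minv @ x @ x.T @ Minv

def df(D):
    # boxed map: Δ ↦ -L^{-1} Δ L^{-1} x x^T L^{-1} - L^{-1} x x^T L^{-1} Δ L^{-1}
    return -Linv @ D @ Linv @ x @ x.T @ Linv - Linv @ x @ x.T @ Linv @ D @ Linv

eps = 1e-6
fd = (f(L + eps * D) - f(L - eps * D)) / (2 * eps)   # central difference
print(np.max(np.abs(fd - df(D))))                    # tiny truncation error
```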


This function can be represented as a 4d tensor, using the formula $AXB^⊤ = (A⊗B)⋅X$,$^{(*)}$ where $(A⊗B)_{ij, kl} ≔ A_{ik}B_{jl}$ defines a 4d tensor and $\big((A⊗B)⋅X\big)_{ij} ≔ ∑_{kl}(A⊗B)_{ij, kl}X_{kl}$ is the contraction over the last two indices, yielding a matrix. Using the symmetry of $L$ we have:

$$\begin{align} Df(L) &= [Δ⟼- L^{-1}ΔL^{-1}xx^⊤L^{-1} - L^{-1}xx^⊤L^{-1}ΔL^{-1}] &&∈ \text{Lin}(ℝ^{n×n},ℝ^{n×n}) \\&≅ - L^{-1}⊗\big(L^{-1}xx^⊤L^{-1}\big) - \big(L^{-1}xx^⊤L^{-1}\big)⊗L^{-1} &&∈ ℝ^{n×n}⊗(ℝ^{n×n})^* \end{align}$$

(*): Note that one of the linked posts has $AXB^⊤ = (B^⊤⊗A)X$; this is just a result of a different convention for how $⊗$ is defined.
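For concreteness, here is a short numpy/einsum sketch (my own, with illustrative names) that builds the 4d tensor under the convention $(A⊗B)_{ij,kl} = A_{ik}B_{jl}$ and checks that contracting it with $Δ$ reproduces the boxed linear map:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4
A = rng.standard_normal((n, n))
L = A @ A.T + n * np.eye(n)                   # symmetric positive definite
x = rng.standard_normal((n, 1))
D = rng.standard_normal((n, n))
D = D + D.T                                   # direction Δ

Linv = np.linalg.inv(L)
S = Linv @ x @ x.T @ Linv                     # S = L^{-1} x x^T L^{-1}

def tprod(P, Q):
    # (P⊗Q)_{ij,kl} = P_{ik} Q_{jl}  -- the convention used in this answer
    return np.einsum("ik,jl->ijkl", P, Q)

Df = -(tprod(Linv, S) + tprod(S, Linv))       # 4d derivative tensor

lhs = np.einsum("ijkl,kl->ij", Df, D)         # Df · Δ
rhs = -Linv @ D @ S - S @ D @ Linv            # boxed linear map applied to Δ
print(np.max(np.abs(lhs - rhs)))              # agrees up to round-off
```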

Hyperplane