I wish to find the derivative of $L^{-1}xx^TL^{-1}$ with respect to the symmetric positive definite matrix $L$, where $x$ is a vector. How do I proceed?
-
Welcome to MSE. Your question is phrased as an isolated problem, without any further information or context. This does not match many users' quality standards, so it may attract downvotes, or be closed. To prevent that, please [edit] the question. This will help you recognize and resolve the issues. Concretely: please provide context, and include your work and thoughts on the problem. These changes can help in formulating more appropriate answers. – José Carlos Santos Jun 13 '23 at 11:13
-
Quick beginner guide for asking a well-received question + please avoid "no clue" questions – Anne Bauval Jun 13 '23 at 11:57
-
Related – Rodrigo de Azevedo Jun 13 '23 at 12:08
-
What kind of derivative? A Jacobian-like thing would be 4-dimensional – Rodrigo de Azevedo Jun 13 '23 at 12:09
2 Answers
$
\def\E{{\cal E}} \def\d{\delta} \def\L{L^{-1}}
\def\LR#1{\left(#1\right)}
\def\op#1{\operatorname{#1}}
\def\vc#1{\op{vec}\LR{#1}}
\def\trace#1{\op{Tr}\LR{#1}}
\def\frob#1{\left\| #1 \right\|_F}
\def\qiq{\quad\implies\quad} \def\k{\otimes}
\def\p{\partial} \def\grad#1#2{\frac{\p #1}{\p #2}}
\def\c#1{\color{red}{#1}}
\def\CLR#1{\c{\LR{#1}}}
$

Given the symmetric matrix-valued function
$$\eqalign{
F &= \L xx^T\L \;\doteq\; F^T \qquad \qquad \qquad \qquad \qquad \qquad
}$$
calculate its differential with respect to $L$ (which is also symmetric)
$$\eqalign{
dF &= \CLR{d\L}xx^T\L + \L xx^T\CLR{d\L} \\
&= \CLR{-\L\,dL\:\L}xx^T\L + \L xx^T\CLR{-\L\,dL\:\L} \\
&= -\LR{\L\,dL\:F + F\:dL\:\L} \\
}$$
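As a quick sanity check of the differential above, here is a minimal NumPy sketch that compares the formula with a finite difference. The test data (`L`, `x`, `dL`), the helper `F_of`, and the step size are illustrative assumptions, not part of the derivation.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4

# Illustrative data: a random symmetric positive definite L and a random x.
A = rng.standard_normal((n, n))
L = A @ A.T + n * np.eye(n)
x = rng.standard_normal(n)

def F_of(M):
    """Hypothetical helper: F(M) = M^{-1} x x^T M^{-1}."""
    Minv = np.linalg.inv(M)
    return Minv @ np.outer(x, x) @ Minv

# Small symmetric perturbation of L.
B = rng.standard_normal((n, n))
dL = 1e-6 * (B + B.T)

Linv = np.linalg.inv(L)
F = F_of(L)

# dF = -(L^{-1} dL F + F dL L^{-1})
dF_formula = -(Linv @ dL @ F + F @ dL @ Linv)
dF_numeric = F_of(L + dL) - F_of(L)

rel_err = np.linalg.norm(dF_numeric - dF_formula) / np.linalg.norm(dF_formula)
print(rel_err)  # expected to be small, on the order of the perturbation size
```

The remaining discrepancy is the second-order term in `dL`, so it should shrink as the perturbation is made smaller.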
Vectorize this expression and rearrange it into a matrix-valued gradient
$$\eqalign{
d\ell &= \vc{dL} \\
df &= \vc{dF} \;=\; -\LR{F\k\L + \L\k F} d\ell \\
\grad{f}{\ell} &= -\LR{F\k\L + \L\k F} \\\\
}$$
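The vectorized form can be checked numerically as well. The sketch below assumes the column-stacking convention for $\operatorname{vec}$ (NumPy's `order="F"`), under which $\operatorname{vec}(AXB) = (B^T\otimes A)\operatorname{vec}(X)$; the random test data and the helper `vec` are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3

# Illustrative data: symmetric positive definite L and a vector x.
A = rng.standard_normal((n, n))
L = A @ A.T + n * np.eye(n)
x = rng.standard_normal(n)

Linv = np.linalg.inv(L)
F = Linv @ np.outer(x, x) @ Linv

def vec(M):
    """Column-stacking vectorization (assumed convention)."""
    return M.flatten(order="F")

# Matrix-valued gradient: -(F kron L^{-1} + L^{-1} kron F), of size n^2 x n^2.
G = -(np.kron(F, Linv) + np.kron(Linv, F))

# Any symmetric direction dL; the map is linear, so its size does not matter.
B = rng.standard_normal((n, n))
dL = B + B.T
dF = -(Linv @ dL @ F + F @ dL @ Linv)

print(np.allclose(G @ vec(dL), vec(dF)))
```

With a row-stacking `vec` the two Kronecker factors would swap places, but since the sum contains both orderings the matrix `G` is the same either way.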
Another approach is to introduce the fourth-order identity tensor $\E$ with
components
$$\eqalign{
\E_{ijkl} = \grad{L_{ij}}{L_{kl}} \;=\; \d_{ik}\,\d_{jl}
}$$
and calculate a tensor-valued gradient
$$\eqalign{
dF &= -\LR{\L\E F + F\E\L}:dL \\
\grad{F}{L} &= -\LR{\L\E F + F\E\L} \\\\
}$$
Yet another approach is to write the differential using index notation.
This yields the scalar components of the gradient
$$\eqalign{
\grad{F_{ij}}{L_{kl}} &= -\LR{\L_{ik}F_{jl} + F_{ik}\L_{jl}} \\
}$$
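The component formula lends itself to a direct check with `np.einsum`. The sketch below builds the four-index array and compares each slice with a central finite difference in which the entries of $L$ are perturbed independently (that is, ignoring the symmetry constraint, as the formula does). The data, the helper `F_of`, and the step `h` are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 3

# Illustrative data: symmetric positive definite L and a vector x.
A = rng.standard_normal((n, n))
L = A @ A.T + n * np.eye(n)
x = rng.standard_normal(n)

def F_of(M):
    """Hypothetical helper: F(M) = M^{-1} x x^T M^{-1}."""
    Minv = np.linalg.inv(M)
    return Minv @ np.outer(x, x) @ Minv

Linv = np.linalg.inv(L)
F = F_of(L)

# T[i, j, k, l] = -(Linv[i, k] * F[j, l] + F[i, k] * Linv[j, l])
T = -(np.einsum("ik,jl->ijkl", Linv, F) + np.einsum("ik,jl->ijkl", F, Linv))

# Central differences, perturbing one entry L[k, l] at a time.
h = 1e-6
T_fd = np.empty((n, n, n, n))
for k in range(n):
    for l in range(n):
        E = np.zeros((n, n))
        E[k, l] = 1.0
        T_fd[:, :, k, l] = (F_of(L + h * E) - F_of(L - h * E)) / (2 * h)

print(np.max(np.abs(T - T_fd)))  # expected to be close to zero
```

Contracting `T` with a direction `dL` via `np.einsum("ijkl,kl->ij", T, dL)` reproduces the differential $dF$.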
First, note that the derivative of $g(L)=L^{-1}$ is the linear function $Δ↦ -L^{-1}ΔL^{-1}$, which follows from
$$\begin{aligned} g(L+Δ) &= (L+Δ)^{-1} \\&= \big(L(I+L^{-1}Δ)\big)^{-1} \\&= (I+L^{-1}Δ)^{-1}L^{-1} \\&= \big(I - L^{-1}Δ + \mathcal{O}(‖Δ‖^2)\big) L^{-1} \qquad\text{(Neumann series)} \\&= g(L) - L^{-1} Δ L^{-1} + \mathcal{O}(‖Δ‖^2) \end{aligned}$$
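One way to see the second-order remainder numerically: if the linearization of $g$ is right, the error $g(L+tΔ) - g(L) + t\,L^{-1}ΔL^{-1}$ should shrink by a factor of about 100 when $t$ shrinks by a factor of 10. A minimal sketch, with randomly generated `L` and `Delta` as illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4

# Illustrative data: symmetric positive definite L, unit-norm symmetric direction.
A = rng.standard_normal((n, n))
L = A @ A.T + n * np.eye(n)
B = rng.standard_normal((n, n))
Delta = B + B.T
Delta /= np.linalg.norm(Delta)

Linv = np.linalg.inv(L)

def remainder(t):
    """Frobenius norm of g(L + t*Delta) - g(L) - (-t * L^{-1} Delta L^{-1})."""
    exact = np.linalg.inv(L + t * Delta)
    linear = Linv - t * (Linv @ Delta @ Linv)
    return np.linalg.norm(exact - linear)

r1 = remainder(1e-2)
r2 = remainder(1e-3)
ratio = r1 / r2
print(ratio)  # expected to be roughly 100 if the remainder is second order
```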
Now consider the function $f(L) = L^{-1} xx^⊤ L^{-1}$. Then
$$\begin{aligned} f(L+Δ) &= (L+Δ)^{-1}xx^⊤(L+Δ)^{-1} \\&= \big(L^{-1} - L^{-1}ΔL^{-1} + \mathcal{O}(‖Δ‖^2)\big)xx^⊤\big(L^{-1} - L^{-1}ΔL^{-1} + \mathcal{O}(‖Δ‖^2)\big) \\&= f(L) - L^{-1}ΔL^{-1}xx^⊤L^{-1} - L^{-1}xx^⊤L^{-1}ΔL^{-1} + \mathcal{O}(‖Δ‖^2) \end{aligned}$$
Hence, the derivative of $f$ at $L$ is the linear function $$\boxed{Δ ⟼ - L^{-1}ΔL^{-1}xx^⊤L^{-1} - L^{-1}xx^⊤L^{-1}ΔL^{-1}}$$
This function can be represented as a 4d tensor, using the formula $AXB^⊤ = (A⊗B)⋅X$ $^{(*)}$, where $(A⊗B)_{ij, kl} ≔ A_{ik}B_{jl}$ is a 4d tensor and $\big((A⊗B)⋅X\big)_{ij} ≔ ∑_{kl}(A⊗B)_{ij, kl}X_{kl}$ is a 2d tensor contraction. Using the symmetry of $L$ we have:
$$\begin{aligned} \mathrm{D}f(L) &= [Δ⟼- L^{-1}ΔL^{-1}xx^⊤L^{-1} - L^{-1}xx^⊤L^{-1}ΔL^{-1}] &&∈ \text{Lin}(ℝ^{n×n},ℝ^{n×n}) \\&≅ - L^{-1}⊗L^{-1}xx^⊤L^{-1} - L^{-1}xx^⊤L^{-1}⊗L^{-1} &&∈ ℝ^{n×n}⊗(ℝ^{n×n})^* \end{aligned}$$
(*): Note that one of the linked posts has $\operatorname{vec}(AXB) = (B^⊤⊗A)\operatorname{vec}(X)$ instead; this is just the result of a different convention for how $⊗$ is defined.
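To make the convention issue concrete, here is a small NumPy sketch. It checks the contraction formula $AXB^⊤ = (A⊗B)⋅X$ and shows how the same 4d tensor relates to `np.kron` under row-stacking versus column-stacking vectorization. The random matrices are illustrative, and the reading of the linked post's convention as column-stacking is an assumption.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 3
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, n))
X = rng.standard_normal((n, n))

# 4d tensor with components (A ⊗ B)_{ij,kl} = A_ik * B_jl.
T = np.einsum("ik,jl->ijkl", A, B)

lhs = A @ X @ B.T
rhs = np.einsum("ijkl,kl->ij", T, X)  # contraction over the index pair (k, l)
contraction_ok = np.allclose(lhs, rhs)

# Flattening (i, j) and (k, l) in row-major order gives exactly np.kron(A, B).
same_as_kron = np.allclose(T.reshape(n * n, n * n), np.kron(A, B))

# Row-stacking vec: vec(A X B^T) = (A kron B) vec(X).
row_major_ok = np.allclose(np.kron(A, B) @ X.flatten(order="C"),
                           lhs.flatten(order="C"))

# Column-stacking vec: vec(A X B^T) = (B kron A) vec(X).
col_major_ok = np.allclose(np.kron(B, A) @ X.flatten(order="F"),
                           lhs.flatten(order="F"))

print(contraction_ok, same_as_kron, row_major_ok, col_major_ok)
```

So the two formulas describe the same linear map; they differ only in how the index pairs are flattened into a single index.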