Higher derivatives of the map $I:T \mapsto T^{-1}$, where $T \in \mathcal B(X)$.

Question

Let $X$ and $Y$ be Banach spaces, and let $\mathcal B(X;Y)$ denote the set of bounded linear operators $X\to Y$. Consider the inversion map $I:U\subset\mathcal B(Y;X)\to \mathcal B(X;Y)$ defined by $I(T) = T^{-1}$, where $U$ is the set of invertible operators. It is known (e.g. here) that $I$ is (Fréchet) differentiable with $$ I'(T)[A] = -T^{-1}AT^{-1}, $$ where $I'(T)$ is viewed as an element of $\mathcal B(\mathcal B(Y;X);\mathcal B(X;Y))$.

How do we prove that the $k^{\text{th}}$-derivative of $I$ is the $k$-multilinear map $$ (A_1,\dots,A_k) \mapsto (-1)^{k} \sum_{\sigma\in S_k} T^{-1}A_{\sigma(1)}T^{-1}\dots T^{-1}A_{\sigma(k)} T^{-1}, $$ where the sum is over all permutations $\sigma$ of $\{1,\dots,k\}$?
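As a quick sanity check of the claimed formula (an illustration only, not taken from the cited book), the case $k=2$ reads $$ I''(T)[A_1,A_2] = T^{-1}A_1T^{-1}A_2T^{-1} + T^{-1}A_2T^{-1}A_1T^{-1}, $$ and in the scalar case $X=Y=\mathbb R$, where everything commutes, all $k!$ terms coincide and the formula reduces to the familiar $$ \frac{d^k}{dt^k}\,\frac{1}{t} = (-1)^k\,k!\,t^{-(k+1)}. $$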

This formula is given in a book by Hörmander without proof (as usual). It looks like a symmetrization of the higher-order terms in the Taylor expansion of $I$ (some details can be found in this thread).

To obtain the higher-order derivatives, I tried to differentiate $I'$ by writing $I' = -M\circ I$, where $M(T)[A] = TAT$, and repeatedly applying the chain rule. However, the higher derivatives of $M$ get ugly really fast (or I just don't know a clean way to write them down). Is there a nice way to prove this result?

I have actually managed to derive it myself yesterday. If no one answers the question soon I might consider writing it down myself (in case someone may find it useful). — BigbearZzz, May 06 '20 at 14:08

score 4 · Accepted Answer · edited Oct 12 '22 at 12:19

For small enough $t$ (say $|t| < 1/\|T^{-1}S\|$) we have the power series expansion, writing $f$ for the inversion map $I$:

$$ f(T + tS) = (T + tS)^{-1} = \bigl(T(1 - (-tT^{-1}S))\bigr)^{-1} = \sum_{k=0}^{\infty} (-1)^k (T^{-1}S)^k T^{-1} t^k. $$

From this we see that

$$ (D^k f)|_{T}(S, \dots, S) = \left( \frac{d}{dt} \right)^k f(T + tS)|_{t=0} = k! (-1)^k (T^{-1} S)^k T^{-1}. $$

Now we know that $D^k f|_{T}(S_1,\dots,S_k)$ is symmetric and uniquely determined by the diagonal values $D^k f|_{T}(S,\dots,S)$ via the polarization identity, so we can just guess that $$ D^k f|_{T}(S_1,\dots,S_k) = (-1)^k \sum_{\sigma \in S_k} T^{-1} S_{\sigma(1)} \cdots T^{-1} S_{\sigma(k)} T^{-1}. $$

Since this expression is $k$-linear and symmetric in $S_1,\dots,S_k$ and coincides with our expression when $S_1 = \dots = S_k = S$, our guess must hold.
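For reference, one standard form of the polarization identity used here, for a symmetric $k$-linear map $B$ over a field of characteristic zero, is the following (the index $\varepsilon$ and the notation $S_\varepsilon$ are introduced only for this display): $$ B(S_1,\dots,S_k) = \frac{1}{k!}\sum_{\varepsilon\in\{0,1\}^k} (-1)^{k-(\varepsilon_1+\dots+\varepsilon_k)}\, B(S_\varepsilon,\dots,S_\varepsilon), \qquad S_\varepsilon = \sum_{j=1}^{k} \varepsilon_j S_j. $$ For $k=2$ this is the familiar $B(S_1,S_2) = \tfrac12\bigl[B(S_1+S_2,S_1+S_2) - B(S_1,S_1) - B(S_2,S_2)\bigr]$, which shows explicitly that the diagonal values determine $B$.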

answered May 06 '20 at 14:56 by levap · edited Oct 12 '22 at 12:19 by 3nondatur

This method is not so direct, but the way you argued by symmetry is very beautiful nonetheless. I would expect this line of reasoning to work for similar problems whenever our function has a continuous $k^{\text{th}}$-order derivative? – BigbearZzz May 06 '20 at 15:05
Thanks, this was inspired by what you wrote in the question itself. The part of the argument that generalizes is this: suppose you want an expression for a symmetric $k$-multilinear form $B \colon V \times \cdots \times V \rightarrow \mathbb{F}$, where $\mathbb{F}$ has characteristic zero, and you have an expression for $B(v,\dots,v)$ from which you can guess a symmetric $k$-multilinear candidate for $B(v_1,\dots,v_k)$. Then your guess is correct. In particular, you can apply this to $D^k f$ if you can compute its diagonal values; I just used the power series expansion to get all the $D^k f(v,\dots,v)$'s at once. – levap May 06 '20 at 15:14
But I don't really know if this is useful; I have never seen this argument used anywhere. – levap May 06 '20 at 15:14
I have decided to include my own answer as well, using direct calculation. I am accepting your answer though as it is very nice. – BigbearZzz May 07 '20 at 17:14

BigbearZzz · Answer 2 · 2020-05-07T17:19:30.973

I decided to post an answer to my own question as an addition to the already good answer by levap. My method is a direct derivation by induction, based on the fact that $I'(T)[A] = -T^{-1}AT^{-1}$.

The base case $k=1$ is already covered by the above formula (whose proof can be found here). Now, assume that $$ I^{(k)}(T)[A_1,\dots,A_k] = (-1)^{k} \sum_{\sigma\in S_k} T^{-1}A_{\sigma(1)}T^{-1}\dots T^{-1}A_{\sigma(k)} T^{-1} $$ holds, where $S_k$ is the symmetric group on $k$ letters. I will rewrite it as

$$ I^{(k)}(T)[A_1,\dots,A_k] = (-1)^{k} \sum_{\sigma\in S_k} (M_{k,\sigma}\circ I)(T)[A_1,\dots,A_k], $$

where $M_{k,\sigma}(T)$ is the $k$-linear map $ M_{k,\sigma}(T)[A_1,\dots,A_k] = T A_{\sigma(1)}T \dots T A_{\sigma(k)} T. $

Expanding the product and collecting the terms that are linear in $S$, we see that $$\begin{align} M_{k,\sigma}(T+S)&[A_1,\dots,A_k] - M_{k,\sigma}(T)[A_1,\dots,A_k] \\ = \ \ \ & (S A_{\sigma(1)}TA_{\sigma(2)}T \dots T A_{\sigma(k)} T) + (T A_{\sigma(1)}S A_{\sigma(2)}T \dots T A_{\sigma(k)} T) + \dots \\ &\ \ \ + (T A_{\sigma(1)}TA_{\sigma(2)} T \dots T A_{\sigma(k)} S) + o(\|S\|), \end{align}$$ which implies that the derivative of $M_{k,\sigma}$ is given by
$$\begin{align} M'_{k,\sigma}(T)[A_1,\dots,A_k,B] &= (B A_{\sigma(1)}TA_{\sigma(2)}T \dots T A_{\sigma(k)} T) + (T A_{\sigma(1)}B A_{\sigma(2)}T \dots T A_{\sigma(k)} T) + \dots \\ &\ \ \ + (T A_{\sigma(1)}TA_{\sigma(2)} T \dots T A_{\sigma(k)} B). \end{align}$$

By the chain rule (for multilinear maps), we have $$\begin{align} (M_{k,\sigma}\circ I)'(T)[A_1,\dots,A_k,B] &= (M'_{k,\sigma}\circ I)(T)[A_1,\dots,A_k,I'(T)[B]] \\ &= (M'_{k,\sigma})(T^{-1})[A_1,\dots,A_k,-T^{-1}BT^{-1}] \\ &= (-T^{-1}BT^{-1}) A_{\sigma(1)}T^{-1} A_{\sigma(2)}T^{-1} \dots T^{-1} A_{\sigma(k)} T^{-1} + \dots \\ &\ \ \ \ + T^{-1} A_{\sigma(1)}T^{-1} A_{\sigma(2)} T^{-1} \dots T^{-1} A_{\sigma(k)} (-T^{-1}BT^{-1}) \end{align}$$
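To make the pattern concrete, here is the smallest instance written out (a worked illustration under the notation above; $M_1$ denotes $M_{1,\mathrm{id}}$). For $k=1$ we have $M_1(T)[A] = TAT$ and $M_1'(T)[A,B] = BAT + TAB$, so $$ (M_1\circ I)'(T)[A,B] = -\,T^{-1}BT^{-1}AT^{-1} \;-\; T^{-1}AT^{-1}BT^{-1}. $$ Since $I' = -M_1\circ I$, multiplying by $(-1)^1$ gives $I''(T)[A,B] = T^{-1}BT^{-1}AT^{-1} + T^{-1}AT^{-1}BT^{-1}$, which is the claimed formula for $k=2$.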

Lastly, we apply the above formula in the inductive step to get $$\begin{align} I^{(k+1)}(T)[A_1,\dots,A_k,A_{k+1}] &= (-1)^{k} \sum_{\sigma\in S_k} (M_{k,\sigma}\circ I)'(T)[A_1,\dots,A_k,A_{k+1}] \\ &= (-1)^{k} \sum_{\sigma\in S_k} (-T^{-1}A_{k+1}T^{-1} A_{\sigma(1)}T^{-1} A_{\sigma(2)}T^{-1} \dots T^{-1} A_{\sigma(k)} T^{-1} - \dots \\ &\quad\quad\quad\quad\quad\quad - T^{-1} A_{\sigma(1)}T^{-1} A_{\sigma(2)} T^{-1} \dots T^{-1} A_{\sigma(k)}T^{-1}A_{k+1} T^{-1}) \\ &= (-1)^{k+1} \sum_{\rho\in S_{k+1}} (M_{k+1,\rho}\circ I)(T)[A_1,\dots,A_{k+1}]. \end{align}$$ The last equality holds because inserting $A_{k+1}$ into each of the $k+1$ possible positions of each of the $k!$ orderings $A_{\sigma(1)},\dots,A_{\sigma(k)}$ produces every ordering of $A_1,\dots,A_{k+1}$ exactly once, so the terms run over all $\rho\in S_{k+1}$. This concludes the proof.
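The formula can also be checked numerically in finite dimensions. The following is a minimal sketch, assuming $X=Y=\mathbb R^2$ so that operators are $2\times 2$ matrices; all helper names (`matmul`, `inv2`, `formula`, `second_difference`) and the test matrices are hypothetical choices made for this illustration. It compares the permutation-sum formula for $k=2$ with a mixed central finite difference of $(s,t)\mapsto (T+sA_1+tA_2)^{-1}$.

```python
from itertools import permutations

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    """Entrywise sum of two matrices."""
    return [[x + y for x, y in zip(r, s)] for r, s in zip(a, b)]

def scale(c, a):
    """Scalar multiple of a matrix."""
    return [[c * x for x in r] for r in a]

def inv2(a):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    (p, q), (r, s) = a
    det = p * s - q * r
    return [[s / det, -q / det], [-r / det, p / det]]

def formula(T, As):
    """(-1)^k * sum over orderings of T^{-1} A_{s(1)} T^{-1} ... A_{s(k)} T^{-1}."""
    Ti = inv2(T)
    total = [[0.0, 0.0], [0.0, 0.0]]
    for perm in permutations(As):
        term = Ti
        for A in perm:
            term = matmul(matmul(term, A), Ti)
        total = add(total, term)
    return scale((-1) ** len(As), total)

def second_difference(T, A1, A2, h=1e-4):
    """Mixed central difference approximating I''(T)[A1, A2]."""
    def inv_at(s, t):
        return inv2(add(T, add(scale(s, A1), scale(t, A2))))
    plus = add(inv_at(h, h), inv_at(-h, -h))
    minus = add(inv_at(h, -h), inv_at(-h, h))
    return scale(1.0 / (4 * h * h), add(plus, scale(-1.0, minus)))

def max_abs_diff(a, b):
    """Largest entrywise absolute difference between two matrices."""
    return max(abs(x - y) for r, s in zip(a, b) for x, y in zip(r, s))

# Arbitrary test data: T is invertible (determinant 6).
T = [[2.0, 1.0], [0.0, 3.0]]
A1 = [[1.0, 2.0], [3.0, 4.0]]
A2 = [[0.0, 1.0], [1.0, 0.0]]
err = max_abs_diff(formula(T, [A1, A2]), second_difference(T, A1, A2))
print("max entrywise discrepancy for k = 2:", err)
```

The discrepancy should be small, limited by the finite-difference step `h` and floating-point roundoff. This is only a sanity check in one small example, not a substitute for the proof.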
