30

I have $x= \exp(At)$ where $A$ is a matrix. I would like to find the derivative of $x$ with respect to each element of $A$. Could anyone help with this problem?

7 Answers

43

Considering the expression $x = \exp(tA)$, I can think of two derivatives.

First, the derivative with respect to the real variable $t$ of the matrix-valued function $t \mapsto \exp(tA)$. Here the result is easily derived from direct calculation of the series definition of the matrix exponential: \begin{align} \frac{d}{dt} \exp(tA) &= \frac{d}{dt} \left[ I+tA+\frac{1}{2}t^2A^2+\frac{1}{3!}t^3A^3+ \cdots\right] \\ &= A+tA^2+\frac{1}{2}t^2A^3+ \cdots \\ &= A\exp(tA) \end{align} Thus, $\frac{d}{dt} \exp(tA) = A\exp(tA)$. (Edited to fix typo)
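
As a quick numerical sanity check of this identity, one can compare a central difference in $t$ against $A\exp(tA)$; the sketch below assumes `scipy.linalg.expm` and an arbitrary real test matrix:

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
t, dt = 0.7, 1e-6

# central difference in t versus the closed form A @ expm(t*A)
numeric = (expm((t + dt) * A) - expm((t - dt) * A)) / (2 * dt)
exact = A @ expm(t * A)
print(np.max(np.abs(numeric - exact)))  # small (roughly 1e-9 or below)
```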

Second, we can differentiate with respect to the component $A_{ij}$ of $A = \sum A_{ij} E_{ij}$, where $E_{ij}=e_ie_j^T$ is the matrix which is zero except for a $1$ in the $ij$-th spot; in other words, $(E_{ij})_{kl} = \delta_{ik}\delta_{jl}$. Essentially, I'll treat this as a directional derivative: compute the difference quotient along the $E_{ij}$ direction. Writing $f(t,A)=\exp(tA)$, $$ \frac{\partial f}{\partial A_{ij}}= \lim_{h \rightarrow 0}\frac{1}{h} \left[\exp(t(A+hE_{ij}))-\exp(tA)\right]$$ I expect this can be simplified.
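
Before simplifying analytically, the limit above can already be approximated numerically; the following sketch (my own illustration, using `scipy.linalg.expm` and an arbitrary matrix) just evaluates the difference quotient with a small $h$:

```python
import numpy as np
from scipy.linalg import expm

def dexp_fd(A, t, i, j, h=1e-6):
    """Forward-difference estimate of d/dA_{ij} exp(tA),
    i.e. the limit definition above with a small finite h."""
    E = np.zeros_like(A)
    E[i, j] = 1.0
    return (expm(t * (A + h * E)) - expm(t * A)) / h

A = np.array([[0.0, 1.0], [2.0, 3.0]])
print(dexp_fd(A, t=0.5, i=0, j=1))
```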

Ok, the matrix exponential satisfies the Baker-Campbell-Hausdorff relation: $$ \exp(A)\exp(B) = \exp\left(A+B+ \frac{1}{2}[A,B] + \cdots\right)$$ From this we derive the Zassenhaus formula, $$ \exp(A+B) = \exp(A)\exp(B)\exp\left(-\frac{1}{2}[A,B] + \cdots\right) $$ I'll use this to simplify $\exp( t(A+hE_{ij})) = \exp\left(tA+ thE_{ij}\right)$: $$ \exp\left(tA+ thE_{ij}\right) = \exp(tA)\exp\left( thE_{ij}\right) \exp\left( -\frac{1}{2}[tA,thE_{ij}]+ \cdots\right)$$ hence $$ \exp\left(tA+ thE_{ij}\right) = \exp(tA)\exp\left( thE_{ij} -\frac{1}{2}[tA,thE_{ij}]+ \cdots\right)$$ where I am omitting terms with $h^2,h^3,\dots$ as those vanish in the limit; I am also omitting terms with nested commutators of $A$, so the answer below gives just the first couple of terms in an infinite series flowing from the BCH relation.

\begin{align} \frac{\partial f}{\partial A_{ij}}&= -\lim_{h \rightarrow 0}\frac{1}{h} \left[\exp(tA)-\exp(t(A+hE_{ij}))\right] \\ &=-\lim_{h \rightarrow 0}\frac{1}{h} \left[\exp(tA)-\exp(tA)\exp\left( thE_{ij} -\frac{1}{2}[tA,thE_{ij}]+ \cdots\right) \right] \\ &=-\exp(tA)\lim_{h \rightarrow 0}\frac{1}{h} \left[I-\exp\left( thE_{ij} -\frac{1}{2}[tA,thE_{ij}]+ \cdots\right) \right] \\ &=-\exp(tA)\lim_{h \rightarrow 0}\frac{1}{h} \left[I-I-thE_{ij} +\frac{1}{2}[tA,thE_{ij}]+ \cdots \right] \\ &=-\exp(tA)\left[-tE_{ij} +\frac{t^2}{2}[A,E_{ij}]+ \cdots \right]. \end{align}

Note that the terms linear in $h$ do survive the limit, and there are such terms (indicated by the $+\cdots$) stemming from $[tA,[tA,hE_{ij}]]$, $[tA,[tA,[tA,hE_{ij}]]]$, etc. Now, you can calculate $[A,E_{ij}] = \sum_{k=1}^n \left(A_{ki}E_{kj}-A_{jk}E_{ik} \right)$, so $$ \frac{\partial f}{\partial A_{ij}} = -\exp(tA) \left[-tE_{ij}+ \frac{t^2}{2}\sum_{k=1}^n\left(A_{ki}E_{kj}-A_{jk}E_{ik}\right) +\cdots \right]$$

For what it's worth, you can simplify the nested commutator: $$ [A,[A,E_{ij}]] = \sum_{k,l=1}^n \left( A_{lk}A_{ki}E_{lj}-2A_{ki}A_{jl}E_{kl}+A_{jl}A_{lk}E_{ik} \right)$$ Or, in Einstein notation, $$ [A,[A,E_{ij}]] = (A^2)_{li}E_{lj}-2A_{ki}A_{jl}E_{kl}+(A^2)_{jk}E_{ik}.$$

Anyway, I hope this helps. Notice that if $i=j$ and $A$ is diagonal, or if simply $[A,E_{ij}]=0$, then we obtain: $$ \frac{\partial }{\partial A_{ij}} \exp(tA) = t\exp(tA)E_{ij}.$$ (I fixed the sign errors; sorry for all the weirdly placed minus signs. 9-18-21)
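
As a small aside (my addition, not part of the derivation above), the commutator identity $[A,E_{ij}]=\sum_{k}(A_{ki}E_{kj}-A_{jk}E_{ik})$ is easy to confirm numerically, for example with NumPy:

```python
import numpy as np

rng = np.random.default_rng(1)
n, i, j = 4, 1, 2
A = rng.standard_normal((n, n))

def E(k, l):
    M = np.zeros((n, n))
    M[k, l] = 1.0
    return M

lhs = A @ E(i, j) - E(i, j) @ A  # [A, E_ij] computed directly
rhs = sum(A[k, i] * E(k, j) - A[j, k] * E(i, k) for k in range(n))
print(np.allclose(lhs, rhs))  # True
```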

James S. Cook
  • 17,257
  • Thank you for the answer, James. I am looking for the derivative with respect to the component $A_{ij}$; I would appreciate further simplification. – darwin rajpal May 20 '15 at 14:06
  • @HAyAs thanks! Excellent edit. I wish more edits were substantial like yours. – James S. Cook Oct 17 '16 at 02:14
  • 2
    Thus the correct answer is $$ \frac{\partial }{\partial A_{ij}} \exp(tA) = t\exp(tA)E_{ij}.$$ Do you have a reference for this? I ask because a question on SE is not suitable as a reference and copying your proof without reference would be plagiarism. – my2cts Sep 07 '20 at 10:47
  • I wonder if someone can edit this answer so that it contains the correct answer, instead of finding it in a comment. – Robert Dodier Sep 17 '21 at 22:44
  • @RobertDodier I think it is fixed now. Honestly, it hurts my eyes to look at it currently. I'm sure it could be made more pretty. In any event, thanks for your interest in the calculation and edit. – James S. Cook Sep 18 '21 at 07:30
11

If $A\in{\mathbb R}^{n\times n}$, then you can use Higham's "Complex Step Approximation" to calculate each component $$ \frac {\partial f} {\partial A_{jk}} = {\rm Im}\bigg(\frac{f(A+ihE_{jk})}{h}\bigg) $$ where $f(A)={\rm exp}(tA)$ and $h=10^{-20}$.
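
For readers who want to try this numerically, a minimal sketch (assuming real $A$, with `scipy.linalg.expm` standing in for $f$) might look like:

```python
import numpy as np
from scipy.linalg import expm

def dexp_complex_step(A, t, j, k, h=1e-20):
    """Complex-step estimate of d/dA_{jk} of expm(t*A) for real A.
    There is no subtractive cancellation, so h can be taken extremely small."""
    Ah = A.astype(complex)          # astype returns a copy, A is untouched
    Ah[j, k] += 1j * h
    return expm(t * Ah).imag / h

# sanity check against the commuting special case t*exp(tA)*E_jj for diagonal A
A = np.diag([1.0, 2.0, 3.0])
E00 = np.zeros((3, 3))
E00[0, 0] = 1.0
print(np.allclose(dexp_complex_step(A, 0.5, 0, 0), 0.5 * expm(0.5 * A) @ E00))  # True
```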

greg
  • 718
  • 3
    Somebody downvoted this post; one wonders why... Indeed, the presented numerical method is very effective. –  Oct 17 '16 at 10:00
8

According to "Derivatives of the Matrix Exponential and Their Computation" (which references Karplus, Schwinger, Feynman, Bellman, and Snider), the derivative can be expressed as the linear map (i.e., the Fréchet derivative)

$$\frac{\rm{d} e^{At}}{\rm{d} A} = \Big(V\longmapsto\int_{0}^t e^{A(t-\tau)}Ve^{A\tau}\,\rm{d}\tau\Big)$$
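
One way to gain confidence in this formula is to evaluate the integral numerically and compare it with a plain finite-difference estimate; the sketch below is my own illustration, using `scipy.integrate.quad_vec` and a random direction $V$:

```python
import numpy as np
from scipy.linalg import expm
from scipy.integrate import quad_vec

rng = np.random.default_rng(2)
A = rng.standard_normal((3, 3))
V = rng.standard_normal((3, 3))   # direction in which the Fréchet derivative acts
t = 0.4

# integral form of the derivative quoted above
frechet, _ = quad_vec(lambda tau: expm(A * (t - tau)) @ V @ expm(A * tau), 0.0, t)

# crude finite-difference directional derivative for comparison
h = 1e-6
fd = (expm((A + h * V) * t) - expm(A * t)) / h
print(np.max(np.abs(frechet - fd)))  # small, of order h
```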

Hyperplane
  • 12,204
4

We may verify Hyperplane’s answer directly. The Fréchet derivative of the matrix exponential $$ f:A\mapsto e^A=I+A+\frac{1}{2!}A^2+\frac{1}{3!}A^3+\cdots $$ is \begin{align*} Df(A):H&\mapsto H+\frac{1}{2!}(HA+AH)+\frac{1}{3!}(HA^2+HAH+AH^2)+\cdots\\ &=\sum_{k=0}^\infty \sum_{r=0}^k \frac{1}{(k+1)!} A^r H A^{k-r}\\ &=\sum_{k=0}^\infty \sum_{r=0}^k \frac{B(k-r+1, r+1)}{r!(k-r)!} A^r H A^{k-r}\\ &=\sum_{k=0}^\infty \sum_{r=0}^k \frac{\int_0^1 s^r(1-s)^{k-r} ds}{r!(k-r)!} A^{k-r} H A^r\\ &=\int_0^1 \sum_{k=0}^\infty \sum_{r=0}^k \frac{(1-s)^{k-r}s^r}{r!(k-r)!}A^{k-r} H A^r ds\\ &=\int_0^1 \left(\sum_{r=0}^\infty \frac{1}{r!}(1-s)^rA^r\right) H \left(\sum_{r=0}^\infty \frac{1}{r!}s^rA^r\right) ds\\ &=\int_0^1 e^{(1-s)A} H e^{sA} ds, \end{align*} where $B$ denotes the Beta function and the step introducing the integral also re-indexes $r\mapsto k-r$ in the inner sum. For $g(A)=e^{tA}=(f\circ L)(A)$ where $L(A)=tA$, we have $$ Dg(A)(H)=Df\big(L(A)\big)DL(A)(H)=tDf(tA)(H). $$ Therefore \begin{align*} Dg(A)(H) =\sum_{k=0}^\infty \sum_{r=0}^k \frac{t^{k+1}}{(k+1)!} A^r H A^{k-r} =t\int_0^1 e^{(1-s)tA} H e^{stA} ds =\int_0^t e^{(t-\tau)A} H e^{\tau A} d\tau. \end{align*} In particular, when $H=E_{ij}$, $$ \frac{\partial e^{tA}}{\partial A_{ij}}=\sum_{k=0}^\infty \sum_{r=0}^k \frac{t^{k+1}}{(k+1)!} A^r E_{ij} A^{k-r} =\int_0^t e^{(t-\tau)A} E_{ij} e^{\tau A} d\tau. $$
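
The displayed identity between the double series and the integral is also easy to check numerically; the sketch below (my addition, truncating the series at 20 terms and using `scipy.integrate.quad_vec`) compares the two for a small random matrix:

```python
import numpy as np
from math import factorial
from scipy.linalg import expm
from scipy.integrate import quad_vec

rng = np.random.default_rng(3)
A = 0.5 * rng.standard_normal((3, 3))
t, i, j = 0.7, 0, 2
E = np.zeros((3, 3))
E[i, j] = 1.0

# truncated series  sum_k sum_{r<=k} t^(k+1)/(k+1)! A^r E A^(k-r)
series = np.zeros((3, 3))
for k in range(20):
    for r in range(k + 1):
        series += (t ** (k + 1) / factorial(k + 1)
                   * np.linalg.matrix_power(A, r) @ E @ np.linalg.matrix_power(A, k - r))

# integral form  int_0^t e^((t-tau)A) E e^(tau A) dtau
integral, _ = quad_vec(lambda tau: expm((t - tau) * A) @ E @ expm(tau * A), 0.0, t)
print(np.max(np.abs(series - integral)))  # near machine precision for this A and t
```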

user1551
  • 149,263
1

Edit: see comment by loup blanc below for explanation of why I am incorrect.

The top answer's claim that $D_{A_{ij}}\exp(A)=\exp(A) E_{ij}$ seems incorrect. In particular, I think the assumption

\begin{equation} \exp(A)\exp(B) = \exp\left(A+B+ \frac{1}{2}[A,B] + \cdots\right) \\ \Rightarrow \exp(A+B) = \exp(A)\exp(B)\exp\left(-\frac{1}{2}[A,B] + \cdots\right) \end{equation}

is wrong. Maybe I just can't see it, but I think the logic assumes $\exp(A+B)=\exp(A)\exp(B)$, which is not true for general matrices.

Otherwise, this related question seems to have correct answers: Derivative of the matrix exponential with respect to its matrix argument

jake
  • 13
  • This appears to be commentary on other answers and a link to other similar answers, rather than a direct answer to the question being asked. If you think that the other answers are wrong, you should either reply to those answers and hope that the original authors reappear to clarify, or post a new answer that is complete (and doesn't simply link to other questions). – Xander Henderson Nov 08 '17 at 04:14
  • Your "top answer" is correct because it is assumed that $A,E_{i,j}$ commute. Yet, the second formula is badly written. By Zassenhaus, $e^{A+B}=e^{A}e^{B}e^{-1/2[A,B]}e^{1/3[B,[A,B]]+1/6[A,[A,B]]}e^{\cdots}\cdots$ –  Nov 11 '17 at 15:42
1

I arrive at the following. The $ij$ element of $e^A$ is $$\left(e^A\right)_{ij} = \sum_{n=0}^\infty \frac{1}{n!} \sum_{k_1,\dots,k_{n-1}} A_{ik_1}A_{k_1k_2}\cdots A_{k_{n-1}j} ~.$$ Each matrix element can be seen as an independent variable, so the derivative with respect to $A_{kl}$ is $$\frac{\partial \left(e^A\right)_{ij}}{\partial A_{kl}} = \sum_{n=0}^\infty \frac{1}{n!} \sum_{p=0}^{n-1} \left(A^p\right)_{ik} \left(A^{n-1-p}\right)_{lj} ~.$$

For my purposes, $t$ is less relevant. It can easily be added in without changing my result.

my2cts
  • 158
0

It might be interesting to posit a more formal answer based on an application of the general chain rule. Consider this: if $A = \sum_{kl} A_{kl}E_{kl}$ then $$ \frac{\partial A}{\partial A_{ij}} = \sum_{kl} \frac{\partial A_{kl}}{\partial A_{ij}}E_{kl} = \sum_{kl} \delta_{ik}\delta_{jl}E_{kl} = E_{ij}. $$ Thus, by the chain rule, we calculate: $$ \frac{\partial }{\partial A_{ij}}\exp(tA) = \exp(tA)\frac{\partial}{\partial A_{ij}}(tA) = \exp(tA)\,tE_{ij} = t\exp(tA)E_{ij}. $$ Of course, unless we understand how to prove this chain rule, the calculation above is just formal nonsense. Is this formal nonsense, or is it the legitimate use of a chain rule which circumvents the lengthy calculation of my other answer?
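
One way to probe this question numerically (my own addition) is to compare $t\exp(tA)E_{ij}$ with a finite-difference estimate of the derivative; in line with the comment under another answer, agreement should only be expected when $A$ and $E_{ij}$ commute:

```python
import numpy as np
from scipy.linalg import expm

def fd_derivative(A, t, i, j, h=1e-7):
    """Finite-difference estimate of d/dA_{ij} exp(tA)."""
    E = np.zeros_like(A)
    E[i, j] = 1.0
    return (expm(t * (A + h * E)) - expm(t * A)) / h

t, i, j = 0.5, 0, 1
E01 = np.zeros((2, 2))
E01[0, 1] = 1.0

A_commuting = np.diag([1.0, 1.0])               # here [A, E_01] = 0
A_general = np.array([[0.0, 1.0], [2.0, 3.0]])  # here [A, E_01] != 0

for A in (A_commuting, A_general):
    formal = t * expm(t * A) @ E01              # the chain-rule formula above
    print(np.allclose(fd_derivative(A, t, i, j), formal, atol=1e-5))
# prints True for the commuting case, False for the general one
```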

James S. Cook
  • 17,257