
Given a matrix $B = \exp(A)$, how do we express $B_{ml}$ in terms of $A_{ij}$? If $A$ is a diagonal matrix, this is very straightforward: $B_{ii} = \exp(A_{ii})$. Can we generalize this to an arbitrary real square matrix $A \in \mathbb{R}^{N \times N}$?

Motivation: I am struggling to find an expression for $\frac{\partial B}{\partial A_{ij}} = \sum_m \sum_l \frac{\partial B_{ml}}{\partial A_{ij}}$. Knowing how to express $B_{ml}$ in terms of $A_{ij}$ would give me a lead. If there is any other way to compute $\sum_m \sum_l \frac{\partial B_{ml}}{\partial A_{ij}}$ without explicitly knowing $B_{ml}$, that would also be awesome.

I've looked into some old posts like this one, but I am not sure whether they are applicable in my case. My lack of Lie algebra knowledge seems to be a problem here.

2 Answers


If $A$ is not defective, then it can be decomposed as $$\eqalign{ \def\Mi{M^{-1}} \def\Mt{M^T} \def\Mit{M^{-T}} \def\k{\otimes} \def\h{\odot} \def\c{\cdot} \def\LR#1{\left(#1\right)} \def\l{\lambda} \def\Diag{\operatorname{Diag}} \def\vc{\operatorname{vec}} \def\p{\partial} \def\grad#1#2{\frac{\p #1}{\p #2}} A &= M\,L\,\Mi\qquad\quad L = \Diag(\l_k) \\ }$$ and the Daleckii-Krein Theorem can be invoked to obtain the gradient $$\eqalign{ B &= f(A) \\ dB &= M\Big(R\h\big(\Mi\,dA\,M\big)\Big)\Mi \\ \grad B{A_{ij}} &= M\Big(R\h\big(\Mi\,E_{ij}\,M\big)\Big)\Mi \\ }$$ where $(\h)$ is the Hadamard product and the components of the $E_{ij}$ matrix are all zero except for the $(i,j)$ component which is equal to one.

The components of the $R$ matrix are $$\eqalign{ R_{k\ell} \;=\; \begin{cases} {\large\frac{f(\l_k)\,-\,f(\l_\ell)}{\l_k\,-\,\l_\ell}}\qquad{\rm if}\;\l_k\ne\l_\ell \\ \\ \quad{\small f'(\l_k)}\qquad\qquad{\rm otherwise} \\ \end{cases} \\ }$$ In the current problem, the function of interest is $$ f(z) = f'(z) = \exp(z) $$

The above result can be flattened into a matrix-valued gradient $$\eqalign{ \def\R{{\large\cal D}} a &= \vc(A),\quad b= \vc(B), \quad \color{red}{\R\equiv\Diag\big(\!\vc(R)\big)} \\ db &= \vc(dB) \;=\; \LR{\Mit\k M}\R\,\LR{\Mt\k\Mi}\, da \\ \grad ba &= \LR{\Mt\k\Mi}^{-1}\,\R\,\LR{\Mt\k\Mi} \\ }$$
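As a numerical sanity check, the elementwise Daleckii-Krein formula above can be compared against a central finite difference of `scipy.linalg.expm`. This is a sketch, assuming NumPy/SciPy are available; the variable names (`grad_B`, `R`, etc.) are illustrative, not from the answer:

```python
import numpy as np
from scipy.linalg import expm

# Numerical check of the Daleckii-Krein gradient formula for f = exp.
rng = np.random.default_rng(0)
N = 4
A = rng.standard_normal((N, N))   # a random A is almost surely non-defective

lam, M = np.linalg.eig(A)         # A = M L M^{-1}
Minv = np.linalg.inv(M)

# R_{kl} = (f(l_k) - f(l_l)) / (l_k - l_l), and f'(l_k) when l_k = l_l
f = np.exp(lam)
den = lam[:, None] - lam[None, :]
num = f[:, None] - f[None, :]
safe = np.where(np.abs(den) > 1e-12, den, 1.0)
R = np.where(np.abs(den) > 1e-12, num / safe, f[:, None])

def grad_B(i, j):
    """dB/dA_ij = M (R o (M^{-1} E_ij M)) M^{-1}, o = Hadamard product."""
    E = np.zeros((N, N))
    E[i, j] = 1.0
    return (M @ (R * (Minv @ E @ M)) @ Minv).real

# Compare against a central finite difference of expm
i, j, h = 1, 2, 1e-6
E = np.zeros((N, N))
E[i, j] = 1.0
fd = (expm(A + h * E) - expm(A - h * E)) / (2 * h)
err = np.max(np.abs(grad_B(i, j) - fd))
```

The `.real` is needed because a real non-symmetric $A$ may have complex eigenvalues, so intermediate products are complex with negligible imaginary parts.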


A recent paper by Magnus extends this idea to defective matrices, but it is much more complicated.

On the other hand, if $A^T\!=A\,$ then the solution becomes much simpler, since $M$ can be replaced by an orthogonal matrix $$\eqalign{ \def\Qt{Q^T} \def\qiq{\quad\implies\quad} A &= Q\,L\,Q^T, \qquad \Qt = Q^{-1} \\ \grad B{A_{ij}} &= Q\Big(R\h\big(\Qt\,E_{ij}\,Q\big)\Big)\Qt \\ \grad ba &= \LR{Q\k Q}\R\,\LR{Q\k Q}^T \\ }$$
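The symmetric case can be sketched the same way with an orthogonal eigendecomposition (`numpy.linalg.eigh`); again a numerical check, not part of the original answer:

```python
import numpy as np
from scipy.linalg import expm

# Symmetric case: A = Q L Q^T with Q orthogonal, so no matrix inverse needed.
rng = np.random.default_rng(1)
N = 4
S = rng.standard_normal((N, N))
A = S + S.T                        # symmetric by construction

lam, Q = np.linalg.eigh(A)
f = np.exp(lam)
den = lam[:, None] - lam[None, :]
num = f[:, None] - f[None, :]
safe = np.where(np.abs(den) > 1e-12, den, 1.0)
R = np.where(np.abs(den) > 1e-12, num / safe, f[:, None])

i, j = 0, 3
E = np.zeros((N, N))
E[i, j] = 1.0
grad = Q @ (R * (Q.T @ E @ Q)) @ Q.T   # dB/dA_ij

# Central finite-difference check
h = 1e-6
fd = (expm(A + h * E) - expm(A - h * E)) / (2 * h)
err = np.max(np.abs(grad - fd))
```

Everything stays real here, since a symmetric matrix has real eigenvalues.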

greg

If $A$ is diagonalisable, i.e. it can be written as $A = PDP^{-1}$ where $D$ is diagonal, then we are able to say that $\exp(A) = P \exp(D) P^{-1}$. Otherwise, it is not generally possible to calculate $\exp(A)$ (and indeed it may not even have a meaningful value).

Unfortunately that also means that it is not trivial to express the elements of $\exp(A)$ in terms of the elements of $A$.

ConMan
  • Sad, but understandable. AFAIK, many computing packages like Matlab or SciPy use Padé approximation to calculate $\exp(A)$. If the distribution of $A_{ij}$ is bounded, maybe I can take a derivative of the Padé approximation? – lostintimespace Feb 21 '24 at 23:44
  • 2
    It's not at all true to say "Otherwise, it is not generally possible to calculate $\exp(A)$ (and indeed it may not even have a meaningful value)." It is certainly well-defined, and there are many ways to calculate $\exp(A)$. Some may be better than others, depending on the circumstances. – Robert Israel Feb 22 '24 at 00:23
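Regarding the Padé question in the comments: SciPy exposes the Fréchet derivative of `expm` directly as `scipy.linalg.expm_frechet`, which returns the matrix with entries $\partial B_{ml}/\partial A_{ij}$ for a chosen direction $E_{ij}$, so the sum in the question can be computed without any closed-form expression for $B_{ml}$. A minimal sketch:

```python
import numpy as np
from scipy.linalg import expm, expm_frechet

rng = np.random.default_rng(2)
N = 4
A = rng.standard_normal((N, N))

i, j = 1, 2
E = np.zeros((N, N))
E[i, j] = 1.0

# expm_frechet(A, E) returns (exp(A), L(A, E)), where L is the Frechet
# derivative of expm at A in direction E, i.e. the matrix dB_ml/dA_ij.
B, dB = expm_frechet(A, E)

# The scalar asked for in the question: sum_m sum_l dB_ml / dA_ij
total = dB.sum()

# Sanity check against a central finite difference
h = 1e-6
fd = (expm(A + h * E) - expm(A - h * E)) / (2 * h)
err = np.max(np.abs(dB - fd))
```

This avoids the defective-matrix issue entirely, at the cost of one `expm_frechet` call per $(i,j)$ pair.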