When I was learning about Lieb's inequality, I came across this problem. In $\operatorname{Tr}( \exp{(H+\log{X})})$, $X$ is a square matrix (in the simplest case, diagonal) and $H$ is a Hermitian matrix, though I think $H$ has no effect on the form of the gradient.
I first tried the following calculation.
The general formula for the gradient of the trace of a function applied to a matrix argument $X$ is [cf. Section 2.5 of The Matrix Cookbook]
$\frac{\partial \operatorname{Tr} (F(X))}{\partial X} =f(X^{T})$, where $f(\cdot)$ is the scalar derivative of $F(\cdot)$.
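This rule can be checked numerically for a genuine matrix function. The sketch below (my own sanity check, with an arbitrary random $X$) takes $F(X)=\exp(X)$, whose scalar derivative is $f(x)=\exp(x)$, so the rule predicts $\partial \operatorname{Tr}(\exp{X})/\partial X = \exp(X)^{T} = \exp(X^{T})$:

```python
# Sanity check of the Matrix Cookbook rule for F(X) = exp(X):
# the rule predicts d Tr(exp(X)) / dX = exp(X)^T.
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)
n = 4
X = rng.standard_normal((n, n))   # arbitrary test matrix

def g(M):
    return np.trace(expm(M))

# Central finite differences, entry by entry.
eps = 1e-6
G_num = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        E = np.zeros((n, n)); E[i, j] = eps
        G_num[i, j] = (g(X + E) - g(X - E)) / (2 * eps)

G_rule = expm(X).T                # f(X)^T = f(X^T) for analytic f
err = np.abs(G_num - G_rule).max()
print(err)                        # agrees to finite-difference accuracy
```

Here the rule holds because $\exp$ is a single analytic function of the whole matrix argument, which is exactly the situation the Cookbook formula covers.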
I am not sure what "scalar derivative" means. I understand it as replacing the matrix argument $X$ with a scalar $x$, that is, in my calculations,
$F(X)= \exp{(\log{X}+H)}$, and thus $f(X^{T})= \exp{(\log{X^{T}}+H)}\,(X^{T})^{-1}$.
I am very puzzled by this result. If $X$ is a diagonal matrix and $H$ is non-diagonal, then the gradient w.r.t. $X$ given by the above formula is non-diagonal. But $\frac{\partial \operatorname{Tr} (F(X))}{\partial X}$ should be diagonal when $X$ is restricted to diagonal matrices.
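The discrepancy can be seen numerically. The sketch below (my own check, with an arbitrary diagonal $X$ and non-diagonal Hermitian $H$) compares the candidate gradient $\exp{(\log{X^{T}}+H)}\,(X^{T})^{-1}$ against a finite-difference gradient of $\operatorname{Tr}(\exp{(H+\log{X})})$:

```python
# Compare the candidate gradient exp(log(X^T)+H) (X^T)^{-1} against
# finite differences of g(X) = Tr(exp(H + log X)).
import numpy as np
from scipy.linalg import expm, logm

X = np.diag([1.0, 2.0, 3.0])          # diagonal, positive definite
H = np.array([[0.0, 0.5, 0.0],
              [0.5, 0.0, 0.5],
              [0.0, 0.5, 0.0]])       # Hermitian, non-diagonal

def g(M):
    return np.trace(expm(H + logm(M))).real

eps = 1e-5
n = 3
G_num = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        E = np.zeros((n, n)); E[i, j] = eps
        G_num[i, j] = (g(X + E) - g(X - E)) / (2 * eps)

G_candidate = expm(logm(X.T) + H) @ np.linalg.inv(X.T)

# Diagonal entries of the candidate coincide with the true gradient,
# but the off-diagonal entries do not.
diag_err = np.abs(np.diag(G_num) - np.diag(G_candidate)).max()
offdiag = np.abs(G_num - G_candidate)
np.fill_diagonal(offdiag, 0.0)
print(diag_err, offdiag.max())
```

In this example the diagonal entries of the candidate formula actually match the numerical gradient; only the off-diagonal entries disagree, which is consistent with the puzzle above.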
Next, I tried a second calculation:
$d \operatorname{Tr}( \exp{(\log{X}+H)})= \operatorname{Tr} [d  \exp{(\log{X}+H)}]$
$= \operatorname{Tr}\left[\int_{0}^{1} \exp{(\alpha(\log{X}+H))} \, d(\log{X}+H) \, \exp{((1-\alpha)(\log{X}+H))} \, d \alpha\right] $
$= \operatorname{Tr}\left[\int_{0}^{1} \exp{(\alpha(\log{X}+H))} \exp{((1-\alpha)(\log{X}+H))} \, d \alpha \; d(\log{X}+H)\right] $
$= \operatorname{Tr} [\exp{(\log{X}+H)} \, d(\log{X}+H) ] $
$= \operatorname{Tr} [\exp{(\log{X}+H)} ]\operatorname{Tr} [d(\log{X}+H) ] $
$= \operatorname{Tr} [\exp{(\log{X}+H)} ] \, d \operatorname{Tr}(\log{X}+H) $
$= \operatorname{Tr} [\exp{(\log{X}+H)} ] \operatorname{Tr}[X^{-1} \, dX],$
which would give the gradient $\operatorname{Tr} [\exp{(\log{X}+H)} ] \, (X^{T})^{-1}$.
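This second result can also be tested numerically. The sketch below (again my own check, same arbitrary $X$ and $H$ as one might pick above) compares $\operatorname{Tr}[\exp{(\log{X}+H)}]\,(X^{T})^{-1}$ against a finite-difference gradient, and the two disagree badly, even on the diagonal:

```python
# Compare the second derivation's result, Tr[exp(log X + H)] (X^T)^{-1},
# against finite differences of g(X) = Tr(exp(H + log X)).
import numpy as np
from scipy.linalg import expm, logm

X = np.diag([1.0, 2.0, 3.0])          # diagonal, positive definite
H = np.array([[0.0, 0.5, 0.0],
              [0.5, 0.0, 0.5],
              [0.0, 0.5, 0.0]])       # Hermitian, non-diagonal

def g(M):
    return np.trace(expm(H + logm(M))).real

eps = 1e-5
n = 3
G_num = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        E = np.zeros((n, n)); E[i, j] = eps
        G_num[i, j] = (g(X + E) - g(X - E)) / (2 * eps)

G_claim = np.trace(expm(logm(X) + H)) * np.linalg.inv(X.T)
mismatch = np.abs(G_num - G_claim).max()
print(mismatch)                        # large: the derivation is wrong
```

So this derivation cannot be right either, which makes me suspect the step where a trace of a product is split into a product of traces.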
Can someone help me with the right calculation?
$F$ is to be a scalar-to-scalar function that is applied to all the coordinates of a matrix. If this is right, then the rule can be applied when $H$ is a scalar. – Eman Yalpsid Oct 06 '21 at 10:30