3

The heat kernel function is as follow:

$$f(x_i,x_j)= \exp\left(-\frac{||x_i-x_j||^2_2}{\sigma}\right)$$

where $x_i$ and $x_j$ are two column vectors of matrix $X$. $\sigma$ is a nonzero constant. What is the derivative of kernel function? What about the second partial derivative?

###################################################

In addition, we know that $L=D-S$, where $L$ is a Laplacian matrix. $D$ is a degree matrix. $S$ is an affinity matrix. Here, $S$ can be composed by the above $f(x_i,x_j)$.

Therefore, what is the derivative of $Tr(QLQ^T)$ with respect to $X$?

where $Q \in \mathbb{R}^{d\times n}$ is a constant matrix, $L \in \mathbb{R}^{n\times n}$ is a variable that is related to $X$.

Nobody?

learner
  • 321

2 Answers2

3

To be clearer I will write the variables as $x$ and $y$ : $f(x,y) = \exp(-\frac{\|x-y\|_2^2}{\sigma})$.

If we write $x = (x_1, ..., x_n)$ and $y = (y_1, ..., y_n)$, we have $$\|x-y\|_2^2=\sum_{i=1}^n (x_i-y_i)^2$$

Hence $$\frac{\partial f}{\partial x_i} = - \frac{2}{\sigma}(x_i-y_i) \exp(-\frac{\|x-y\|^2}{\sigma})$$

and $$\frac{\partial f}{\partial y_i} = \frac{2}{\sigma}(x_i-y_i) \exp(-\frac{\|x-y\|^2}{\sigma})$$

The second order derivatives are easily computed using these formulas and the product rule.

  • Thanks. I have added some problem, can you help me? – learner Dec 09 '19 at 06:57
  • I don't know what are a degree matrix and an affinity matrix. I also don't understand where $X$ appears in this new function. I can just say that the trace is linear hence its derivative is itself. – Arthur Breton Dec 09 '19 at 07:02
  • Thank you for your help. Maybe, I need to see the literature again. – learner Dec 09 '19 at 07:07
3

This answer finds a nice way to write the distance matrix in terms of the Gram matrix, i.e. $$\eqalign{ G &= X^TX,\quad &g={\rm diag}(G) \\ A_{ij} &= \|x_i-x_j\|^2 &\implies A = g{\tt 1}^T + {\tt 1}g^T-2G \\ }$$ Define analogous quantities based on $Q$ instead of $X$ $$\eqalign{ H &= Q^TQ,\quad &h={\rm diag}(H) \\ B_{ij} &= \|q_i-q_j\|^2 &\implies B = h{\tt 1}^T + {\tt 1}h^T-2H \\ }$$ plus a few more matrices for later convenience $$\eqalign{ R &= -\frac{1}{2\sigma}S\odot B \\ M &= \Big({\rm Diag}(R{\tt 1}) - R\Big) \;=\; {\rm Laplacian}(R)\\ }$$ This problem defines two additional matrices and a scalar function.
(NB: The exp() function is applied element-wise and $\odot$ is the Hadamard product) $$\eqalign{ S &= \exp\left(\frac{-A}{\sigma}\right) \quad\implies dS = -\frac{1}{\sigma} S\odot dA \\ L &= {\rm Diag}(S{\tt 1}) - S \\ \phi &= {\rm Tr}(Q^TQL) \\ &= Q^TQ:\big({\rm Diag}(S{\tt 1}) - S\big) \\ &= \tfrac{1}{2}B:S \\ }$$ Calculate the differential and gradient of the scalar function. $$\eqalign{ d\phi &= \tfrac{1}{2}B:dS \\ &= -\frac{1}{2\sigma}S\odot B:dA \\ &= R:(dg\,{\tt 1}^T + {\tt 1}\,dg^T-2\,dG) \\ &= R{\tt 1}:dg + R^T{\tt 1}:dg - 2R:dG \\ &= 2\Big({\rm Diag}(R{\tt 1}) - R\Big):dG \\ &= 2M:(X^TdX+dX^TX) \\ &= 4M:X^TdX \\ &= 4XM:dX \\ \frac{\partial\phi}{\partial X} &= 4XM \\ }$$

In the above, the diag() function creates a vector from the diagonal of a matrix, while the Diag() function creates a diagonal matrix from a vector.

And the colon is a convenient product notation for the trace, i.e. $\;X:Y={\rm Tr}(X^TY)$.

The matrices $(A,B,G,H,L,M,R,S)$ are all symmetric.

greg
  • 40,033