
Let $x,y,z\in\mathbb{R}^n$. I am trying to compute

$$ \begin{split} &\frac{\partial}{\partial x} (x\circ y)^Tz\\ &\frac{\partial}{\partial x} (x\circ y)^T(x \circ y) \end{split} $$

where $x\circ y$ is the Hadamard product of $x$ and $y$, but it is throwing me for a loop. Can someone show me how to proceed with these derivatives?

Based on this answer, it appears that I can write $f(x,y)=(x\circ y)^T(x\circ y) = (x\circ y)^TI(x\circ y)$ and thus

$$ \frac{\partial f}{\partial x} = y\circ (I^T+I)(x\circ y) $$ but I am confused about the first part, $y\circ(I^T+I)$. The dimensions do not seem to match up properly since $y\in\mathbb{R}^n$ and $I\in\mathbb{R}^{n\times n}$.


Context: I would like to compute the gradient of the following:

$$ \begin{split} \|x-\alpha\circ y\|_2^2 &= (x-\alpha\circ y)^T(x-\alpha\circ y)\\ & = x^Tx - x^T(\alpha\circ y) - (\alpha\circ y)^Tx + (\alpha\circ y)^T(\alpha\circ y) \end{split} $$

with respect to $\alpha$ as part of the derivation of a gradient descent update. If there is a simpler way to compute the derivative of this 2-norm, please share; however, I'd still like to know how to compute the individual derivatives as well!

scherm

1 Answer


The elementwise/Hadamard product $\,(A\circ B)\,$ and the inner/Frobenius product $\,(A:B={\rm tr}(A^TB))\,$ are each commutative, and they also commute with one another, i.e. $$\eqalign{ A\circ B &= B\circ A \cr A:B &= B:A \cr A\circ B:C &= A:B\circ C \cr }$$ These products are defined for matrices of any shape (including vectors), as long as $\{A,B,C\}$ all have the same shape.
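The three rules above are easy to sanity-check numerically. The following is only a minimal NumPy sketch: the helper name `frob`, the random seed, and the $4\times 5$ shape are arbitrary choices made for illustration, not part of the rules themselves.

```python
import numpy as np

# Three random matrices of the same (arbitrarily chosen) shape 4x5.
rng = np.random.default_rng(0)
A, B, C = rng.standard_normal((3, 4, 5))

def frob(X, Y):
    """Frobenius inner product X:Y, computed as the sum of elementwise products."""
    return np.sum(X * Y)

# In NumPy, `*` on arrays is the elementwise (Hadamard) product.
hadamard_commutes = np.allclose(A * B, B * A)              # A∘B = B∘A
frobenius_commutes = np.isclose(frob(A, B), frob(B, A))    # A:B = B:A
mixed_rule = np.isclose(frob(A * B, C), frob(A, B * C))    # A∘B:C = A:B∘C
trace_form = np.isclose(frob(A, B), np.trace(A.T @ B))     # A:B = tr(A^T B)

print(hadamard_commutes, frobenius_commutes, mixed_rule, trace_form)
```

All four checks reduce to the fact that each expression is a sum over entries of products of the form $a_{ij}b_{ij}$ or $a_{ij}b_{ij}c_{ij}$, which is why the order of the factors does not matter.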

Applying these rules to your first function
$$\eqalign{ f &= z:x\circ y = z\circ y:x \cr df &= z\circ y:dx \cr \frac{\partial f}{\partial x} &= z\circ y \cr }$$

For your second function, reuse the symbol $z$ and let $z=x\circ y$, so that $dz=y\circ dx$ with $y$ held constant
$$\eqalign{ g &= z:z \cr dg &= 2z:dz = 2z:y\circ dx = 2z\circ y:dx \cr \frac{\partial g}{\partial x} &= 2z\circ y = 2x\circ y\circ y \cr }$$

Your final function is very similar to the second. Writing $a$ for your $\alpha$, this time set $z=(a\circ y-x)$, so that $dz=y\circ da$
$$\eqalign{ h &= z:z \cr dh &= 2z:dz = 2z:y\circ da = 2z\circ y:da \cr \frac{\partial h}{\partial a} &= 2z\circ y = 2(a\circ y-x)\circ y \cr }$$
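If you want to convince yourself of the three gradients, one option is to compare them against central finite differences. This is a rough sketch rather than a definitive test: `numerical_gradient`, the seed, the dimension `n = 6`, and the tolerance are all assumptions made for illustration, and `a` plays the role of $\alpha$.

```python
import numpy as np

def numerical_gradient(func, v, eps=1e-6):
    """Central-difference approximation to the gradient of a scalar function at v."""
    grad = np.zeros_like(v)
    for i in range(v.size):
        step = np.zeros_like(v)
        step[i] = eps
        grad[i] = (func(v + step) - func(v - step)) / (2 * eps)
    return grad

rng = np.random.default_rng(1)
n = 6
x, y, z, a = rng.standard_normal((4, n))

# The three scalar functions from the question (a stands in for alpha).
f = lambda x_: (x_ * y) @ z               # (x∘y)^T z
g = lambda x_: (x_ * y) @ (x_ * y)        # (x∘y)^T (x∘y)
h = lambda a_: np.sum((x - a_ * y) ** 2)  # ||x - a∘y||_2^2

# Closed-form gradients derived above.
grad_f = z * y
grad_g = 2 * x * y * y
grad_h = 2 * (a * y - x) * y

f_ok = np.allclose(grad_f, numerical_gradient(f, x), atol=1e-6)
g_ok = np.allclose(grad_g, numerical_gradient(g, x), atol=1e-6)
h_ok = np.allclose(grad_h, numerical_gradient(h, a), atol=1e-6)
print(f_ok, g_ok, h_ok)
```

For the gradient descent context in the question, a single update with a step size of your choosing (say a hypothetical `eta`) would then be `a = a - eta * 2 * (a * y - x) * y`.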

greg
  • Do you have any references where I can go to learn these rules? I need to compute something similar so it would be great to find something to explain the theory, how it works in practice, and provide some problems. – the_src_dude Dec 11 '19 at 12:20