Building on my previous question, I am trying to derive the normal equations for the least squares problem:
$$ \min_W \|WX - Y\|_F^2 \\ W \in \mathbb{C}^{N \times N} \quad X, Y \in \mathbb{C}^{N \times M} $$
The intuitive way of viewing this problem is that I am trying to predict a vector $y$ (of length $N$) from a corresponding vector $x$ (also of length $N$) using a matrix $W$; to estimate $W$ I have multiple ($M$, to be precise) realizations of $x$ and $y$, packed column-wise into the matrices $X$ and $Y$.
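Equivalently, if $x_m$ and $y_m$ denote the $m$-th columns of $X$ and $Y$, the objective is just a sum of ordinary vector least-squares terms:
$$ \|WX - Y\|_F^2 = \sum_{m=1}^{M} \|W x_m - y_m\|_2^2 $$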
I am trying to set this up as a least-squares problem and derive the normal equations myself, but I'm running into issues when taking the derivative. To spell it out explicitly, the squared Frobenius norm is a trace, so I can restate the objective as:
$$ \min_W \operatorname{tr}\left[(WX - Y)^H (WX - Y)\right] \\ = \min_W \operatorname{tr}\left[(X^H W^H - Y^H)(WX - Y)\right] \\ = \min_W \operatorname{tr}\left[X^H W^H W X - X^H W^H Y - Y^H W X + Y^H Y\right] $$
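For comparison, in the real-valued case I would know how to finish: differentiating the trace with respect to $W$ and setting the result to zero gives
$$ \frac{\partial}{\partial W} \operatorname{tr}\left[(WX - Y)^T (WX - Y)\right] = 2\,(WX - Y)\,X^T = 0 \quad \Longrightarrow \quad W X X^T = Y X^T. $$
I suspect the complex case simply replaces the transposes with Hermitian transposes, giving $W X X^H = Y X^H$, but I cannot justify the derivative step.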
In the complex case I would similarly take the derivative with respect to $W$, set it equal to zero, and solve for $W$. However, my matrix calculus is rusty, and everything I know is basically summed up on this webpage, which explicitly states:
> Note that the Hermitian transpose is not used because complex conjugates are not analytic.
Now, because of Michael C. Grant's answer, I feel there must be some way of doing this, but I am at a loss as to how.
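In the meantime, here is a quick numerical sanity check of my conjecture, assuming (and this is only my guess, not something I can yet derive) that the solution is $W = Y X^H (X X^H)^{-1}$; the dimensions, seed, and variable names below are just for illustration:

```python
import numpy as np

# Sanity check of the conjectured normal equations W (X X^H) = Y X^H.
# N, M and the random data are arbitrary illustrative choices.
rng = np.random.default_rng(0)
N, M = 4, 50
X = rng.standard_normal((N, M)) + 1j * rng.standard_normal((N, M))
Y = rng.standard_normal((N, M)) + 1j * rng.standard_normal((N, M))

# Candidate solution from the conjectured normal equations
# (inv() used for clarity; X X^H should be invertible here since M > N).
W_normal = Y @ X.conj().T @ np.linalg.inv(X @ X.conj().T)

# Reference: min_W ||WX - Y||_F equals min over W^H of ||X^H W^H - Y^H||_F,
# which lstsq solves column by column (complex inputs are supported).
WH, *_ = np.linalg.lstsq(X.conj().T, Y.conj().T, rcond=None)
W_lstsq = WH.conj().T

print(np.allclose(W_normal, W_lstsq))  # prints True if the conjecture holds
```

Thank you all in advance!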