I'm struggling a bit using the chain rule. Given the function $\phi$ defined as:
$\phi(x) = \|A\mathbf{x}-b\|_2$
where $A$ is a matrix and $b$ is a vector.
What is the gradient $\nabla\phi$ and how should I proceed to compute it?
I get almost the same result as in Mostafa Ayaz's answer. We can use matrix algebra to obtain it.
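$\phi(x)=\|A\mathbf{x}-b\|_2=\sqrt{(A\mathbf{x}-b)'(A\mathbf{x}-b)}=\sqrt{(\mathbf{x}'A'-b')(A\mathbf{x}-b)}$ $=(\mathbf{x}'A'A\mathbf{x}-2\mathbf{x}'A'b+b'b)^{\frac12}$
Here I have used $\mathbf{x}'A'b=b'A\mathbf{x}$, which holds because both sides are scalars and each is the transpose of the other.
Differentiating with respect to $\mathbf{x}$ using the chain rule (valid wherever $A\mathbf{x}\neq b$, so that the norm is nonzero) gives: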
$$\frac{d\phi(x)}{d\mathbf{x}}=\frac12\cdot \big((\mathbf{x}'A'-b')(A\mathbf{x}-b)\big)^{-\frac12}\cdot (2A'A\mathbf{x}-2A'b)$$
$$=\frac{2\cdot (A'A\mathbf{x}-A'b)}{2\cdot\big((\mathbf{x}'A'-b')(A\mathbf{x}-b)\big)^{\frac12}}=\frac{A'A\mathbf{x}-A'b}{\|A\mathbf{x}-b\|_2}=\frac{A'(A\mathbf{x}-b)}{\|A\mathbf{x}-b\|_2}$$
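A quick way to gain confidence in a closed-form gradient like this one is to compare it with a finite-difference approximation. The following is a minimal NumPy sketch, not part of the original answer: the helper names `phi`, `grad_phi`, and `numerical_grad`, the random test data, and the step size `h` are all assumptions chosen for illustration.

```python
import numpy as np

def phi(A, b, x):
    """Euclidean norm of the residual A x - b."""
    return np.linalg.norm(A @ x - b)

def grad_phi(A, b, x):
    """Closed-form gradient A'(Ax - b) / ||Ax - b||_2 (requires Ax != b)."""
    r = A @ x - b
    return A.T @ r / np.linalg.norm(r)

def numerical_grad(f, x, h=1e-6):
    """Central finite-difference approximation of the gradient of f at x."""
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h  # perturb only the i-th coordinate
        g[i] = (f(x + e) - f(x - e)) / (2 * h)
    return g

# Arbitrary test data (illustrative only); A need not be square.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))
b = rng.standard_normal(5)
x = rng.standard_normal(3)

analytic = grad_phi(A, b, x)
numeric = numerical_grad(lambda v: phi(A, b, v), x)
print(np.allclose(analytic, numeric, atol=1e-6))
```

If the two vectors agree to within the tolerance, the formula is very likely correct. A check like this would also have flagged a sign error or a missing transpose.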
Shouldn't be that hard! Let
$$A=\begin{bmatrix}a_{11}&a_{12}&\cdots&a_{1n}\\a_{21}&a_{22}&\cdots&a_{2n}\\\vdots&\vdots&\ddots&\vdots\\a_{n1}&a_{n2}&\cdots&a_{nn}\end{bmatrix}$$
and
$$b=\begin{bmatrix}b_1\\b_2\\\vdots\\b_n\end{bmatrix}.$$
Therefore we have
$$\|Ax-b\|_2=\sqrt{(a_{11}x_1+a_{12}x_2+\cdots+a_{1n}x_n-b_1)^2+(a_{21}x_1+a_{22}x_2+\cdots+a_{2n}x_n-b_2)^2+\cdots+(a_{n1}x_1+a_{n2}x_2+\cdots+a_{nn}x_n-b_n)^2}.$$
Differentiating with respect to $x_i$ (the factor $2$ from each squared term cancels the $\frac12$ from the square root) gives
$$\dfrac{\partial\phi}{\partial x_i}=\dfrac{a_{1i}r_1+a_{2i}r_2+\cdots+a_{ni}r_n}{\|Ax-b\|_2},$$
where
$$Ax-b=\begin{bmatrix}r_1\\r_2\\\vdots\\r_n\end{bmatrix}.$$
Collecting these partial derivatives into a vector, we obtain
$$\nabla\phi=\dfrac{A'\cdot(Ax-b)}{\|Ax-b\|_2}.$$
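The component-wise formula can be checked against the vectorized one on a small case that is easy to do by hand. This is only an illustrative sketch: the function name `grad_phi_componentwise` and the example values are assumptions, and the code assumes $Ax \neq b$. With $A = I$, $b = 0$ and $x = (3, 4)$, the residual is $r = (3, 4)$ with norm $5$, so the gradient should be $(0.6, 0.8)$.

```python
import numpy as np

def grad_phi_componentwise(A, b, x):
    """Gradient of ||Ax - b||_2 built one partial derivative at a time."""
    r = A @ x - b                   # residual vector (r_1, ..., r_n)
    norm_r = np.sqrt(np.sum(r**2))  # ||Ax - b||_2, assumed nonzero
    n_rows, n_cols = A.shape
    g = np.zeros(n_cols)
    for i in range(n_cols):
        # d(phi)/d(x_i) = (a_{1i} r_1 + ... + a_{ni} r_n) / ||Ax - b||_2
        g[i] = sum(A[k, i] * r[k] for k in range(n_rows)) / norm_r
    return g

A = np.eye(2)
b = np.zeros(2)
x = np.array([3.0, 4.0])

g_loop = grad_phi_componentwise(A, b, x)
g_vec = A.T @ (A @ x - b) / np.linalg.norm(A @ x - b)  # A'(Ax - b) / ||Ax - b||_2
print(np.allclose(g_loop, g_vec), np.allclose(g_loop, [0.6, 0.8]))
```

The explicit loop mirrors the sum $a_{1i}r_1+\cdots+a_{ni}r_n$, which is the $i$-th entry of $A'(Ax-b)$. That is why stacking the partial derivatives reproduces the matrix form.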