
I have a matrix $A$ whose entries are each a function of a variable $\epsilon$, with $\epsilon>0$. This matrix arises from Radial Basis Function (RBF) interpolation, and is symmetric positive-definite.

I will write this as $A(\epsilon)$. My goal is to find a value $\epsilon^*$ that sets the condition number of the matrix to a specific target. In other words, if $\kappa(A)$ is the condition number of $A$, I need to ensure that $\kappa(A) = \kappa_T$, where $\kappa_T$ is some target condition number. To do this, I numerically solve the equation \begin{align} f(\epsilon) = \log\left(\frac{\kappa\left(A(\epsilon)\right)}{\kappa_T} \right) = 0 \end{align} for $\epsilon$. In MATLAB, I currently do this with the built-in fzero(), which uses the Brent-Dekker method. This method converges rather slowly; for small dense matrices of size $50 \times 50$, for example, fzero sometimes takes 30+ iterations.
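For reference, here is a minimal sketch of my current setup (the point set, target, and bracket below are just illustrative placeholders, not my actual data):

```matlab
% Minimal sketch of the current fzero-based approach; the point set,
% target, and bracket are illustrative placeholders.
x = linspace(0, 1, 50);
r = abs(x - x.');                          % pairwise distances r_ij
kappaT = 1e8;                              % target condition number
A = @(ep) 1 ./ sqrt(1 + (ep*r).^2);        % inverse multiquadric kernel
f = @(ep) log(cond(A(ep)) / kappaT);       % f(epsilon) as defined above
epsStar = fzero(f, [1e-2, 10]);            % Brent-Dekker root finder
```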

However, it occurred to me that I could possibly do this with Newton's method, if only I could compute the derivative of the above quantity. If $\epsilon$ is a scalar, and $A_{ij} = \frac{1}{\sqrt{1 + (\epsilon r_{ij})^2}}$, where $r_{ij}$ is some scalar independent of $\epsilon$, how do I compute the derivative $\frac{\partial f}{\partial \epsilon}$?

I have no idea how to compute the derivative of the condition number of a matrix, since I am not really familiar with matrix calculus. I'd appreciate some help! Thanks!

3 Answers


Using the Frobenius norm for algebraic convenience, the differential of the norm is easily found $$\eqalign{ \|A\|^2 &= A:A \cr 2\,\|A\|\,d\|A\| &= 2\,A:dA \cr d\|A\| &= \frac{A}{\|A\|} : dA \cr }$$ where the colon denotes the Frobenius Inner Product.


The differential of the inverse is also easy to determine $$\eqalign{ A\,A^{-1} &= I \cr dA\,A^{-1} + A\,dA^{-1} &= 0 \cr dA^{-1} &= -A^{-1}\,(dA)\,A^{-1} \cr }$$

The condition number is $$\eqalign{ \kappa &= \|A^{-1}\|\,\|A\| \cr\cr }$$ Use these 3 facts to find the differential and gradient of $\kappa$ $$\eqalign{ d\kappa &= \|A^{-1}\|\,d\|A\| + \|A\|\,d\|A^{-1}\| \cr &= \|A^{-1}\|\,\frac{A}{\|A\|}:dA + \|A\|\,\frac{A^{-1}}{\|A^{-1}\|}:dA^{-1} \cr &= \|A^{-1}\|\,\frac{A}{\|A\|}:dA - \|A\|\,\frac{A^{-1}}{\|A^{-1}\|}:A^{-1}\,dA\,A^{-1} \cr\cr &= \Bigg(\frac{\|A^{-1}\|^2A-\|A\|^2A^{-T}A^{-1}A^{-T}}{\kappa}\Bigg) :dA \cr }$$ Since $\big(d\kappa=\frac{\partial\kappa}{\partial A}:dA\big),\,$ the gradient must be $$\eqalign{ \frac{\partial\kappa}{\partial A} &= \frac{\|A^{-1}\|^2A-\|A\|^2A^{-T}A^{-1}A^{-T}}{\kappa} \cr }$$ Finally, the gradient wrt $\epsilon$ is given by $$\eqalign{ \frac{d\kappa}{d\epsilon} &= \frac{\partial\kappa}{\partial A} : \frac{\partial A}{\partial\epsilon}\cr }$$

I'm not sure why you want to complicate your problem by introducing a logarithm, since the following function will also yield the target condition number $$f(\epsilon) = \kappa(\epsilon) - \kappa_T = 0$$
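Putting the pieces together, here is a minimal MATLAB sketch of the resulting Newton iteration, using the Frobenius norm throughout (so $\kappa$ here differs from MATLAB's default 2-norm cond). The point set, target, and starting guess are illustrative assumptions:

```matlab
% Newton iteration for f(ep) = kappa(A(ep)) - kappaT, Frobenius norm.
% Point set, kappaT, and starting guess ep are illustrative assumptions.
x  = linspace(0, 1, 50);
r  = abs(x - x.');                     % pairwise distances r_ij
kappaT = 1e8;                          % target condition number
ep = 1.0;                              % starting guess for epsilon
for iter = 1:50
    A    = 1 ./ sqrt(1 + (ep*r).^2);   % inverse multiquadric kernel
    Ai   = inv(A);
    nA   = norm(A, 'fro');
    nAi  = norm(Ai, 'fro');
    kap  = nA * nAi;
    % gradient of kappa w.r.t. A (formula above; A is symmetric here)
    G    = (nAi^2 * A - nA^2 * (Ai.' * Ai * Ai.')) / kap;
    % dA/dep for A_ij = (1 + (ep*r_ij)^2)^(-1/2)
    dA   = -ep * (r.^2) .* A.^3;
    dkap = G(:).' * dA(:);             % Frobenius inner product  G : dA
    step = (kap - kappaT) / dkap;      % Newton step on f = kappa - kappaT
    ep   = ep - step;
    if abs(step) < 1e-12 * abs(ep), break; end
end
```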

hans
  • Thank you! This is exactly what I was looking for. As to why I was using the log value, the target condition numbers I use are often very large. This way, I figured I would only be working with smaller numbers, less susceptible to numerical errors. However, now that I have Newton's method (thanks to you), I can play around and decide what to do with this. It may very well be that logs are unnecessary. – VarunShankar Mar 16 '16 at 04:11
  • I also want to add: that expression for $\frac{\partial \kappa}{\partial A}$ simplifies a fair bit when $A = A^T$. Excellent! – VarunShankar Mar 16 '16 at 04:18

Automatic differentiation can be used to compute, exactly, the derivative of any differentiable function that you can calculate. (I don't think you'll find a useful expression using standard analytic methods.)
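MATLAB has no general built-in AD for arbitrary code, but a closely related trick, complex-step differentiation, gives machine-precision derivatives here, provided $\kappa$ is implemented with analytic operations only (so the Frobenius norm is written without abs). A hedged sketch; the point set and evaluation point are assumptions:

```matlab
% Complex-step differentiation (a cousin of AD): evaluate kappa at
% ep + 1i*h and take the imaginary part.  This requires an analytic
% implementation of kappa, hence the abs-free Frobenius norm below.
% The point set and evaluation point ep0 are illustrative assumptions.
x  = linspace(0, 1, 50);
r  = abs(x - x.');                           % pairwise distances r_ij
frob   = @(M) sqrt(sum(M(:).^2));            % Frobenius norm without abs()
kappaF = @(ep) frob(1 ./ sqrt(1 + (ep*r).^2)) * ...
         frob(inv(1 ./ sqrt(1 + (ep*r).^2)));
h   = 1e-20;                                 % tiny step; no cancellation error
ep0 = 1.0;
dkappa = imag(kappaF(ep0 + 1i*h)) / h;       % d(kappa)/d(epsilon) at ep0
```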

Wouter

Hint: defining $\Delta(A) = (A,A)$, $N(A) = \|A\|$, $\iota(A) = A^{-1}$, and $p(x,y) = xy$, $\operatorname{cond}$ is the composition $$A\longmapsto(A,A)\longmapsto(A,\iota(A))\longmapsto (N(A),N(\iota(A)))\longmapsto N(A)\,N(\iota(A)),$$ that is, $$\operatorname{cond} = p\circ(N\times N)\circ(\mathrm{Id}\times\iota)\circ\Delta.$$ Now apply the chain rule. The result depends on the concrete norm used; see also the differential of the inverse matrix, derived in the answer above.
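Carrying out this chain rule with the Frobenius norm (an assumption for concreteness; any differentiable norm works the same way), the derivative in a direction $H$ is $$d\operatorname{cond}(A)[H] = \|A^{-1}\|\,\frac{A}{\|A\|}:H \;-\; \|A\|\,\frac{A^{-1}}{\|A^{-1}\|}:\big(A^{-1}HA^{-1}\big),$$ which recovers the gradient in the accepted answer.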