3

In Micchelli's paper *Interpolation of Scattered Data: Distance Matrices and Conditionally Positive Definite Functions*, it is mentioned that the RBF kernel $e^{-\alpha^2\|x^i-x^j\|^2/2}$ is positive definite because $$ e^{-\alpha^2\|x^i-x^j\|^2/2}=\left(2\pi\right)^{-s/2}\int_{\mathbb{R}^s}e^{i\alpha x\cdot x^i}e^{-i\alpha x\cdot x^j}e^{-\|x\|^2/2}\,dx $$ and because of the linear independence of $e^{ix\cdot x^1},\ldots,e^{ix\cdot x^n}$, $x\in\mathbb{R}^s$. This looks fine to me, since in this way the kernel matrix is a Gram matrix.

But when I compute the RBF kernel for the s-curve data (with $\alpha^2=4$), it turns out that the eigenvalues of the kernel matrix are not always positive. Is this a computational issue? If so, what is a robust way to compute the eigenvalues? (I actually need to compute the $\mathrm{logdet}$, so robust methods for $\mathrm{logdet}$ are even more welcome.)
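Roughly, my computation looks like the following minimal sketch (assuming NumPy, with the s-curve points stored in an $n\times s$ array `X`; the random data below is only a stand-in, not the actual s-curve):

```python
import numpy as np

def rbf_kernel(X, alpha2=4.0):
    """Gram matrix K[i, j] = exp(-alpha2 * ||x_i - x_j||^2 / 2)."""
    sq_norms = np.sum(X**2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    return np.exp(-alpha2 * sq_dists / 2.0)

# X stands in for the s-curve data (an (n, s) array, not reproduced here)
X = np.random.randn(500, 3)
K = rbf_kernel(X, alpha2=4.0)

eigvals = np.linalg.eigvalsh(K)   # symmetric eigensolver
print(eigvals.min())              # in my runs this can come out slightly negative
```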

Also, since many eigenvalues of the RBF kernel are very small, the condition number may be quite large. Can somebody give references on the condition number of the RBF kernel? I've found the paper "Narcowich, F. and Ward, J. (1991). Norms of inverses and condition numbers of matrices associated with scattered data. *Journal of Approximation Theory* 64: 69–94." on this page, but it's quite technical and general for me, so I can't find the result specifically for the RBF kernel.

Thank you.

Vlad
Ziyuan
  • The matrix is likely ill-conditioned. That's why you may get small (close to machine epsilon) negative eigenvalues even though the matrix is positive definite in theory. – user1551 Feb 02 '13 at 06:24
  • @user1551 Then how should I compute it? The RBF kernel is quite common, so I think there should be some robust algorithms. – Ziyuan Feb 02 '13 at 11:29
  • I don't know, but why would you want to keep those small eigenvalues? I think typically one would (conceptually) use SVD/PCA to do dimension reduction, and those small eigenvalues are simply ignored. – user1551 Feb 02 '13 at 13:11
  • @user1551 As I mentioned, I am calculating $\log\det$. Simply calling $\det$ will produce zero, while summing up the logarithms of the eigenvalues is OK for large $\alpha$, since then no eigenvalue is computed as negative. – Ziyuan Feb 05 '13 at 15:01

1 Answer

1

As the comments have said, this is because of poor conditioning of the kernel matrix. You can always just set the negative eigenvalues to 0, but that's not very elegant.
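A rough sketch of that clipping approach, assuming NumPy and a precomputed kernel matrix `K`:

```python
import numpy as np

# Eigendecomposition of the symmetric kernel matrix
eigvals, eigvecs = np.linalg.eigh(K)

# Clip the tiny negative eigenvalues (numerical noise) to zero
eigvals_clipped = np.clip(eigvals, 0.0, None)

# Rebuild a positive semi-definite approximation of K
K_psd = eigvecs @ np.diag(eigvals_clipped) @ eigvecs.T

# Note: exact zeros make log(det) = -inf, which is one reason
# this is not a great fix if you ultimately need logdet.
```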

In application, you will want to apply Tikhonov regularization. This is done by adding a tiny amount (think 1e-10) to the diagonal of your matrix, which shifts your eigenvalues slightly upwards. I plotted the effect it had on the eigenvalues of the RBF kernel of a random set of 100 points in $\mathbb{R}^2$.
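As a concrete sketch (NumPy; `K` is the kernel matrix and the jitter value is in the same ballpark as above), you can regularize and then get the log-determinant from a Cholesky factorization, which is much more stable than calling `det` directly:

```python
import numpy as np

jitter = 1e-10
K_reg = K + jitter * np.eye(K.shape[0])   # Tikhonov: every eigenvalue shifts up by `jitter`

# Cholesky only succeeds for (numerically) positive definite matrices,
# and avoids the underflow you get from det():
L = np.linalg.cholesky(K_reg)
logdet = 2.0 * np.sum(np.log(np.diag(L)))
```

`np.linalg.slogdet(K_reg)` returns the same quantity (as a sign and a log-magnitude) if you prefer not to form the Cholesky factor yourself.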

Most results on eigenvalue conditioning require that your matrix be normal, so I can't speak to that part. See here.

mather