
I have two questions related to principal component analysis (PCA):

  1. How do you prove that the principal components matrix forms an orthonormal basis? Are the eigenvalues always orthogonal?

  2. On the meaning of PCA. For my assignment, I have to compute the first 5 principal components for twenty-four $60 \times 50$ images. I've done that. But then it asks me to show those $5$ principal components as images and comment on what I see. I can see stuff, but I don't understand what is important about what I am seeing. Assuming I am doing everything right, what should I be seeing/commenting on?

[Image: First Five PCAs]
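For reference, a minimal NumPy/matplotlib sketch of this kind of computation; the `images` array is a random placeholder for the real data, and the row-wise flattening and mean-centering are assumptions rather than the assignment's prescribed method:

```python
import numpy as np
import matplotlib.pyplot as plt

images = np.random.rand(24, 60, 50)      # placeholder for the real data set
X = images.reshape(24, -1)               # one flattened image per row (24 x 3000)
X_centered = X - X.mean(axis=0)          # subtract the mean image

# SVD of the centered data; the rows of Vt are the principal directions
U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)

fig, axes = plt.subplots(1, 5, figsize=(15, 3))
for k, ax in enumerate(axes):
    ax.imshow(Vt[k].reshape(60, 50), cmap="gray")   # reshape component back to an image
    ax.set_title(f"PC {k + 1}")
    ax.axis("off")
plt.show()
```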

  • It's the eigenvectors that are orthogonal. The eigenvalues tell you the variance of the data set in the direction of the associated eigenvector. Eigenvectors are always orthogonal, because they form a basis. I suggest posting pictures for your second question. – nomen Dec 18 '13 at 20:06
  • @nomen Eigenvectors can be orthogonal, and they can form a basis, but they are not orthogonal because they form a basis. – Emily Dec 18 '13 at 20:16
  • @nomen It was a slip of the keyboard, I meant eigenvector. Anyway, you are right; I just placed the image. – Gonçalo Silva Santos Dec 18 '13 at 20:21
  • Instead of looking at the individual principal components, it might be more insightful to look at the first, the sum of the first two, the sum of the first three, and so on, so that you can see how you are getting better and better approximations to your original picture. – Aaron Dec 18 '13 at 20:26
  • No problem. Here's a fun fact: every picture in your data set can be written as a linear combination of all the principal components. Or approximated pretty well as a linear combination of the first $n$ components (in your case, $n$ was previously decided to be $5$). So if I were you, I would comment on how the first principal component has little detail in the parts that are "different" between the pictures, but lots of detail where they are all the same (like the background of the image, the clothes, etc) – nomen Dec 18 '13 at 20:28
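Following the suggestion in the comments, a hedged sketch of cumulatively reconstructing a single image from the first $k$ components (again with placeholder data and illustrative names, repeating the setup from the sketch above so that it runs on its own):

```python
import numpy as np
import matplotlib.pyplot as plt

images = np.random.rand(24, 60, 50)            # placeholder for the real data set
X = images.reshape(24, -1)
mean_image = X.mean(axis=0)
Xc = X - mean_image
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

fig, axes = plt.subplots(1, 5, figsize=(15, 3))
for k, ax in enumerate(axes, start=1):
    coeffs = Xc[0] @ Vt[:k].T                  # coordinates of image 0 in the first k components
    approx = mean_image + coeffs @ Vt[:k]      # linear combination of those components
    ax.imshow(approx.reshape(60, 50), cmap="gray")
    ax.set_title(f"first {k} PCs")
    ax.axis("off")
plt.show()
```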

4 Answers


When computing the PCA of some matrix, the eigenvectors are orthogonal because the matrix we actually diagonalize in the process ($A^TA$, the covariance matrix up to scaling) is symmetric. In general, the eigenvectors of a matrix are not necessarily orthogonal; however, this is a property that holds for symmetric matrices.

The singular value decomposition of a matrix $A$ can be written as

$$A = U \Sigma V^T$$

where $U$ and $V$ are orthonormal. In practice, we do not always compute this decomposition directly.

Instead, we might look at $$A^TA = V\Sigma^T U^T U \Sigma V^T = V\Sigma^2 V^T.$$

Equivalently, if we look at $A^TA$ as a symmetric, diagonalizable matrix, we can compute its eigendecomposition as $A^TA = W\Lambda W^T$ and we know that the eigenvectors found in the columns of $W$ are orthogonal due to symmetry.

The link between the SVD and PCA is recognizing that these are the same thing: the columns of $V$ are the eigenvectors of $A^TA$ (so $V = W$ up to ordering and sign), and the diagonal of $\Lambda$ consists of the squared singular values from $\Sigma$.
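As a numerical sanity check of this link (a sketch, not part of the original answer; the matrix is an arbitrary example), one can verify that the eigenvectors of $A^TA$ agree with the right singular vectors of $A$ up to sign, and that the eigenvalues are the squared singular values:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((24, 10))           # arbitrary example matrix

# Right singular vectors of A (rows of Vt), sorted by decreasing singular value
U, S, Vt = np.linalg.svd(A, full_matrices=False)

# Eigendecomposition of the symmetric matrix A^T A; eigh sorts eigenvalues ascending
eigvals, W = np.linalg.eigh(A.T @ A)
eigvals, W = eigvals[::-1], W[:, ::-1]      # reorder to match the SVD's ordering

print(np.allclose(eigvals, S**2))           # eigenvalues of A^T A = squared singular values
print(np.allclose(np.abs(W.T @ Vt.T), np.eye(10), atol=1e-6))  # same vectors up to sign
```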

Emily

You may have more luck finding mathematical facts if you search under the other name for PCA, the singular value decomposition (SVD). Still another name for the same mathematical idea is the Karhunen–Loève transform.

The properties of PCA that you asked about follow from the spectral decomposition of symmetric matrices: they always have real eigenvalues, eigenspaces for different eigenvalues are orthogonal to each other, and within each eigenspace an orthonormal basis of eigenvectors can be chosen, so the transformation matrix can always be constructed to be orthogonal (orthonormal columns).
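A small NumPy illustration of these properties (the symmetric matrix here is an arbitrary example):

```python
import numpy as np

rng = np.random.default_rng(1)
B = rng.standard_normal((6, 6))
S = (B + B.T) / 2                                   # an arbitrary real symmetric matrix

eigvals, Q = np.linalg.eigh(S)                      # spectral decomposition S = Q diag(eigvals) Q^T

print(eigvals)                                      # eigenvalues are real
print(np.allclose(Q.T @ Q, np.eye(6)))              # columns of Q are orthonormal
print(np.allclose(Q @ np.diag(eigvals) @ Q.T, S))   # Q diagonalizes S
```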

The first components should show common features of the images, like the average over all images.

Lutz Lehmann

PCA was originally defined as a sequence of optimization problems in which the components (the sequence of vectors that are the optima) are required to be mutually orthogonal. Each principal component is chosen, using the optimization criterion, from the subspace orthogonal to the previously determined components, so of course all components will be orthogonal to each other.

The reformulation as an algebra problem of finding eigenvectors of the covariance matrix came later, and amounts to the theory of Rayleigh quotients, or principal axes of ellipsoids. Pearson's founding article on PCA does not use eigenvalues.
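To make the sequential picture concrete, here is an illustrative sketch (not Pearson's original procedure; the maximization at each step is simply delegated to an SVD) in which each direction is chosen to maximize the sample variance within the subspace orthogonal to the directions already found:

```python
import numpy as np

def sequential_pca(X, k):
    """Sketch of PCA as a sequence of optimization problems: each direction is the
    unit vector of maximal sample variance within the subspace orthogonal to the
    directions already found.  The maximization itself is delegated to an SVD."""
    Xc = X - X.mean(axis=0)                 # center the data
    found = []
    for _ in range(k):
        Z = Xc.copy()
        for v in found:                     # restrict to the orthogonal complement
            Z -= np.outer(Z @ v, v)
        w = np.linalg.svd(Z, full_matrices=False)[2][0]   # direction maximizing ||Z w||
        found.append(w)
    return np.array(found)

# The directions found this way are mutually orthogonal by construction.
X = np.random.default_rng(3).standard_normal((100, 8))
V = sequential_pca(X, 3)
print(np.round(V @ V.T, 6))                 # approximately the 3x3 identity matrix
```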

zyx

The matrices appearing in the PCA of $A$ come from the eigendecompositions of $A^TA$ and $AA^T$, so your question can be framed as a question about eigenvectors of real symmetric matrices.

Suppose that $v, w$ are two eigenvectors of $A = A^T$, with $Av = \lambda v$ and $Aw = \mu w$. If $\mu \neq \lambda$, then simplifying $v^T(Aw) = (Av)^T w$ we get $(\mu - \lambda)\, v^T w = 0$, and so $v^T w = 0$. This shows that different eigenspaces are orthogonal. Within any particular eigenspace, we can choose an orthonormal basis. Combining the orthonormal bases for each eigenspace, we obtain an orthonormal eigenbasis. Note that some choices were required, so the eigenbasis will not be unique, but that is fine.

As for your second question, the principal components associated with larger eigenvalues capture more of your original images, so an image reconstructed from the first few principal components should be a decent approximation, with more components giving successively better approximations. The key insight is that it (hopefully) requires less data to generate these approximations than it takes to store the original images, so you can use PCA as a means of lossy image compression.
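For a rough sense of the savings with the numbers from the question (an illustrative count, assuming one also stores the mean image and the $24$ coefficient vectors of length $5$):

$$\underbrace{24 \times (60 \cdot 50)}_{\text{original}} = 72000 \quad\text{versus}\quad \underbrace{5 \times 3000}_{\text{components}} + \underbrace{24 \times 5}_{\text{coefficients}} + \underbrace{3000}_{\text{mean image}} = 18120 \ \text{stored values}.$$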

Aaron