A matrix represents a linear transformation that rotates, scales and shears whatever you put into it; so feeding in the coordinates of a square could give you back, say, a parallelogram. An important fact is that there is a one-to-one correspondence between real matrices and linear transformations (once you fix a basis): if you can think of a linear transformation, then there is a way to write it as a matrix.
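As a minimal sketch with NumPy (the shear matrix here is a made-up example, not anything from your question), applying a matrix to the corners of the unit square does exactly this:

```python
import numpy as np

# Corners of the unit square, one (x, y) point per column.
square = np.array([[0.0, 1.0, 1.0, 0.0],
                   [0.0, 0.0, 1.0, 1.0]])

# A shear matrix, chosen purely for illustration.
A = np.array([[1.0, 1.0],
              [0.0, 1.0]])

# Applying the matrix to every corner maps the square to a parallelogram.
parallelogram = A @ square
print(parallelogram)
# [[0. 1. 2. 1.]
#  [0. 0. 1. 1.]]
```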
SVD is based on a theorem that says any matrix $\mathbf A$ can be written in the form $\mathbf{U\Sigma V}^T$, where $\mathbf U$ and $\mathbf V$ are orthogonal (rotations, possibly combined with reflections) and $\mathbf \Sigma$ is a diagonal matrix with non-negative entries that only scales. So any linear transformation can be broken down into three steps: rotate first (by $\mathbf V^T$), then stretch/scale (not necessarily by the same amount in all directions; you could stretch the x-axis twice as much as the y-axis), then rotate again (by $\mathbf U$).
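You can check this numerically; here is a minimal sketch, reusing the same illustrative shear matrix:

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [0.0, 1.0]])   # the same shear as above

# numpy returns U, the singular values, and V^T directly.
U, s, Vt = np.linalg.svd(A)

# U and Vt are orthogonal (rotations, possibly with a reflection),
# and Sigma is diagonal with non-negative entries.
Sigma = np.diag(s)
print(np.allclose(A, U @ Sigma @ Vt))  # True: A = U Sigma V^T
```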
For instance, to transform a square into a parallelogram, you could rotate by some angle, scale the axes by different factors, then rotate by another angle (in general the two angles are different; their exact values are not too important to the intuition, and they depend on which parallelogram you are aiming for).
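A sketch of reading those two angles off the SVD of the shear example (assuming NumPy, and assuming each orthogonal factor comes out as a pure rotation, i.e. with determinant $+1$; signs can vary between libraries):

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [0.0, 1.0]])   # a shear: unit square -> parallelogram
U, s, Vt = np.linalg.svd(A)

# Read the angle off each orthogonal factor (valid when det = +1,
# i.e. when the factor is a pure rotation rather than a reflection).
angle_U = np.degrees(np.arctan2(U[1, 0], U[0, 0]))
angle_V = np.degrees(np.arctan2(Vt[0, 1], Vt[0, 0]))
print(angle_U, angle_V)  # about 31.7 and 58.3 degrees, up to sign conventions
print(s)                 # scale factors, about 1.618 and 0.618
```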
Points 1 and 2 are related in the following way: a projection (point 2) is a 'simplified' transformation. Suppose you had a transformation that changes a $1 \times 1$ square into a $10 \times 0.1$ rectangle. A projection would be to simply say that this transformation changes the square into a $10 \times 0$ 'rectangle' (which is a line). This is dimensionality reduction: your 2-dimensional square is projected onto a 1-dimensional line. If you did an SVD with this, $\mathbf U$ and $\mathbf V$ would be the identity matrices, and $\mathbf \Sigma$ would be a diagonal matrix (as it always is) with entries 10 and 0.1.
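In code, that example and its projected version look like this (a minimal sketch of the numbers above):

```python
import numpy as np

# The scaling part of the 10 x 0.1 example: U = V = I, so A = Sigma.
Sigma = np.diag([10.0, 0.1])

# Replacing the small singular value with zero turns the scaling
# into a projection onto the x-axis.
Sigma_proj = np.diag([10.0, 0.0])

corner = np.array([1.0, 1.0])   # top-right corner of the unit square
print(Sigma @ corner)           # [10.   0.1]
print(Sigma_proj @ corner)      # [10.   0. ]  -- squashed onto a line
```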
The key to understanding the dimensionality reduction part is to forget about rotations for a moment: by the SVD theorem they can be factored out and 'added back in' before or after, so all you really need to know is how things scale along the different axes. The SVD strips away the rotation: a matrix that turns a square into a parallelogram can be seen as something that scales a square into a rectangle, sandwiched between two rotations. If one direction scales to a small value (relative to everything else), you can pretend it scales to zero, which, in the context of transformations, is a projection that approximates the original transformation.
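Here is a sketch of that idea on a matrix whose rotations are not trivial (the matrix is made up for illustration): zero out the small singular value and the reconstructed matrix barely changes.

```python
import numpy as np

# A matrix with one dominant direction of scaling, chosen for illustration.
A = np.array([[3.0, 2.9],
              [2.9, 3.0]])

U, s, Vt = np.linalg.svd(A)
print(s)   # one large and one small singular value: about 5.9 and 0.1

# Zero out the small singular value: a rank-1 approximation of A.
s_trunc = s.copy()
s_trunc[1] = 0.0
A_approx = U @ np.diag(s_trunc) @ Vt

print(np.abs(A - A_approx).max())  # small: the projection barely changes A
```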
To summarise the answer to your question: when your transformation is just a scaling and one of the scale factors is relatively small, you can replace that smallest scale factor with zero, and this gives you a projection. SVD tells you that every transformation can be expressed as a scaling between two rotations, and the idea of dimensionality reduction is to replace that scaling with a projection.
'Selecting the right axes' refers to the rotation: you want to rotate your shape (or data) first, so that when you do 'project away' a direction, you are sure you lose as little as possible.
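A minimal sketch of this on data rather than a shape (the synthetic points and the random seed are my own illustration): the rows of $\mathbf V^T$ are the 'right axes', and projecting onto the first one keeps almost all of the variation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 2-D points stretched along the 45-degree diagonal.
t = rng.normal(size=500)
data = np.column_stack([t + 0.05 * rng.normal(size=500),
                        t + 0.05 * rng.normal(size=500)])
data -= data.mean(axis=0)            # center before the SVD

# Vt's rows are the 'right axes'; the first points along the diagonal.
U, s, Vt = np.linalg.svd(data, full_matrices=False)
print(Vt[0])                         # roughly [0.707, 0.707], up to sign

# Projecting onto the first axis keeps almost all the variation.
coords_1d = data @ Vt[0]             # 1-D coordinates along the best axis
print(s)                             # first singular value dwarfs the second
```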