
I know that the eigenvectors of a graph's Laplacian matrix are important: as far as I understand, they capture locality over the graph. But everything I've read about the eigenvectors of a graph Laplacian concerns their smoothness and the relative values of neighboring nodes (how close the values of two neighboring nodes are within an eigenvector). Is there any intuition behind the exact values in an eigenvector?

For example, if $u_i$ is an eigenvector and $u_{ij}$ denotes the $j$-th entry of $u_i$, and $u_{ij}$ is the maximum value in $u_i$, can we say anything about node $j$? Do larger values tell us anything relative to smaller ones? Does the sign of an entry tell us anything?

I want to get an intuition about the values of an eigenvector.

3 Answers


For intuition, we want to formulate eigenvector-finding as an optimization problem.

Let $A$ be any symmetric matrix. If we minimize $\frac{\mathbf x^{\mathsf T}A \mathbf x}{\mathbf x^{\mathsf T} \mathbf x}$ over all nonzero $\mathbf x$ (or, equivalently, minimize $\mathbf x^{\mathsf T}A \mathbf x$ over all $\mathbf x$ with $\|\mathbf x\|=1$), then we get the smallest eigenvalue back, with $\mathbf x$ being its eigenvector.

This is true for the Laplacian matrix $L$, except this minimum won't be very interesting: the smallest eigenvalue is always $0$, and $(1,1,\dots,1)$ is always an eigenvector. We can look at the second-smallest eigenvalue instead, by adding an extra constraint on $\mathbf x$: we can ask that $x_1 + x_2 + \dots + x_n = 0$, so that it's perpendicular to the eigenvector of the smallest eigenvalue.

So now we have a description of the Fiedler vector of the graph: it is the vector $\mathbf x$ that minimizes $\mathbf x^{\mathsf T}L\mathbf x$ subject to $\|\mathbf x\|=1$ and $x_1 + x_2 + \dots + x_n = 0$. To make this more helpful, note that $\mathbf x^{\mathsf T}L\mathbf x$ can be written as $\sum_{ij \in E} (x_i - x_j)^2$.
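
If it helps, here is a minimal numerical sketch of that identity (using numpy, with a path on $5$ vertices as an arbitrary example): it builds $L$, checks that $\mathbf x^{\mathsf T}L\mathbf x = \sum_{ij \in E}(x_i - x_j)^2$, and reads off the Fiedler vector as the eigenvector of the second-smallest eigenvalue.

```python
import numpy as np

# A small example graph (a path on 5 vertices); any undirected graph works.
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
n = 5

# Combinatorial Laplacian L = D - A.
L = np.zeros((n, n))
for i, j in edges:
    L[i, i] += 1
    L[j, j] += 1
    L[i, j] -= 1
    L[j, i] -= 1

# x^T L x equals the sum of squared differences across edges.
x = np.random.randn(n)
assert np.isclose(x @ L @ x, sum((x[i] - x[j]) ** 2 for i, j in edges))

# Eigenvalues of a symmetric matrix in increasing order; column 0 is the
# constant vector (eigenvalue 0), column 1 is the Fiedler vector.
vals, vecs = np.linalg.eigh(L)
fiedler = vecs[:, 1]
print(vals[1], np.round(fiedler, 3))   # sums to ~0 and has unit norm
```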

The conditions $\|\mathbf x\|=1$ and $x_1 + x_2 + \dots + x_n = 0$ tell us that the Fiedler vector has to have "enough" positive and negative components; they can't all be the same. Since we're minimizing $\sum_{ij \in E} (x_i - x_j)^2$, we want to make the components on any two adjacent vertices as close together as possible.

So the Fiedler vector ends up painting the graph in a gradient that goes from positive to negative. Each individual value $x_i$ doesn't mean much by itself. But the relative values do: clusters of vertices that are close together get similar values, and far-apart vertices often get different values.

For the next eigenvector, we will add an additional constraint to our problem: we'll be looking for a vector $\mathbf y$ perpendicular to the Fiedler vector. Essentially, this says that $\mathbf y$ should have similar properties, but be different from the thing we found just now, describing a different feature of the graph.

For example, if our graph has three big and sparsely connected clusters, the Fiedler vector might assign positive values to one cluster and negative values to the other two. The next eigenvector might choose a different cluster to separate from the other two clusters. This distinguishes all the clusters, so the eigenvector after that will have to find some intra-cluster separation...
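
For concreteness, here is a small sketch of that picture (three $4$-vertex cliques joined by single bridge edges; the exact values and sign patterns depend on the graph, so this is only an illustration):

```python
import itertools
import numpy as np

# Three 4-vertex cliques, joined sparsely in a chain: A - B - C.
clusters = [list(range(0, 4)), list(range(4, 8)), list(range(8, 12))]
edges = [e for c in clusters for e in itertools.combinations(c, 2)]
edges += [(3, 4), (7, 8)]          # single bridge edges between clusters
n = 12

L = np.zeros((n, n))
for i, j in edges:
    L[i, i] += 1; L[j, j] += 1
    L[i, j] -= 1; L[j, i] -= 1

vals, vecs = np.linalg.eigh(L)
for k in (1, 2):                    # the two smallest nonzero eigenvalues
    v = vecs[:, k]
    print(k, [np.round(v[c].mean(), 2) for c in clusters])
# Each of these eigenvectors is roughly constant on each cluster; their signs
# split the three clusters apart in two different ways.
```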

Misha Lavrov
  • I was starting to write up a complementary answer talking about the random walk on the graph and how the near-1 eigenvalues of the transition probability matrix correspond to the near-0 eigenvalues of $L$, with the intuition being that these eigenvectors describe "slow probability current" between the region of positive values and negative values. But I got confused because $L$ and $P$ are not actually similar, instead $P$ is similar to the symmetric normalized Laplacian. – Ian Oct 06 '20 at 13:38
  • (Cont.) Do you have any comment to help me fill this gap? Maybe some simple relationship between the eigenvectors of $L$ and those of $D^{-1}L$? – Ian Oct 06 '20 at 13:41
  • @Ian Thanks for the comment. I think $P$ is similar to the symmetric normalized adjacency matrix, not Laplacian :) – user137927 Oct 06 '20 at 14:09
  • Thanks, Misha, for your description. It's great, but do you think there is any meaning behind the ordering of the nodes by their values? For example, suppose we order the values of the Fiedler vector and get the node order $3,2,5,4,1$ for a graph with $5$ nodes. We can say that this ordering means $3$ and $2$ are close in the graph, but we shouldn't forget that the value at $3$ is bigger than at $2$, which means its value deviates more from its neighbors. Is there any intuition behind that? – user137927 Oct 06 '20 at 14:13
  • @user137927 It is also similar to the symmetric normalized Laplacian, since $I-A$ is always similar to $A$ for whatever square matrix $A$. What I am fuzzy about is the relationship between the eigendecomposition of the Laplacian and that of the symmetric normalized Laplacian. – Ian Oct 06 '20 at 14:51
  • @user137927 The ordering doesn't necessarily mean that $3$ and $2$ are close; it means that $3$ and $1$ are far. (Vertices that are far apart could still get similar values; the implication only goes one way.) Intuitively, $3$ is "further from $1$" than $2$ is, not necessarily just in terms of distance but also in terms of the number of close connections, and that sort of thing. – Misha Lavrov Oct 06 '20 at 14:56
  • @Ian Normalizing the Laplacian or adjacency matrix is equivalent to taking a different inner product: for functions $f, g : V(G) \to \mathbb R$, the new inner product is $\sum_{v \in V(G)} f(v) g(v) \pi_v$, where $\pi_v$ is the stationary distribution of the random walk. So when we talk about each eigenvector being orthogonal to the previous ones, that means that some suitable transformation of the eigenvectors is orthogonal with respect to this inner product. – Misha Lavrov Oct 06 '20 at 14:59
  • @MishaLavrov I see. So I guess my analogy does not really help to justify examining the eigenvectors of the ordinary graph Laplacian. But you do have that the eigenvectors with small eigenvalues of the "random walk normalized Laplacian" correspond to the slowest transient probability currents. For example if you have two copies of $K_N$ and join them by a single edge, then the relevant eigenvector is approximately a constant on one copy of $K_N$ and the negative of that constant on the other copy. – Ian Oct 06 '20 at 15:26
  • @MishaLavrov Thanks for the description. I think I got it. Sorry, regarding your answer to Ian: what do you mean by "Normalizing the Laplacian or adjacency matrix is equivalent to taking a different inner product"? Can you explain it a little bit more? – user137927 Oct 06 '20 at 16:10
  • @user137927 When we replace $L$ by $D^{-1/2}LD^{-1/2}$ and then do everything I did in my answer, the vector $\mathbf x$ we get will correspond to a vector $\mathbf y = D^{-1/2}\mathbf x$ that minimizes $\mathbf y^{\mathsf T}L\mathbf y$, just as we did. But where we got the condition $x_1 + x_2 + \dots + x_n = 0$ from being orthogonal to the first eigenvector, the normalized Laplacian will instead give a different condition $d_1 y_1 + d_2 y_2 + \dots + d_n y_n = 0$, which says that $\mathbf y$ is orthogonal to $(1,1,\dots,1)$ in the different inner product I described. – Misha Lavrov Oct 06 '20 at 16:37
  • Which is notably the same orthogonality condition in the special case of regular graphs (because in this case the inner products are just multiples of each other). – Ian Oct 06 '20 at 16:56
  • Thanks to both of you – user137927 Oct 06 '20 at 18:36
  • I went ahead and wrote up that side answer. Thanks for clearing up my confusion, @MishaLavrov, it helped me to figure out what the right analogy actually was. – Ian Oct 08 '20 at 12:37

As a complementary answer, one can think about random walks on $G$. The ordinary random walk on $G$, where at each time step the walker moves to a uniformly random neighbor, is actually not so intimately connected with the ordinary Laplacian matrix $L$ (as was discussed in the comments on Misha Lavrov's answer). It is instead connected with the symmetric normalized Laplacian matrix $L_{sym}$, which is related to the transition probability matrix $P$ of the random walk by the identity $L_{sym}=I-D^{1/2} P D^{-1/2}$. The small nonzero eigenvalues of $L_{sym}$ correspond to eigenvalues of $P$ near $1$. Because $p(t)=p(0) P^t$, the left eigenvectors of $P$ with eigenvalues near $1$ correspond to "slow probability currents" between the vertices where the eigenvector is positive and the vertices where it is negative, with relatively more probability lost or gained at a vertex where the absolute value of the eigenvector is larger.
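
Here is a minimal numerical check of that identity and of the eigenvalue correspondence (using two copies of $K_5$ joined by a single edge, the example that comes up again below; the graph is only an illustration):

```python
import itertools
import numpy as np

# Two copies of K_5 joined by a single edge (vertices 0-4 and 5-9).
edges = (list(itertools.combinations(range(5), 2))
         + list(itertools.combinations(range(5, 10), 2)) + [(0, 5)])
n = 10

A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0
deg = A.sum(axis=1)
P = A / deg[:, None]                       # random-walk transition matrix D^{-1} A
D_h, D_h_inv = np.diag(np.sqrt(deg)), np.diag(1 / np.sqrt(deg))

L_sym = np.eye(n) - D_h_inv @ A @ D_h_inv
# Identity used above: L_sym = I - D^{1/2} P D^{-1/2}.
assert np.allclose(L_sym, np.eye(n) - D_h @ P @ D_h_inv)

# Eigenvalue correspondence: mu(P) = 1 - lambda(L_sym), so small lambda <-> mu near 1.
lam = np.sort(np.linalg.eigvalsh(L_sym))
mu = np.sort(np.linalg.eigvals(P).real)[::-1]
assert np.allclose(mu, 1 - lam)
print(np.round(lam[:3], 4), np.round(mu[:3], 4))
```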

Just about everything I said above does in fact apply to the Laplacian, once you change the setting. Specifically, the Laplacian is really connected to a continuous-time Markov chain on $G$ in which each edge is a possible transition and all edges have unit rate. In this situation the typical holding time at a high-degree vertex is shorter than at a low-degree vertex (unlike for the ordinary random walk). The generator matrix of this CTMC, say $A$, is the negative of the Laplacian (its diagonal has negative entries), and the probability distribution evolves as $p'=pA$, so $p(t)=p(0) e^{At}=p(0) e^{-Lt}$. So in this setting the left eigenvectors of $L$ with eigenvalues close to zero (equivalently, the eigenvectors of $e^{-L}$ with eigenvalues close to $1$) generate the slow probability currents.
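
And here is a sketch of the same two-cliques example in this continuous-time setting, showing that the component of $p(t)$ along the Fiedler vector decays at the slowest nonzero rate $e^{-\lambda_2 t}$ (again, the graph and the starting distribution are only illustrative choices):

```python
import itertools
import numpy as np

# Two copies of K_5 joined by one edge; the CTMC generator is -L.
edges = (list(itertools.combinations(range(5), 2))
         + list(itertools.combinations(range(5, 10), 2)) + [(0, 5)])
n = 10
L = np.zeros((n, n))
for i, j in edges:
    L[i, i] += 1; L[j, j] += 1
    L[i, j] -= 1; L[j, i] -= 1

vals, U = np.linalg.eigh(L)            # L = U diag(vals) U^T, since L is symmetric
lam2, fiedler = vals[1], U[:, 1]
print(np.round(fiedler, 2))            # roughly +c on one K_5, -c on the other

# Start with all probability on vertex 0 and evolve p(t) = p(0) exp(-L t).
p0 = np.zeros(n); p0[0] = 1.0
for t in (0.1, 1.0, 10.0):
    p_t = p0 @ (U @ np.diag(np.exp(-vals * t)) @ U.T)
    # The component along the Fiedler vector decays like exp(-lambda_2 t):
    # the slowest nonzero rate, i.e. the "slow current" across the bridge.
    print(t, np.round(p_t @ fiedler, 4),
          np.round(np.exp(-lam2 * t) * (p0 @ fiedler), 4))
```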

Ian
  • Usually I see $L_{sym}=I-D^{-1/2} A D^{-1/2}$, where $A$ is the adjacency matrix. Is there a reason why $L_{sym}=I-D^{1/2} P D^{-1/2}$ here? – William Ambrose Mar 07 '22 at 01:18
  • @WilliamAmbrose Because $P=D^{-1}A$. My $L_{sym}$ is the same as yours. – Ian Mar 07 '22 at 01:27

I'd like to add something very trivial and something less trivial to Misha's and Ian's excellent answers.

First, the trivial one: if $u$ is an eigenvector for an eigenvalue $\lambda$ and $\alpha$ is a nonzero constant, then $\alpha u$ is also an eigenvector for the same eigenvalue.
Put differently: only the ratio $u_{ij}/u_{ik}$ of two eigenvector components is meaningful.
Having said that, if the graph Laplacian is symmetric (undirected graphs), then all eigenvalues are real and all eigenvectors can be chosen real. In addition, you may fix the sign so that the eigenvector component with the largest absolute value is positive.
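
A tiny numerical sketch of these two points (the graph, a triangle with a pendant vertex, is an arbitrary choice):

```python
import numpy as np

# An arbitrary small graph: a triangle with a pendant vertex attached.
edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
n = 4
L = np.zeros((n, n))
for i, j in edges:
    L[i, i] += 1; L[j, j] += 1
    L[i, j] -= 1; L[j, i] -= 1

vals, vecs = np.linalg.eigh(L)
lam, u = vals[1], vecs[:, 1]

# Any rescaling alpha*u is an eigenvector for the same eigenvalue ...
alpha = -3.7
assert np.allclose(L @ (alpha * u), lam * (alpha * u))

# ... so only ratios of components carry information; a common normalization
# flips the sign so that the largest-magnitude component is positive.
if u[np.argmax(np.abs(u))] < 0:
    u = -u
print(lam, np.round(u, 3))
```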

Now for the less trivial point, extending Misha's comment on graph clusters: you may want to look at 'discrete nodal domain theorems'.
A 'nodal domain' is a connected subset of vertices on which the eigenvector does not change sign. Simply put (speaking to the intuition), the discrete nodal domain theorems say that an eigenvector of the $k$-th smallest eigenvalue of a graph Laplacian has at most $k$ nodal domains. As I understand it, this 'works' much better in the continuous case (Courant's nodal domain theorem) and is technically messier for graphs, since nodal domain borders may be below the graph's 'resolution' and vertices may lie exactly on a domain border, requiring case distinctions.
I think of this in 2 dimensions as 'Chladni figures', e.g., see https://en.wikipedia.org/wiki/Ernst_Chladni#Chladni_figures.
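
As a small sketch of the statement (a path graph, chosen because its eigenvectors are discrete cosines and the nodal count is easy to see; the counting helper below is only valid on a path):

```python
import numpy as np

# Path graph on n vertices: its eigenvectors are discrete cosines, so the
# eigenvector of the k-th smallest eigenvalue changes sign k-1 times.
n = 20
L = np.zeros((n, n))
for i in range(n - 1):
    L[i, i] += 1; L[i + 1, i + 1] += 1
    L[i, i + 1] -= 1; L[i + 1, i] -= 1

vals, vecs = np.linalg.eigh(L)

def nodal_domains_on_path(v, tol=1e-9):
    # On a path, nodal domains are maximal runs of constant sign, so it is
    # enough to count sign changes along the path (near-zero entries skipped).
    s = np.sign(np.where(np.abs(v) < tol, 0.0, v))
    s = s[s != 0]
    return 1 + int(np.sum(s[:-1] != s[1:]))

for k in range(5):
    print(k + 1, nodal_domains_on_path(vecs[:, k]))
# Prints "1 1", "2 2", ...: here the eigenvector of the k-th smallest
# eigenvalue has exactly k nodal domains, matching the theorem's upper bound.
```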

Let me close with two more comments:

  1. If you have repeated eigenvalues ('eigenvalues with multiplicity'), then you have a choice of basis for the corresponding eigenspace, and a good choice can reflect a symmetry of the set-up. For example, the eigenvectors of the star graph for the eigenvalue with non-trivial multiplicity can be chosen to be zero everywhere except on two leaves ('rays' of the star), and you can go around the rays to get a nice basis (see the sketch after this list); similarly, the eigenvector pairs of a cycle describe left- and right-moving 'currents'.
  2. Finally, large absolute values of the components of a normalized eigenvector indicate 'local activity'; you may want to look up 'eigenvector localization'. Areas with larger vertex degrees allow larger eigenvector values, but I believe there are more subtle effects ('it's not about how many people you know but who you know'), and there is a dependence on the specific choice of Laplacian (the combinatorial and normalized versions somehow do not behave the same). I may add more later... does anybody know / can anybody explain?
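
Here is a small sketch of the star-graph example from point 1 (the number of leaves is an arbitrary choice, and the two-leaf vectors below are just one convenient, non-orthogonal spanning set of the eigenvalue-$1$ eigenspace):

```python
import numpy as np

# Star graph: vertex 0 is the center, vertices 1..m are the leaves ('rays').
m = 6
n = m + 1
L = np.zeros((n, n))
for leaf in range(1, n):
    L[0, 0] += 1; L[leaf, leaf] += 1
    L[0, leaf] -= 1; L[leaf, 0] -= 1

print(np.round(np.linalg.eigvalsh(L), 6))   # 0, then 1 with multiplicity m-1, then m+1

# One convenient choice spanning the eigenvalue-1 eigenspace: vectors that are
# zero everywhere except on two consecutive leaves, going around the rays.
for leaf in range(1, m):
    v = np.zeros(n)
    v[leaf], v[leaf + 1] = 1.0, -1.0
    assert np.allclose(L @ v, 1.0 * v)      # eigenvector with eigenvalue 1
```
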
Michael T