*Course:* Math 535 - Mathematical Methods in Data Science (MMiDS)

*Author:* Sebastien Roch, Department of Mathematics, University of Wisconsin-Madison

*Updated:* Sep 21, 2020

*Copyright:* © 2020 Sebastien Roch

We first generalize the extremal characterization used in the previous section.

**Definition (Rayleigh Quotient):** Let $A \in \mathbb{R}^{d \times d}$ be a symmetric matrix. The Rayleigh quotient is defined as

$$ \mathcal{R}_A(\mathbf{u}) = \frac{\langle \mathbf{u}, A \mathbf{u} \rangle}{\langle \mathbf{u}, \mathbf{u} \rangle} $$

which is well-defined for any $\mathbf{u} \neq \mathbf{0}$ in $\mathbb{R}^{d}$. $\lhd$
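To make this concrete, here is a minimal sketch in Julia evaluating the Rayleigh quotient at a few vectors; the helper `rayleigh` and the matrix `B` are our own illustrative choices, not from the text.

```
using LinearAlgebra

# Rayleigh quotient of a symmetric matrix A at a nonzero vector u
rayleigh(A, u) = dot(u, A*u) / dot(u, u)

B = [2. 1.; 1. 2.] # arbitrary symmetric example with eigenvalues 3 and 1
rayleigh(B, [1., 1.])  # 3.0, attained at the top eigenvector
rayleigh(B, [1., -1.]) # 1.0, attained at the bottom eigenvector
```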

**Theorem (Courant-Fischer):** Let $A \in \mathbb{R}^{d \times d}$ be a symmetric matrix with spectral decomposition $A = \sum_{i=1}^d \lambda_i \mathbf{v}_i \mathbf{v}_i^T$ where $\lambda_1 \geq \cdots \geq \lambda_d$. For each $k = 1,\ldots,d$, define the subspaces

$$ \mathcal{V}_k = \mathrm{span}(\mathbf{v}_1, \ldots, \mathbf{v}_k) \quad\text{and}\quad \mathcal{W}_{d-k+1} = \mathrm{span}(\mathbf{v}_k, \ldots, \mathbf{v}_d). $$
Then, for all $k = 1,\ldots,d$,

$$ \lambda_k = \min_{\mathbf{u} \in \mathcal{V}_k} \mathcal{R}_A(\mathbf{u}) = \max_{\mathbf{u} \in \mathcal{W}_{d-k+1}} \mathcal{R}_A(\mathbf{u}). $$

Furthermore, we have the following min-max formulas, which do not depend on the choice of spectral decomposition, for all $k = 1,\ldots,d$:

$$ \lambda_k = \max_{\mathrm{dim}(\mathcal{V}) = k} \min_{\mathbf{u} \in \mathcal{V}} \mathcal{R}_A(\mathbf{u}) = \min_{\mathrm{dim}(\mathcal{W}) = d-k+1} \max_{\mathbf{u} \in \mathcal{W}} \mathcal{R}_A(\mathbf{u}). $$

*Proof idea:* For the local formula, we expand a vector in $\mathcal{V}_k$ into the basis $\mathbf{v}_1,\ldots,\mathbf{v}_k$ and use the fact that $\mathcal{R}_A(\mathbf{v}_i) = \lambda_i$ and that eigenvalues are in non-increasing order. The global formulas then follow from a dimension argument.
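The local formula can also be checked numerically. The sketch below (reusing the `rayleigh` helper above) draws a random symmetric matrix and verifies that the Rayleigh quotient of a random vector in $\mathcal{V}_k$ is at least $\lambda_k$, with equality at $\mathbf{v}_k$.

```
using LinearAlgebra

A = Symmetric(randn(5, 5)) # random symmetric test matrix
F = eigen(A)               # for symmetric input, eigenvalues come in increasing order
λ = reverse(F.values)           # reorder so that λ₁ ≥ ⋯ ≥ λ₅
V = reverse(F.vectors, dims=2)  # columns are v₁, …, v₅

k = 2
u = V[:, 1:k] * randn(k)        # random vector in V_k = span(v₁, …, v_k)
rayleigh(A, u) ≥ λ[k] - 1e-10   # true: the minimum over V_k is λ_k
rayleigh(A, V[:, k]) ≈ λ[k]     # true: attained at v_k
```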

We record one important special case of Courant-Fischer for the Laplacian.

**Corollary (Extremal Characterization of $\mu_2$):** Let $G = (V, E)$ be a graph with $n = |V|$ vertices. Assume the Laplacian $L$ of $G$ has spectral decomposition $L = \sum_{i=1}^n \mu_i \mathbf{y}_i \mathbf{y}_i^T$ with $0 = \mu_1 \leq \mu_2 \leq \cdots \leq \mu_n$ and $\mathbf{y}_1 = \frac{1}{\sqrt{n}}(1,\ldots,1)^T$. Then

$$ \mu_2 = \min\left\{ \sum_{\{u, v\} \in E} (x_u - x_v)^2 \,:\, \mathbf{x} = (x_1, \ldots, x_n)^T,\ \sum_{u=1}^n x_u = 0,\ \sum_{u=1}^n x_u^2 = 1 \right\}. $$
One application of this extremal characterization is the graph drawing heuristic we described previously. Consider the entries of the second Laplacian eigenvector $\mathbf{y}_2$, normalized to have unit norm. These entries are centered around $0$ by the condition $\sum_{u=1}^n x_{u} = 0$. Because $\mathbf{y}_2$ minimizes the quantity

$$ \frac{\sum_{\{u, v\} \in E} (x_u - x_v)^2}{\sum_{u = 1}^n x_u^2} $$

over all centered unit vectors, $\mathbf{y}_2$ tends to assign similar coordinates to adjacent vertices. A similar reasoning applies to the third Laplacian eigenvector, which in addition is orthogonal to the second one.

In [2]:

```
using LightGraphs, GraphPlot, Plots, LinearAlgebra, Colors # load packages (if not already loaded earlier)
n = 10
g = path_graph(n) # path graph on n vertices
```

Out[2]:

{10, 9} undirected simple Int64 graph

In [3]:

```
gplot(g, nodelabel=1:n)
```

Out[3]:

[Plot: the path graph on 10 vertices, drawn with vertex labels]

In [4]:

```
L = Array(laplacian_matrix(g)) # Laplacian as a dense matrix
F = eigen(L) # spectral decomposition; eigenvalues in increasing order
v = F.vectors # eigenvectors as columns
plot(v[:,2], legend=false, marker=:diamond) # entries of the second eigenvector
```

Out[4]:

[Plot: entries of the second Laplacian eigenvector against vertex index]

Let $G=(V, E)$ be a graph. Imagine that we are interested in finding a good cut. That is, roughly speaking, we seek to divide its vertex set into two disjoint subsets so as to achieve two goals simultaneously:

- the two sets have relatively few edges between them
- neither set is too small.

We will show that the Laplacian eigenvectors provide useful information for performing this kind of graph cutting. First we formulate the problem formally.

**Definition (Isoperimetric Number):** Let $G=(V, E)$ be a graph. A cut is a bipartition $(S, S^c)$ of the vertices of $G$, where $S$ and $S^c$ are non-empty subsets of $V$. For short, we refer to $\emptyset \neq S \subset V$ as a cut. The size of the cut, $|\partial S|$, is the number of edges from $S$ to $S^c$, where the edge boundary is defined as

$$ \partial S = \left\{ \{i, j\} \in E \,:\, i \in S,\ j \in S^c \right\}. $$
The cut ratio of $S$ is

$$ \phi(S) = \frac{|\partial S|}{\min\{|S|, |S^c|\}} $$

and the isoperimetric number (or Cheeger constant) of $G$ is the minimum cut ratio over all of its cuts, that is,

$$ \phi_G = \min_{\emptyset \neq S \subset V} \phi(S). $$

$\lhd$
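On small graphs, $\phi_G$ can be computed by brute force over all cuts, which gives a useful baseline for the spectral method below. Here is a minimal sketch; the function `isoperimetric` is our own, not part of LightGraphs.

```
using LightGraphs

# brute-force Cheeger constant: try every proper, nonempty subset S of the vertices
function isoperimetric(g)
    n, best = nv(g), Inf
    for mask in 1:(2^n - 2) # each bitmask encodes membership in S
        S = [i for i in 1:n if (mask >> (i-1)) & 1 == 1]
        boundary = count(e -> (src(e) in S) ⊻ (dst(e) in S), edges(g))
        best = min(best, boundary / min(length(S), n - length(S)))
    end
    return best
end

isoperimetric(path_graph(4)) # 0.5: cut the middle edge, |∂S| = 1, min(|S|,|S^c|) = 2
```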

**Theorem (Cheeger Inequality):** Let $G = (V, E)$ be a graph with $n = |V|$ vertices and maximum degree $\bar{\delta}$. Let $0 = \mu_1 \leq \mu_2 \leq \cdots \leq \mu_n$ be its Laplacian eigenvalues. Then

$$ \frac{\phi_G^2}{2 \bar{\delta}} \leq \mu_2 \leq 2 \phi_G. $$
*Proof idea:* We find an appropriate test vector to plug into the extremal characterization of $\mu_2$ and link it to $\phi_G$.

**Constructing a good test vector:** We construct an $\mathbf{x}$ that provides a good upper bound on $\mu_2$. Let $\emptyset \neq S \subset V$ be a nonempty, proper subset of $V$ such that $0 < |S| \leq \frac{1}{2}|V|$. Consider the vector $\mathbf{x} \in \mathbb{R}^n$ with entries

$$ x_i = \begin{cases} -|S^c| & \text{if } i \in S\\ |S| & \text{if } i \in S^c. \end{cases} $$

This choice ensures that

$$ \sum_{i=1}^n x_i = \sum_{i\in S^c} |S| + \sum_{i\in S} (- |S^c|) = |S^c| |S| - |S| |S^c| = 0 $$

and

$$ \begin{align} \sum_{i=1}^n x_i^2 &= \sum_{i\in S^c} |S|^2 + \sum_{i\in S} (- |S^c|)^2\\ &= |S^c| |S|^2 + |S| |S^c|^2\\ &= |S^c| |S| (|S| + |S^c|)\\ &= n |S^c| |S|. \end{align} $$

So

$$ \sum_{\{i, j\} \in E} (x_i - x_j)^2 = \sum_{\{i, j\} \in \partial S} (|S| + |S^c|)^2 = n^2 |\partial S| $$

where we used that an edge $\{i, j\}$ with both endvertices in $S$, or both in $S^c$, contributes $0$, while for each edge $\{i, j\} \in \partial S$ one endvertex is in $S$ and one endvertex is in $S^c$, so that $x_i - x_j$ is either $|S| + |S^c| = n$ or $-(|S| + |S^c|) = -n$.
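These identities are easy to confirm numerically. Here is a quick check on the path graph, with the hypothetical choice $S = \{1, \ldots, 5\}$ (so $|S| = |S^c| = 5$ and $|\partial S| = 1$):

```
using LightGraphs

g = path_graph(10)
S = 1:5 # hypothetical cut: the first half of the path
x = [i in S ? -5.0 : 5.0 for i in 1:10] # x_i = -|S^c| on S and |S| on S^c
sum(x)    # 0.0, as claimed
sum(x.^2) # 250.0 = n |S| |S^c| = 10 * 5 * 5
sum((x[src(e)] - x[dst(e)])^2 for e in edges(g)) # 100.0 = n^2 |∂S|
```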

**Using the definition of the isoperimetric number:** Since $\mathbf{x}$ is centered, the extremal characterization of $\mu_2$ gives, for this choice of $\mathbf{x}$,

$$ \mu_2 \leq \frac{\sum_{\{i, j\} \in E} (x_i - x_j)^2}{\sum_{i=1}^n x_i^2} = \frac{n^2 |\partial S|}{n |S| |S^c|} = \frac{n |\partial S|}{|S| |S^c|} \leq \frac{2 |\partial S|}{|S|} = 2 \phi(S) $$

where we used that $|S^c| \geq n/2$ and that $\min\{|S|, |S^c|\} = |S|$. This inequality holds for any $S$ such that $0 < |S| \leq \frac{1}{2}|V|$. In particular, it holds for the $S$ producing the smallest value. Hence, by the definition of the isoperimetric number (see the exercise above), we get

$$ \mu_2 \leq 2 \phi_G $$

as claimed. $\square$
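As a numerical sanity check on this upper bound, we can compare $\mu_2$ to $2 \phi_G$ on the path graph, reusing the brute-force `isoperimetric` sketch from above:

```
using LightGraphs, LinearAlgebra

g = path_graph(10)
L = Array(laplacian_matrix(g))
μ₂ = eigen(L).values[2] # ≈ 0.098
2 * isoperimetric(g)    # 0.4 = 2 * (1/5); indeed μ₂ ≤ 2 φ_G
```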

**An algorithm:** Let $\mathbf{y}_2 \in \mathbb{R}^n$ be the unit-norm eigenvector of $L$ associated to $\mu_2$. There is one entry of $\mathbf{y}_2 = (y_{2,1}, \ldots, y_{2,n})^T$ for each vertex of $G$. We use these entries to embed the graph $G$ in $\mathbb{R}$: vertex $i$ is mapped to $y_{2,i}$. Now order the entries $y_{2,\pi(1)}, \ldots, y_{2,\pi(n)}$, where $\pi$ is a permutation, that is, a re-ordering of $1,\ldots,n$. Specifically, $\pi(1)$ is the vertex corresponding to the smallest entry of $\mathbf{y}_{2}$, $\pi(2)$ is the second smallest, and so on. We consider only cuts of the form

$$ S_k = \{\pi(1), \ldots, \pi(k)\}, \qquad k = 1, \ldots, n-1, $$
and we choose the $k \leq n/2$ that minimizes the cut ratio

$$ \phi(S_k) = \frac{|\partial S_k|}{|S_k|}. $$

What can be proved rigorously is that there exists some $k$ such that

$$ \sum_{\{u, v\} \in E} (y_{2,u} - y_{2,v})^2 \geq \frac{\phi(S_k)^2}{2 \bar{\delta}} $$

which implies the Cheeger Inequality: since $\mathbf{y}_2$ has unit norm, the left-hand side is exactly $\mu_2$, and $\phi(S_k) \geq \phi_G$.

In [5]:

```
function cut_ratio(A, order, k, n)
    edge_boundary = 0 # initialize size of edge boundary
    for i=1:k # for all vertices before the cut
        for j=k+1:n # for all vertices after the cut
            edge_boundary += A[order[i],order[j]] # adds one if {order[i], order[j]} is in E
        end
    end
    denominator = min(k, n-k)
    return edge_boundary/denominator
end
```

Out[5]:

cut_ratio (generic function with 1 method)

In [6]:

```
function spectral_cut2(A)
    n = size(A,1) # number of vertices
    L = Diagonal(reshape(sum(A,dims=2), n)) - A # construct Laplacian
    v = eigvecs(L) # eigenvectors of L
    order = sortperm(v[:,2]) # index of entries in increasing order
    phi = zeros(Float64, n-1) # initialize cut ratios
    for k=1:n-1
        phi[k] = cut_ratio(A, order, k, n)
    end
    (phimin, imin) = findmin(phi) # find best cut ratio
    return order[1:imin], order[imin+1:end]
end
```

Out[6]:

spectral_cut2 (generic function with 1 method)

In [7]:

```
n = 10
g = path_graph(n)
gplot(g, nodelabel=1:n)
```

Out[7]:

[Plot: the path graph on 10 vertices, drawn with vertex labels]

In [8]:

```
A = Array(adjacency_matrix(g)) # adjacency matrix as a dense array
degrees = reshape(sum(A,dims=2), n) # vertex degrees
D = Diagonal(degrees) # degree matrix
L = D - A # Laplacian
```

Out[8]:

```
10×10 Array{Int64,2}:
  1  -1   0   0   0   0   0   0   0   0
 -1   2  -1   0   0   0   0   0   0   0
  0  -1   2  -1   0   0   0   0   0   0
  0   0  -1   2  -1   0   0   0   0   0
  0   0   0  -1   2  -1   0   0   0   0
  0   0   0   0  -1   2  -1   0   0   0
  0   0   0   0   0  -1   2  -1   0   0
  0   0   0   0   0   0  -1   2  -1   0
  0   0   0   0   0   0   0  -1   2  -1
  0   0   0   0   0   0   0   0  -1   1
```

In [9]:

```
F = eigen(L)
lambda = F.values # eigenvalues in increasing order
plot(lambda, legend=false, marker=:diamond)
```

Out[9]:

[Plot: the Laplacian eigenvalues of the path graph in increasing order]

In [10]:

```
v = F.vectors
scatter(v[:,2], zeros(n), legend=false, ylims=(-1,1)) # vertices embedded on the real line
```

Out[10]:

[Plot: entries of the second eigenvector scattered on the real line]

In [11]:

```
(s, sc) = spectral_cut2(A)
```

Out[11]:

([1, 2, 3, 4, 5], [6, 7, 8, 9, 10])

In [12]:

```
function viz_cut(g, s; layout=spectral_layout)
    n = nv(g)
    assign = ones(Int64, n) # group 1 by default
    assign[s] .= 2 # vertices in s form group 2
    colors = [colorant"lightseagreen", colorant"orange"]
    nodecolor = colors[assign] # color vertices by group
    gplot(g, nodelabel=1:n, nodefillc=nodecolor, layout=layout)
end
```

Out[12]:

viz_cut (generic function with 1 method)

In [14]:

```
g = LightGraphs.SimpleGraphs.grid([4, 7])
A = Array(adjacency_matrix(g))
(s, sc) = spectral_cut2(A)
viz_cut(g, s)
```

Out[14]:

[Plot: the 4-by-7 grid graph with the two sides of the spectral cut in different colors]