*Course:* Math 535 - Mathematical Methods in Data Science (MMiDS)

*Author:* Sebastien Roch, Department of Mathematics, University of Wisconsin-Madison

*Updated:* Sep 28, 2020

*Copyright:* © 2020 Sebastien Roch

We review spectral decompositions and elements of graph theory.

We first review eigenvalues and eigenvectors.

As before, we work on $\mathbb{R}^d$.

**Definition (Eigenvalues and eigenvectors):** Let $A \in \mathbb{R}^{d \times d}$ be a square matrix. Then $\lambda \in \mathbb{R}$ is an eigenvalue of $A$ if there exists a nonzero vector $\mathbf{x} \neq \mathbf{0}$ such that

$$ A \mathbf{x} = \lambda \mathbf{x}. $$

The vector $\mathbf{x}$ is referred to as an eigenvector. $\lhd$

As the next example shows, not every matrix has an eigenvalue.

**Example (No Real Eigenvalues):** Set $d = 2$ and let

$$ A = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}. $$

For $\lambda$ to be an eigenvalue, there must be a nonzero eigenvector $\mathbf{x} = (x_1, x_2)^T$ such that

$$ A \mathbf{x} = \lambda \mathbf{x}, $$

or, put differently,

$$ - x_2 = \lambda x_1 \quad\text{and}\quad x_1 = \lambda x_2. $$

Substituting one equation into the other, it must be that

$$ - x_2 = \lambda^2 x_2 \quad\text{and}\quad x_1 = - \lambda^2 x_1. $$

Because $x_1, x_2$ cannot both be $0$, $\lambda$ must satisfy the equation

$$ \lambda^2 = -1, $$

for which there is no real solution. $\lhd$
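The claim can be checked numerically. A minimal sketch in Julia (assuming the standard LinearAlgebra library is available): `eigen` applied to this matrix returns a complex conjugate pair of eigenvalues, none of them real.

```julia
using LinearAlgebra

# rotation by 90 degrees: no real eigenvalues
A = [0.0 -1.0; 1.0 0.0]
F = eigen(A)
F.values  # a complex conjugate pair (up to ordering)
```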

In general, $A \in \mathbb{R}^{d \times d}$ has at most $d$ distinct eigenvalues.

**Lemma (Number of Eigenvalues):** Let $A \in \mathbb{R}^{d \times d}$ and
let $\lambda_1, \ldots, \lambda_m$ be distinct eigenvalues of $A$ with corresponding
nonzero eigenvectors $\mathbf{x}_1, \ldots, \mathbf{x}_m$. Then $\mathbf{x}_1, \ldots, \mathbf{x}_m$ are linearly independent. As a result, $m \leq d$.

*Proof idea:* We use the *Linear Dependence Lemma* and the distinctness of the eigenvalues to derive a contradiction.

*Proof:* Assume by contradiction that $\mathbf{x}_1, \ldots, \mathbf{x}_m$ are linearly dependent. By the *Linear Dependence Lemma*, there is $k \leq m$ such that
$$
\mathbf{x}_k \in \mathrm{span}(\mathbf{x}_1, \ldots, \mathbf{x}_{k-1})
$$

where $\mathbf{x}_1, \ldots, \mathbf{x}_{k-1}$ are linearly independent. In particular, there are $a_1, \ldots, a_{k-1}$ such that

$$ \mathbf{x}_k = a_1 \mathbf{x}_1 + \cdots + a_{k-1} \mathbf{x}_{k-1}. $$

Transform the equation above in two ways: (1) multiply both sides by $\lambda_k$, which gives

$$ \lambda_k \mathbf{x}_k = a_1 \lambda_k \mathbf{x}_1 + \cdots + a_{k-1} \lambda_k \mathbf{x}_{k-1}; $$

and (2) apply $A$ to both sides, which, using $A \mathbf{x}_i = \lambda_i \mathbf{x}_i$, gives

$$ \lambda_k \mathbf{x}_k = A \mathbf{x}_k = a_1 \lambda_1 \mathbf{x}_1 + \cdots + a_{k-1} \lambda_{k-1} \mathbf{x}_{k-1}. $$

Then subtract the resulting equations. That leads to

$$ \mathbf{0} = a_1 (\lambda_k - \lambda_1) \mathbf{x}_1 + \cdots + a_{k-1} (\lambda_k - \lambda_{k-1}) \mathbf{x}_{k-1}. $$

Because the $\lambda_i$'s are distinct and $\mathbf{x}_1, \ldots, \mathbf{x}_{k-1}$ are linearly independent, we must have $a_1 = \cdots = a_{k-1} = 0$. But that implies that $\mathbf{x}_k = \mathbf{0}$, a contradiction.

For the second claim, if there were more than $d$ distinct eigenvalues, then there would be more than $d$ corresponding linearly independent eigenvectors by the first claim, a contradiction. $\square$

**Example (Diagonal (and Similar) Matrices):** We use the notation $\mathrm{diag}(\lambda_1,\ldots,\lambda_d)$ for the diagonal matrix with diagonal entries $\lambda_1,\ldots,\lambda_d$. The upper bound in the *Number of Eigenvalues Lemma* can be achieved, for instance, by diagonal matrices with distinct diagonal entries $A = \mathrm{diag}(\lambda_1, \ldots, \lambda_d)$. Each standard basis vector $\mathbf{e}_i$ is then an eigenvector, since

$$ A \mathbf{e}_i = \lambda_i \mathbf{e}_i. $$

More generally, let $A$ be similar to a matrix $D = \mathrm{diag}(\lambda_1, \ldots, \lambda_d)$ with distinct diagonal entries, that is, there exists a nonsingular matrix $P$ such that

$$ A = P D P^{-1}. $$

Let $\mathbf{p}_1, \ldots, \mathbf{p}_d$ be the columns of $P$. Note that, because the columns of $P$ form a basis of $\mathbb{R}^d$, the entries of the vector $\mathbf{c} = P^{-1} \mathbf{x}$ are the coefficients of the unique linear combination of the $\mathbf{p}_i$'s equal to $\mathbf{x}$. Indeed, $P \mathbf{c} = \mathbf{x}$. Hence, $A \mathbf{x}$ can be thought of as: (1) expressing $\mathbf{x}$ in the basis $\mathbf{p}_1, \ldots, \mathbf{p}_d$, and (2) scaling the $\mathbf{p}_i$'s by the corresponding $\lambda_i$'s. In particular, the $\mathbf{p}_i$'s are eigenvectors of $A$ since, by the above, $P^{-1} \mathbf{p}_i = \mathbf{e}_i$ and so

$$ A \mathbf{p}_i = P D P^{-1} \mathbf{p}_i = P D \mathbf{e}_i = P (\lambda_i \mathbf{e}_i) = \lambda_i \mathbf{p}_i. $$

$\lhd$
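To illustrate (a sketch, with $P$ and $D$ chosen arbitrarily for the example), one can build $A = P D P^{-1}$ in Julia and check that the columns of $P$ are indeed eigenvectors:

```julia
using LinearAlgebra

P = [1.0 1.0; 0.0 1.0]      # nonsingular; columns p_1, p_2
D = Diagonal([2.0, 3.0])    # distinct diagonal entries
A = P * D * inv(P)
# each column p_i satisfies A p_i = lambda_i p_i
A * P[:,1] ≈ 2.0 * P[:,1]   # true
A * P[:,2] ≈ 3.0 * P[:,2]   # true
```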

**NUMERICAL CORNER** In Julia, the eigenvalues and eigenvectors of a matrix can be computed using `eigen`.

In [1]:

```
#Julia version: 1.5.1
using LinearAlgebra, LightGraphs, GraphPlot
```

In [2]:

```
A = [2.5 -0.5; -0.5 2.5]
```

Out[2]:

In [3]:

```
eigen(A)
```

Out[3]:

When $A$ is symmetric, that is, $a_{ij} = a_{ji}$ for all $i,j$, a remarkable result is that $A$ is similar to a diagonal matrix by an orthogonal transformation. Put differently, there exists an orthonormal basis of $\mathbb{R}^d$ made of eigenvectors of $A$. We will prove this result later in this topic.

**Theorem (Spectral Theorem):** Let $A \in \mathbb{R}^{d \times d}$ be a symmetric matrix, that is, $A^T = A$. Then $A$ has $d$ orthonormal eigenvectors $\mathbf{q}_1, \ldots, \mathbf{q}_d$ with corresponding (not necessarily distinct) real eigenvalues $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_d$. In matrix form, this is written as the matrix factorization

$$ A = Q \Lambda Q^T, $$

where $Q$ has columns $\mathbf{q}_1, \ldots, \mathbf{q}_d$ and $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_d)$. We refer to this factorization as a spectral decomposition of $A$.
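A minimal numerical illustration of the theorem (a sketch, reusing the symmetric matrix from the numerical corner above): `eigen` on a symmetric matrix returns real eigenvalues and orthonormal eigenvectors, from which $A$ can be reconstructed.

```julia
using LinearAlgebra

A = [2.5 -0.5; -0.5 2.5]     # symmetric
F = eigen(Symmetric(A))      # real eigenvalues (Julia sorts them in ascending order)
Q, Λ = F.vectors, Diagonal(F.values)
Q' * Q ≈ I                   # columns of Q are orthonormal
Q * Λ * Q' ≈ A               # the spectral decomposition recovers A
```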

The eigenvalues of a symmetric matrix - while real - may be negative. There is however an important special case where the eigenvalues are nonnegative: positive semidefinite matrices. Recall that a symmetric matrix $A$ is positive semidefinite, denoted $A \succeq 0$, if $\langle \mathbf{x}, A \mathbf{x} \rangle \geq 0$ for all $\mathbf{x} \in \mathbb{R}^d$.

**Theorem (Characterization of Positive Semidefiniteness):** Let $A \in \mathbb{R}^{d \times d}$ be a symmetric matrix and let $A = Q \Lambda Q^T$ be a spectral decomposition of $A$ with $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_d)$. Then $A \succeq 0$ if and only if its eigenvalues $\lambda_1, \ldots, \lambda_d$ are nonnegative.

*Proof:* Assume $A \succeq 0$. Let $\mathbf{q}_i$ be a unit eigenvector of $A$ with corresponding eigenvalue $\lambda_i$. Then

$$ \langle \mathbf{q}_i, A \mathbf{q}_i \rangle = \langle \mathbf{q}_i, \lambda_i \mathbf{q}_i \rangle = \lambda_i \|\mathbf{q}_i\|^2 = \lambda_i, $$

which must be nonnegative by definition of a positive semidefinite matrix.

In the other direction, assume $\lambda_1, \ldots, \lambda_d \geq 0$. Note that the vector $Q^T \mathbf{x}$ has entries $\langle \mathbf{q}_i, \mathbf{x}\rangle$, $i=1,\ldots, d$. Thus

$$ \langle \mathbf{x}, A \mathbf{x} \rangle = \mathbf{x}^T Q \Lambda Q^T \mathbf{x} = \sum_{i=1}^d \lambda_i \langle \mathbf{q}_i, \mathbf{x}\rangle^2, $$

which is necessarily nonnegative. $\square$
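As a sketch of this equivalence (with an arbitrarily chosen symmetric matrix), one can compare the sign of the eigenvalues with the quadratic form directly:

```julia
using LinearAlgebra

A = [2.0 -1.0; -1.0 2.0]          # symmetric, with eigenvalues 1 and 3
all(eigvals(Symmetric(A)) .>= 0)  # true, so A is positive semidefinite
x = randn(2)
dot(x, A * x) >= 0                # the quadratic form is nonnegative for any x
```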

We will need some basic notions from graph theory.

We start with undirected graphs.

**Definition (Undirected Graph):** An undirected graph (or graph for short) is a pair $G = (V,E)$ where $V$ is the set of vertices (or nodes) and

$$ E \subseteq \{\{u,v\} : u, v \in V,\ u \neq v\} $$

is the set of edges. $\lhd$

**Example (Petersen Graph):** The following example is called the Petersen graph. Here the circles are vertices (of which there are $10$) and the segments connecting them represent edges (of which there are $15$).


$\lhd$

We occasionally write $V(G)$ and $E(G)$ for the vertices and edges of the graph $G$. In our case, the set of vertices $V$ is always finite.

**Definition (Incidence and Adjacency):** A vertex $v \in V$ is incident with an edge $e \in E$ if $v \in e$. The incident vertices of an edge are called its endvertices. Two vertices $u,v \in V$ are adjacent (or neighbors), which we denote by $u \sim v$, if $\{u,v\} \in E$. $\lhd$

**Definition (Neighborhood and Degrees):** The set of adjacent vertices of $v$, denoted by $N(v)$, is called the neighborhood of $v$ and its size, i.e., $\delta(v):=|N(v)|$, is the degree of $v$. $\lhd$

A vertex $v$ with $\delta(v) = 0$ is called isolated. A graph is called $d$-regular if all its degrees are $d$.

**Example (Petersen, continued):** Returning to the Petersen graph, all its vertices have degree $3$, that is, it is $3$-regular. In particular there is no isolated vertex. $\lhd$

**Definition (Subgraph):** A subgraph of $G = (V,E)$ is a graph $G' = (V',E')$ with $V' \subseteq V$ and $E' \subseteq E$. The subgraph $G'$ is said to be induced if

$$ E' = \{\{u,v\} \in E : u, v \in V'\}, $$

that is, it contains all edges of $G$ between the vertices of $V'$. In that case the notation $G' := G[V']$ is used. $\lhd$

Implicit in this last definition is the fact that the edges in $E'$ are incident only to $V'$.

A subgraph is said to be spanning if $V' = V$. A subgraph containing all possible edges between its vertices is called a complete subgraph or clique.

**Example (Petersen, continued):** The Petersen graph contains no triangle (that is, no complete subgraph on $3$ vertices), induced or not. $\lhd$

**Definition (Paths):** A path in $G$ is a sequence of (not necessarily distinct) vertices $x_0 \sim x_1 \sim \cdots \sim x_k$. The number of edges, $k$, is called the length of the path. If the endvertices $x_0$, $x_k$ coincide, that is, $x_0 = x_k$, we call the path a cycle. If the vertices are all distinct (except possibly for the endvertices), we say that the path (or cycle) is self-avoiding. $\lhd$

A self-avoiding path or cycle can be seen as a (not necessarily induced) subgraph of $G$. We write $u \leftrightarrow v$ if there is a path between $u$ and $v$. The relation $\leftrightarrow$ is in fact an equivalence relation. The equivalence classes, that is, the maximal subsets $U \subseteq V$ such that $u \leftrightarrow v$ for all $u, v \in U$, are called connected components and form a partition of $V$. The length of the shortest self-avoiding path connecting two distinct vertices $u, v$ is called the graph distance between $u$ and $v$, denoted by $\rho(u,v)$.
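For instance, in LightGraphs.jl the function `gdistances` computes the graph distance from a source vertex to every other vertex. A brief sketch on the Petersen graph:

```julia
using LightGraphs

g = smallgraph(:petersen)
d = gdistances(g, 1)  # graph distances from vertex 1 to all vertices
maximum(d)            # equals 2: every vertex is within distance 2 of vertex 1
```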

**Definition (Connectedness):** A graph is connected if any two of its vertices are linked by a path, that is, if $u \leftrightarrow v$ for all $u, v \in V$. Or put differently, if there is only one connected component. $\lhd$

**Example (Petersen, continued):** The Petersen graph is connected. $\lhd$

A forest is a graph with no self-avoiding cycle. A tree is a connected forest. Vertices of degree $1$ are called leaves. A spanning tree of $G$ is a subgraph which is a tree and is also spanning.

**NUMERICAL CORNER** In Julia, the LightGraphs.jl package provides many functionalities for defining, modifying and plotting graphs. For instance, many standard graphs can be defined conveniently. The `smallgraph` function is one way to do this. The following defines the Petersen graph.

In [7]:

```
g = smallgraph(:petersen)
```

Out[7]:

This graph can be plotted using the function `gplot` of the GraphPlot.jl package.

In [8]:

```
gplot(g)
```

Out[8]:

Other standard graphs can be generated with special functions, e.g. complete graphs using `complete_graph`. See the LightGraphs.jl documentation for a complete list.

In [9]:

```
g = complete_graph(3)
```

Out[9]:

In [10]:

```
gplot(g)
```

Out[10]:

In [11]:

```
g = path_graph(10)
gplot(g)
```

Out[11]:

In [12]:

```
nv(g) # number of vertices
```

Out[12]:

In [13]:

```
ne(g) # number of edges
```

Out[13]:

In [14]:

```
has_vertex(g, 8) # checks whether the graph has a particular vertex
```

Out[14]:

In [15]:

```
has_vertex(g, 11)
```

Out[15]:

In [16]:

```
has_edge(g, 1, 2) # checks whether the graph has a particular edge
```

Out[16]:

In [17]:

```
has_edge(g, 1, 3)
```

Out[17]:

In [18]:

```
neighbors(g, 3) # returns a list of neighbors of the specified vertex
```

Out[18]:

In [19]:

```
is_connected(g) # checks whether the graph is connected
```

Out[19]:

In [20]:

```
connected_components(g) # returns the connected components
```

Out[20]:

In [21]:

```
for e in edges(g) # iterate over all edges of a graph
@show src(e), dst(e) # prints source and destination vertices of the edge
end
```

Another way of specifying a graph is to start with an empty graph with a given number of vertices and then add edges one by one. The following command creates a graph with $4$ vertices and no edge (see `SimpleGraph`).

In [22]:

```
g = SimpleGraph(4)
```

Out[22]:

In [23]:

```
add_edge!(g, 1, 2)
add_edge!(g, 3, 4)
add_edge!(g, 1, 4)
add_edge!(g, 4, 1); # this edge already exists in an undirected graph, so add_edge! returns false
```

In [24]:

```
label = 1:nv(g)
gplot(g, nodelabel=label)
```

Out[24]:

We will also need directed graphs.

**Definition (Directed Graph):** A directed graph (or digraph for short) is a pair $G = (V,E)$ where $V$ is a set of vertices (or nodes) and $E \subseteq V^2$ is a set of directed edges (or arcs). $\lhd$

Note that, unlike the undirected case, in a digraph the edges are ordered pairs - which is taken to mean that they have an orientation. If $e = (i,j) \in E$ is an edge in a digraph $G = (V,E)$, then $i$ is called the source of $e$ and $j$ is the destination.

**Example:** Here is an example:


Note that a pair of vertices can have arcs between them in both directions simultaneously. $\lhd$

The definitions discussed in the undirected case can be adapted to the directed case.

**Definition (Directed Path):** A directed path is a sequence of vertices $x_0, \ldots, x_k$ with $(x_{i-1},x_i) \in E$ for all $i=1,\ldots,k$. We write $u \to v$ if there is such a path with $x_0 = u$ and $x_k = v$. $\lhd$

**Definition (Communication):** We say that $u,v \in V$ communicate, which we denote by $u \leftrightarrow v$, if $u \to v$ and $v \to u$. The $\leftrightarrow$
relation is again an equivalence relation. The equivalence classes of $\leftrightarrow$ are called the strongly connected components of $G$. $\lhd$

**NUMERICAL CORNER** The package LightGraphs.jl also supports digraphs.

In [25]:

```
g = star_digraph(5)
```

Out[25]:

In [26]:

```
gplot(g)
```

Out[26]:

Another way of specifying a digraph is to start with an empty digraph with a given number of vertices and then add edges one by one (compare to the undirected case above). The following command creates a digraph with $4$ vertices and no edge (see `SimpleDiGraph`).

In [27]:

```
g = SimpleDiGraph(4)
```

Out[27]:

In [28]:

```
add_edge!(g, 1, 2)
add_edge!(g, 3, 4)
add_edge!(g, 1, 4)
add_edge!(g, 4, 1);
```

In [29]:

```
label = 1:nv(g)
gplot(g, nodelabel=label)
```

Out[29]:
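The strongly connected components of a digraph can be computed with `strongly_connected_components`. A brief sketch (the digraph above is rebuilt here so the snippet is self-contained):

```julia
using LightGraphs

g = SimpleDiGraph(4)
add_edge!(g, 1, 2); add_edge!(g, 3, 4); add_edge!(g, 1, 4); add_edge!(g, 4, 1)
# vertices 1 and 4 communicate (1 → 4 and 4 → 1); vertices 2 and 3 are singletons
strongly_connected_components(g)
```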