TOPIC 2

Spectral and singular value decompositions

1 Background: review of eigenvalues; graph-theoretic definitions


Course: Math 535 - Mathematical Methods in Data Science (MMiDS)
Author: Sebastien Roch, Department of Mathematics, University of Wisconsin-Madison
Updated: Sep 28, 2020
Copyright: © 2020 Sebastien Roch


1.1 Spectral decomposition

We first review eigenvalues and eigenvectors.

1.1.1 Eigenvalues and eigenvectors

As before, we work on $\mathbb{R}^d$.

Definition (Eigenvalues and eigenvectors): Let $A \in \mathbb{R}^{d \times d}$ be a square matrix. Then $\lambda \in \mathbb{R}$ is an eigenvalue of $A$ if there exists a nonzero vector $\mathbf{x} \neq \mathbf{0}$ such that

$$ A \mathbf{x} = \lambda \mathbf{x}. $$

The vector $\mathbf{x}$ is referred to as an eigenvector. $\lhd$

In [2]:
A = [2.5 -0.5; -0.5 2.5]
Out[2]:
2×2 Array{Float64,2}:
  2.5  -0.5
 -0.5   2.5
In [3]:
using LinearAlgebra
eigen(A)
Out[3]:
Eigen{Float64,Float64,Array{Float64,2},Array{Float64,1}}
values:
2-element Array{Float64,1}:
 2.0
 3.0
vectors:
2×2 Array{Float64,2}:
 -0.707107  -0.707107
 -0.707107   0.707107
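As a quick sanity check, each returned pair should satisfy the defining equation $A \mathbf{x} = \lambda \mathbf{x}$; a minimal sketch, reusing the matrix $A$ above:

```julia
using LinearAlgebra

A = [2.5 -0.5; -0.5 2.5]
F = eigen(A)
# Each column of F.vectors pairs with the corresponding entry of F.values.
for (λ, x) in zip(F.values, eachcol(F.vectors))
    println(norm(A * x - λ * x) < 1e-12)  # true for each pair
end
```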

1.1.2 Spectral theorem

When $A$ is symmetric, that is, $a_{ij} = a_{ji}$ for all $i,j$, a remarkable result is that $A$ is similar to a diagonal matrix by an orthogonal transformation. Put differently, there exists an orthonormal basis of $\mathbb{R}^d$ made of eigenvectors of $A$. We will prove this result later in this topic.


Theorem (Spectral Theorem): Let $A \in \mathbb{R}^{d \times d}$ be a symmetric matrix, that is, $A^T = A$. Then $A$ has $d$ orthonormal eigenvectors $\mathbf{q}_1, \ldots, \mathbf{q}_d$ with corresponding (not necessarily distinct) real eigenvalues $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_d$. In matrix form, this is written as the matrix factorization

$$ A = Q \Lambda Q^T = \sum_{i=1}^d \lambda_i \mathbf{q}_i \mathbf{q}_i^T $$

where $Q$ has columns $\mathbf{q}_1, \ldots, \mathbf{q}_d$ and $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_d)$. We refer to this factorization as a spectral decomposition of $A$.
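The factorization can be checked numerically; a small sketch with the symmetric matrix from earlier (note that Julia's `eigen` returns eigenvalues in ascending rather than descending order):

```julia
using LinearAlgebra

A = [2.5 -0.5; -0.5 2.5]
F = eigen(Symmetric(A))
Q, Λ = F.vectors, Diagonal(F.values)
println(norm(A - Q * Λ * Q') < 1e-12)  # true: A = Q Λ Qᵀ
println(norm(Q' * Q - I) < 1e-12)      # true: the columns of Q are orthonormal
```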


1.1.3 Positive semidefinite matrices


Theorem (Characterization of Positive Semidefiniteness): Let $A \in \mathbb{R}^{d \times d}$ be a symmetric matrix and let $A = Q \Lambda Q^T$ be a spectral decomposition of $A$ with $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_d)$. Then $A \succeq 0$ if and only if its eigenvalues $\lambda_1, \ldots, \lambda_d$ are nonnegative.


Proof: Assume $A \succeq 0$. Let $\mathbf{q}_i$ be a unit-norm eigenvector of $A$ with corresponding eigenvalue $\lambda_i$. Then

$$ \langle \mathbf{q}_i, A \mathbf{q}_i \rangle = \langle \mathbf{q}_i, \lambda_i \mathbf{q}_i \rangle = \lambda_i \|\mathbf{q}_i\|^2 = \lambda_i, $$

which must be nonnegative by definition of a positive semidefinite matrix.

In the other direction, assume $\lambda_1, \ldots, \lambda_d \geq 0$. Note that the vector $Q^T \mathbf{x}$ has entries $\langle \mathbf{q}_i, \mathbf{x}\rangle$, $i=1,\ldots, d$. Thus

$$ \langle \mathbf{x}, A \mathbf{x} \rangle = \mathbf{x}^T Q \Lambda Q^T \mathbf{x} = \sum_{i=1}^d \lambda_i \langle \mathbf{q}_i, \mathbf{x}\rangle^2 $$

which is necessarily nonnegative. $\square$
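In practice, this characterization gives a simple numerical test for positive semidefiniteness; a sketch, again with the matrix $A$ from the earlier cells:

```julia
using LinearAlgebra

A = [2.5 -0.5; -0.5 2.5]
λ = eigvals(Symmetric(A))  # real eigenvalues, in ascending order
println(all(λ .>= 0))      # true: A is positive semidefinite
# Equivalently, ⟨x, A x⟩ ≥ 0 for any vector x:
x = randn(2)
println(x' * A * x >= 0)   # true
```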

1.2 Elements of graph theory

We will need some basic notions from graph theory.

1.2.1 Basic definitions

We start with undirected graphs.

Definition (Undirected Graph): An undirected graph (or graph for short) is a pair $G = (V,E)$ where $V$ is the set of vertices (or nodes) and

$$ E \subseteq \{\{u,v\}\,:\, u,v \in V\} $$

is the set of edges.$\lhd$

Example (Petersen Graph): The following example is called the Petersen graph. Here the circles are vertices (of which there are $10$) and the segments connecting them represent edges (of which there are $15$).

[Figure: the Petersen graph]

$\lhd$

Definition (Incidence and Adjacency): A vertex $v \in V$ is incident with an edge $e \in E$ if $v \in e$. The incident vertices of an edge are called its endvertices. Two vertices $u,v \in V$ are adjacent (or neighbors), which we denote by $u \sim v$, if $\{u,v\} \in E$. $\lhd$

Definition (Neighborhood and Degrees): The set of adjacent vertices of $v$, denoted by $N(v)$, is called the neighborhood of $v$ and its size, i.e., $\delta(v):=|N(v)|$, is the degree of $v$. $\lhd$
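These notions map directly onto functions of LightGraphs.jl (the package used in the cells below); a small sketch on the Petersen graph, which is $3$-regular:

```julia
using LightGraphs

g = smallgraph(:petersen)
println(neighbors(g, 1))  # N(1): the vertices adjacent to vertex 1
println(degree(g, 1))     # δ(1) = 3, since the Petersen graph is 3-regular
```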

Definition (Subgraph): A subgraph of $G = (V,E)$ is a graph $G' = (V',E')$ with $V' \subseteq V$ and $E' \subseteq E$. The subgraph $G'$ is said to be induced if

$$ E' = \{\{x,y\}\,:\, x,y \in V',\ \{x,y\}\in E\}, $$

that is, it contains all edges of $G$ between the vertices of $V'$. In that case the notation $G' := G[V']$ is used. $\lhd$

Definition (Paths): A path in $G$ is a sequence of (not necessarily distinct) vertices $x_0 \sim x_1 \sim \cdots \sim x_k$. The number of edges, $k$, is called the length of the path. If the endvertices $x_0$, $x_k$ coincide, that is, $x_0 = x_k$, we call the path a cycle. If the vertices are all distinct (except possibly for the endvertices), we say that the path (or cycle) is self-avoiding. $\lhd$

A self-avoiding path or cycle can be seen as a (not necessarily induced) subgraph of $G$. We write $u \leftrightarrow v$ if there is a path between $u$ and $v$. The relation $\leftrightarrow$ is in fact an equivalence relation. The equivalence classes, that is the maximal subsets $U \subseteq V$ such that $u \leftrightarrow v$ for all $u, v \in U$, are called connected components and form a partition of $V$. The length of the shortest self-avoiding path connecting two distinct vertices $u, v$ is called the graph distance between $u$ and $v$, denoted by $\rho(u,v)$.

Definition (Connectedness): A graph is connected if any two of its vertices are linked by a path, that is, if $u \leftrightarrow v$ for all $u, v \in V$. Or put differently, if there is only one connected component. $\lhd$
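Connected components can be computed directly; a sketch with LightGraphs.jl, building a $5$-vertex graph with two components:

```julia
using LightGraphs

g = SimpleGraph(5)
add_edge!(g, 1, 2); add_edge!(g, 2, 3); add_edge!(g, 4, 5)
println(connected_components(g))  # two components: {1, 2, 3} and {4, 5}
println(is_connected(g))          # false
```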

Example: The following graph has three connected components:

[Figure: a graph with three connected components]

$\lhd$

In [7]:
using LightGraphs, GraphPlot
g = smallgraph(:petersen)
Out[7]:
{10, 15} undirected simple Int64 graph
In [8]:
gplot(g)
Out[8]:
In [11]:
g = path_graph(10)
gplot(g)
Out[11]:
In [12]:
nv(g) # number of vertices
Out[12]:
10
In [13]:
ne(g) # number of edges
Out[13]:
9
In [21]:
for e in edges(g) # iterate over all edges of a graph
    @show src(e), dst(e) # prints source and destination vertices of the edge
end
(src(e), dst(e)) = (1, 2)
(src(e), dst(e)) = (2, 3)
(src(e), dst(e)) = (3, 4)
(src(e), dst(e)) = (4, 5)
(src(e), dst(e)) = (5, 6)
(src(e), dst(e)) = (6, 7)
(src(e), dst(e)) = (7, 8)
(src(e), dst(e)) = (8, 9)
(src(e), dst(e)) = (9, 10)
In [22]:
g = SimpleGraph(4)
Out[22]:
{4, 0} undirected simple Int64 graph
In [23]:
add_edge!(g, 1, 2)
add_edge!(g, 3, 4)
add_edge!(g, 1, 4)
add_edge!(g, 4, 1); # {4,1} is the same edge as {1,4}; ignored in a simple undirected graph
In [24]:
label = 1:nv(g)
gplot(g, nodelabel=label)
Out[24]:
[plot of g with vertices labeled 1 to 4]