Course: Math 535 - Mathematical Methods in Data Science (MMiDS)
Author: Sebastien Roch, Department of Mathematics, University of Wisconsin-Madison
Updated: Sep 28, 2020
Copyright: © 2020 Sebastien Roch
We first review eigenvalues and eigenvectors.
As before, we work on $\mathbb{R}^d$.
Definition (Eigenvalues and eigenvectors): Let $A \in \mathbb{R}^{d \times d}$ be a square matrix. Then $\lambda \in \mathbb{R}$ is an eigenvalue of $A$ if there exists a vector $\mathbf{x} \neq \mathbf{0}$ such that
$$ A \mathbf{x} = \lambda \mathbf{x}. $$The vector $\mathbf{x}$ is referred to as an eigenvector. $\lhd$
using LinearAlgebra # provides eigen, used below

A = [2.5 -0.5; -0.5 2.5]
2×2 Array{Float64,2}:
  2.5  -0.5
 -0.5   2.5
eigen(A)
Eigen{Float64,Float64,Array{Float64,2},Array{Float64,1}}
values:
2-element Array{Float64,1}:
 2.0
 3.0
vectors:
2×2 Array{Float64,2}:
 -0.707107  -0.707107
 -0.707107   0.707107
When $A$ is symmetric, that is, $a_{ij} = a_{ji}$ for all $i,j$, a remarkable result holds: $A$ is similar to a diagonal matrix via an orthogonal transformation. Put differently, there exists an orthonormal basis of $\mathbb{R}^d$ consisting of eigenvectors of $A$. We will prove this result later in this topic.
Theorem (Spectral Theorem): Let $A \in \mathbb{R}^{d \times d}$ be a symmetric matrix, that is, $A^T = A$. Then $A$ has $d$ orthonormal eigenvectors $\mathbf{q}_1, \ldots, \mathbf{q}_d$ with corresponding (not necessarily distinct) real eigenvalues $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_d$. In matrix form, this is written as the matrix factorization
$$ A = Q \Lambda Q^T = \sum_{i=1}^d \lambda_i \mathbf{q}_i \mathbf{q}_i^T $$where $Q$ has columns $\mathbf{q}_1, \ldots, \mathbf{q}_d$ and $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_d)$. We refer to this factorization as a spectral decomposition of $A$.
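As a quick numerical check, we can verify this factorization on the matrix $A$ from the example above. The following sketch assumes the LinearAlgebra standard library: eigen returns the eigenvalues and an orthonormal set of eigenvectors, from which $A$ is reconstructed.
using LinearAlgebra

A = [2.5 -0.5; -0.5 2.5]
F = eigen(A) # F.values holds the eigenvalues, F.vectors their eigenvectors (as columns)
Q, Λ = F.vectors, Diagonal(F.values)
Q * Λ * Q' ≈ A # true: A = Q Λ Q^T
Q' * Q ≈ Matrix(1.0I, 2, 2) # true: the columns of Q are orthonormal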
Theorem (Characterization of Positive Semidefiniteness): Let $A \in \mathbb{R}^{d \times d}$ be a symmetric matrix and let $A = Q \Lambda Q^T$ be a spectral decomposition of $A$ with $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_d)$. Then $A \succeq 0$ if and only if its eigenvalues $\lambda_1, \ldots, \lambda_d$ are nonnegative.
Proof: Assume $A \succeq 0$. Let $\mathbf{q}_i$ be an eigenvector of $A$ with corresponding eigenvalue $\lambda_i$, normalized so that $\|\mathbf{q}_i\| = 1$. Then
$$ \langle \mathbf{q}_i, A \mathbf{q}_i \rangle = \langle \mathbf{q}_i, \lambda_i \mathbf{q}_i \rangle = \lambda_i \|\mathbf{q}_i\|^2 = \lambda_i $$which must be nonnegative by definition of a positive semidefinite matrix.
In the other direction, assume $\lambda_1, \ldots, \lambda_d \geq 0$. Note that the vector $Q^T \mathbf{x}$ has entries $\langle \mathbf{q}_i, \mathbf{x}\rangle$, $i=1,\ldots, d$. Thus
$$ \langle \mathbf{x}, A \mathbf{x} \rangle = \mathbf{x}^T Q \Lambda Q^T \mathbf{x} = \sum_{i=1}^d \lambda_i \langle \mathbf{q}_i, \mathbf{x}\rangle^2 $$which is necessarily nonnegative since each $\lambda_i \geq 0$. $\square$
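This characterization yields a simple numerical test for positive semidefiniteness: compute the eigenvalues and check that none is negative. Below is a minimal sketch; the helper is_psd is ours (not a library function) and uses a small tolerance to absorb floating-point round-off.
using LinearAlgebra

# test whether a symmetric matrix is positive semidefinite via its eigenvalues
is_psd(A; tol=1e-12) = all(eigvals(Symmetric(A)) .>= -tol)

is_psd([2.5 -0.5; -0.5 2.5]) # true: eigenvalues 2.0 and 3.0
is_psd([1.0 2.0; 2.0 1.0]) # false: eigenvalues -1.0 and 3.0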
We will need some basic notions from graph theory.
We start with undirected graphs.
Definition (Undirected Graph): An undirected graph (or graph for short) is a pair $G = (V,E)$ where $V$ is the set of vertices (or nodes) and
$$ E \subseteq \{\{u,v\}\,:\, u,v \in V\} $$is the set of edges. $\lhd$
Example (Petersen Graph): The following example is called the Petersen graph. Here the circles are vertices (of which there are $10$) and the segments connecting them represent edges (of which there are $15$).
Definition (Incidence and Adjacency): A vertex $v \in V$ is incident with an edge $e \in E$ if $v \in e$. The incident vertices of an edge are called its endvertices. Two vertices $u,v \in V$ are adjacent (or neighbors), which we denote by $u \sim v$, if $\{u,v\} \in E$. $\lhd$
Definition (Neighborhood and Degrees): The set of adjacent vertices of $v$, denoted by $N(v)$, is called the neighborhood of $v$ and its size, i.e., $\delta(v):=|N(v)|$, is the degree of $v$. $\lhd$
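Adjacency, neighborhoods, and degrees can be queried directly in code; the following sketch assumes the LightGraphs package (which provides the graph functions used in the cells below) and uses the Petersen graph from the example above.
using LightGraphs

g = smallgraph(:petersen)
has_edge(g, 1, 2) # true if 1 ~ 2, that is, if {1,2} is an edge
neighbors(g, 1) # the neighborhood N(1), returned as a vector of vertices
degree(g, 1) # δ(1) = |N(1)|; every vertex of the Petersen graph has degree 3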
Definition (Subgraph): A subgraph of $G = (V,E)$ is a graph $G' = (V',E')$ with $V' \subseteq V$ and $E' \subseteq E$. The subgraph $G'$ is said to be induced if
$$ E' = \{\{x,y\}\,:\, x,y \in V',\ \{x,y\}\in E\}, $$that is, it contains all edges of $G$ between the vertices of $V'$. In that case the notation $G' := G[V']$ is used. $\lhd$
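Induced subgraphs can be extracted with LightGraphs' induced_subgraph, which returns $G[V']$ together with a map from its vertex labels back to those of $G$; the vertex subset below is only for illustration.
using LightGraphs

g = smallgraph(:petersen)
h, vmap = induced_subgraph(g, [1, 2, 3, 4, 5]) # the induced subgraph G[V'] for V' = {1,...,5}
nv(h), ne(h) # number of vertices and edges of G[V']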
Definition (Paths): A path in $G$ is a sequence of (not necessarily distinct) vertices $x_0 \sim x_1 \sim \cdots \sim x_k$. The number of edges, $k$, is called the length of the path. If the endvertices $x_0$, $x_k$ coincide, that is, $x_0 = x_k$, we call the path a cycle. If the vertices are all distinct (except possibly for the endvertices), we say that the path (or cycle) is self-avoiding. $\lhd$
A self-avoiding path or cycle can be seen as a (not necessarily induced) subgraph of $G$. We write $u \leftrightarrow v$ if there is a path between $u$ and $v$. The relation $\leftrightarrow$ is in fact an equivalence relation. The equivalence classes, that is, the maximal subsets $U \subseteq V$ such that $u \leftrightarrow v$ for all $u, v \in U$, are called connected components and form a partition of $V$. The length of the shortest self-avoiding path connecting two distinct vertices $u, v$ is called the graph distance between $u$ and $v$, denoted by $\rho(u,v)$.
Definition (Connectedness): A graph is connected if any two of its vertices are linked by a path, that is, if $u \leftrightarrow v$ for all $u, v \in V$. Or put differently, if there is only one connected component. $\lhd$
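These notions can also be checked computationally. The sketch below, again assuming LightGraphs, tests connectivity, lists the connected components, and computes graph distances on a path graph with $10$ vertices (the same graph is plotted further below).
using LightGraphs

g = path_graph(10)
is_connected(g) # true: a path graph has a single connected component
connected_components(g) # the connected components, as vectors of vertices; here just one
gdistances(g, 1) # BFS distances from vertex 1; the entry for v is the graph distance ρ(1, v), here 0, 1, ..., 9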
using LightGraphs, GraphPlot # assumed packages providing the graph functions and gplot used here

g = smallgraph(:petersen)
{10, 15} undirected simple Int64 graph
gplot(g)
g = path_graph(10)
gplot(g)
nv(g) # number of vertices
10
ne(g) # number of edges
9
for e in edges(g) # iterate over all edges of a graph
@show src(e), dst(e) # prints source and destination vertices of the edge
end
(src(e), dst(e)) = (1, 2)
(src(e), dst(e)) = (2, 3)
(src(e), dst(e)) = (3, 4)
(src(e), dst(e)) = (4, 5)
(src(e), dst(e)) = (5, 6)
(src(e), dst(e)) = (6, 7)
(src(e), dst(e)) = (7, 8)
(src(e), dst(e)) = (8, 9)
(src(e), dst(e)) = (9, 10)
g = SimpleGraph(4)
{4, 0} undirected simple Int64 graph
add_edge!(g, 1, 2)
add_edge!(g, 3, 4)
add_edge!(g, 1, 4)
add_edge!(g, 4, 1); # {4,1} is the same edge as {1,4} in an undirected graph, so nothing new is added
label = 1:nv(g)
gplot(g, nodelabel=label)