Eigenvalues and Eigenvectors

Basic definition and theorems

Definition. Let $V$ be a vector space and $A:V\to V$ a linear transformation. If $v\in V$ and $\lambda\in\F$ satisfy

$$Av=\lambda v, \quad v\neq 0,$$

then $v$ is called an eigenvector of $A$ and $\lambda$ is called an eigenvalue of $A$.

Theorem. If $A:V\to V$ is a linear transformation, then

  1. $v\in V$ is an eigenvector of $A$ with eigenvalue $\lambda$ if and only if $v\neq 0$ and $v\in N(A-\lambda I)$.
  2. $\lambda\in\F$ is an eigenvalue of $A$ if and only if $N(A-\lambda I)\neq\{0\}$.
  3. $\lambda\in\F$ is an eigenvalue of $A$ if and only if $\det(A-\lambda I)=0$.

This theorem gives a method for finding all eigenvalues and eigenvectors: first solve $\det(A-\lambda I)=0$ to find the eigenvalues $\lambda$, and then, for each eigenvalue, find the nonzero vectors in $N(A-\lambda I)$; these are the corresponding eigenvectors.
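As a rough numerical illustration of this two-step procedure, here is a short sketch using numpy (the matrix below is an invented example, not one discussed in these notes; `numpy.linalg.eig` carries out both steps at once):

```python
import numpy as np

# Hypothetical example matrix (not from the notes).
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# Step 1: the eigenvalues are the roots of det(A - lambda*I) = 0.
# Step 2: for each eigenvalue lambda, the eigenvectors are the
#         nonzero vectors in N(A - lambda*I).
# numpy.linalg.eig returns both at once: the eigenvalues, and a
# matrix whose columns are corresponding eigenvectors.
lams, V = np.linalg.eig(A)

for lam, v in zip(lams, V.T):
    # Check the defining equation A v = lambda v.
    assert np.allclose(A @ v, lam * v)
    print(f"lambda = {lam:.4f}, eigenvector = {v}")
```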

Independence of Eigenvectors Theorem. If $v_1, \dots, v_m\in V$ are eigenvectors of $A:V\to V$ with eigenvalues $\lambda_1, \dots, \lambda_m$, and if the eigenvalues are distinct (i.e. $\lambda_i\neq \lambda_j$ for all $i\neq j$), then $\{v_1, \dots, v_m\}$ is linearly independent.

Proof. Suppose $\{v_1, \dots, v_m\}$ is dependent, i.e. there are numbers $c_1, \dots, c_m\in\F$ for which $c_1v_1+\cdots +c_mv_m=0$, while not all $c_i$ vanish. After renumbering the terms we may assume that $c_1\neq 0$.

Consider the linear transformation

$$B = (A-\lambda_2 I)(A-\lambda_3 I)\cdots(A-\lambda_m I).$$

Then $Bv_i=0$ for all $i=2, 3, \dots, m$, while

$$Bv_1 = (\lambda_1-\lambda_2)(\lambda_1-\lambda_3)\cdots(\lambda_1-\lambda_m)\, v_1 \neq 0.$$

Applying $B$ to both sides of the equation $c_1v_1+\cdots +c_mv_m=0$, we get

$$\begin{aligned}
0 &= B\left(c_1v_1+c_2v_2+\cdots +c_mv_m\right)\\
  &= c_1Bv_1+c_2Bv_2+\cdots +c_mBv_m\\
  &= c_1(\lambda_1-\lambda_2)(\lambda_1-\lambda_3)\cdots(\lambda_1-\lambda_m)\, v_1.
\end{aligned}$$

Since $v_1\neq 0$, and since $\lambda_1-\lambda_i\neq 0$ for all $i\geq 2$, it follows that $c_1=0$. This contradicts our assumption that $c_1\neq 0$, so $\{v_1, \dots, v_m\}$ cannot be dependent.

The use of eigenvalues and eigenvectors

If a vector $v\in V$ is known as a linear combination of eigenvectors of $A$, then it is easy to compute $Av$, $A^2v$, etc. If

$$v=a_1v_1+\cdots+a_kv_k,$$

where $Av_i=\lambda_iv_i$, then

$$\begin{gathered}
Av = \lambda_1a_1v_1 + \cdots + \lambda_k a_kv_k\\
A^2v = \lambda_1^2a_1v_1 + \cdots + \lambda_k^2 a_kv_k\\
\vdots\\
A^\ell v = \lambda_1^\ell a_1v_1 + \cdots + \lambda_k^\ell a_kv_k
\end{gathered}$$

for any $\ell\in\N$. If none of the eigenvalues $\lambda_i$ vanishes, then one solution of $Aw=v$ is given by

$$w = \frac{a_1}{\lambda_1}v_1 + \cdots + \frac{a_k}{\lambda_k}v_k.$$

For this to be useful we need to find as many eigenvectors as we can.
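A small numpy sketch of these formulas, with an invented matrix and vector (the coefficients $a_i$ are found by solving a linear system, since the eigenvectors form a basis in this example):

```python
import numpy as np

# Illustrative symmetric matrix, so it has a basis of eigenvectors,
# and an arbitrary vector v to expand in that basis.
A = np.array([[3.0, 1.0],
              [1.0, 3.0]])
v = np.array([1.0, 2.0])

lams, V = np.linalg.eig(A)   # columns of V are eigenvectors v_1, ..., v_k
a = np.linalg.solve(V, v)    # coefficients with v = a_1 v_1 + ... + a_k v_k

# A^l v = lambda_1^l a_1 v_1 + ... + lambda_k^l a_k v_k
ell = 5
assert np.allclose(V @ (lams**ell * a),
                   np.linalg.matrix_power(A, ell) @ v)

# If no eigenvalue vanishes, w = (a_1/lambda_1) v_1 + ... + (a_k/lambda_k) v_k
# solves A w = v.
w = V @ (a / lams)
assert np.allclose(A @ w, v)
```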

Finding eigenvalues/vectors for $A:\F^n\to\F^n$

The characteristic polynomial of $A:\F^n\to \F^n$ is, by definition,

$$\det(A-\lambda I) = \begin{vmatrix}
a_{11}-\lambda & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22}-\lambda & \cdots & a_{2n} \\
\vdots & & \ddots & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn}-\lambda
\end{vmatrix}$$

After expanding, this turns out to be a polynomial in $\lambda$ of degree $n$:

$$\det(A-\lambda I) = (-\lambda)^n + c_1 (-\lambda)^{n-1} + c_2 (-\lambda)^{n-2} + \cdots + c_{n-1}(-\lambda) + c_n.$$

Here

$$c_1 = a_{11}+a_{22}+\cdots +a_{nn}$$

is called the trace of the matrix $A$, and

$$c_n = \det A$$

is its determinant.
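These two coefficients are easy to check numerically. The sketch below uses `numpy.poly`, which returns the coefficients of the monic polynomial $\det(\lambda I - A)$; comparing with the expansion above gives the sign conversions noted in the comments (the matrix is an invented example):

```python
import numpy as np

# Hypothetical 3x3 example matrix.
A = np.array([[2.0, 1.0, 0.0],
              [0.0, 3.0, 1.0],
              [1.0, 0.0, 4.0]])
n = A.shape[0]

# np.poly(A) gives det(lambda*I - A) = lambda^n + p_1 lambda^{n-1} + ... + p_n.
# Since det(A - lambda*I) = (-1)^n det(lambda*I - A), comparing coefficients
# with the expansion above yields p_1 = -c_1 and p_n = (-1)^n c_n.
p = np.poly(A)

c1 = -p[1]              # should equal the trace of A
cn = (-1)**n * p[-1]    # should equal det(A)

assert np.isclose(c1, np.trace(A))
assert np.isclose(cn, np.linalg.det(A))
```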

Examples

Some $2\times 2$ matrix: $A = \begin{pmatrix} 2 & -2 \\ 2 & 7 \end{pmatrix}$
The Fibonacci matrix: $F = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}$
Rotation by $90^\circ$: $R = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$
Reflection: $S = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$
Projection: $P = \begin{pmatrix} 1/2 & 1/2 \\ 1/2 & 1/2 \end{pmatrix}$
3D rotation: $R = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{pmatrix}$
Shear: $Z = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$
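The eigenvalues of several of these matrices can be checked numerically; here is a minimal sketch for the first three, where the rotation matrix illustrates that the eigenvalues of a real matrix need not be real:

```python
import numpy as np

examples = {
    "A (2x2 example)":    np.array([[2, -2], [2, 7]]),
    "F (Fibonacci)":      np.array([[1, 1], [1, 0]]),
    "R (rotation by 90)": np.array([[0, -1], [1, 0]]),
}

for name, M in examples.items():
    # np.linalg.eigvals returns the roots of det(M - lambda*I) = 0;
    # for the rotation these are the complex numbers +i and -i.
    print(name, np.linalg.eigvals(M))
```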

Diagonalization

Since the characteristic polynomial $\det(A-\lambda I)$ is a polynomial of degree $n$, it has at most $n$ zeroes. If it has $n$ distinct zeroes $\lambda_1, \dots, \lambda_n$, then we choose an eigenvector $v_i$ for each eigenvalue $\lambda_i$. By the Independence of Eigenvectors Theorem the vectors $v_1, \dots, v_n$ are linearly independent in $\F^n$, and therefore they form a basis for $\F^n$.

Diagonalization Theorem (version 1). If a linear transformation $A:V\to V$ has a basis $v_1, \dots, v_n\in V$ of eigenvectors, with corresponding eigenvalues $\lambda_1$, … , $\lambda_n$, then the matrix of $A$ with respect to this basis is

$$[A]_{v_1, \dots, v_n}= \begin{pmatrix}
\lambda_1 & 0 & 0 & \dots & 0\\
0 & \lambda_2 & 0 & \dots & 0\\
0 & 0 & \lambda_3 & \dots & 0\\
 & & & \ddots & \\
0 & 0 & 0 & \dots & \lambda_n
\end{pmatrix}$$

This follows directly from the fact that $Av_i=\lambda_i v_i$ for $i=1,2,\dots, n$.

Diagonalization Theorem (version 2). If an $n\times n$ matrix $A$ has a basis $v_1, \dots, v_n\in\F^n$ of eigenvectors, with corresponding eigenvalues $\lambda_1$, … , $\lambda_n$, then we have

$$S^{-1}AS = D,$$

where

$$S=\bigl[\, v_1 \;\; v_2 \;\; \cdots \;\; v_n \,\bigr]$$

is the matrix whose columns are the eigenvectors, and $D$ is the diagonal matrix

$$D = \begin{pmatrix}
\lambda_1 & 0 & 0 & \dots & 0\\
0 & \lambda_2 & 0 & \dots & 0\\
0 & 0 & \lambda_3 & \dots & 0\\
 & & & \ddots & \\
0 & 0 & 0 & \dots & \lambda_n
\end{pmatrix}$$
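A numerical sketch of version 2, using the $2\times 2$ matrix $A$ from the examples above (its eigenvalues, 3 and 6, are distinct, so its eigenvectors form a basis):

```python
import numpy as np

# The 2x2 example matrix from above; its eigenvalues 3 and 6 are distinct.
A = np.array([[2.0, -2.0],
              [2.0,  7.0]])

lams, S = np.linalg.eig(A)   # columns of S are the eigenvectors v_1, ..., v_n
D = np.diag(lams)            # diagonal matrix of the eigenvalues

# Version 2 of the Diagonalization Theorem: S^{-1} A S = D.
assert np.allclose(np.linalg.inv(S) @ A @ S, D)

# Equivalently A = S D S^{-1}, which makes powers of A easy to compute.
assert np.allclose(np.linalg.matrix_power(A, 4),
                   S @ np.diag(lams**4) @ np.linalg.inv(S))
```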

Similarity of linear transformations

Definition. Two linear transformations $A:V\to V$ and $B:W\to W$ are called similar if there is an invertible linear transformation $S:V\to W$ such that

$$SA=BS, \text{ or } A=S^{-1}BS, \text{ or } SAS^{-1} = B.$$

The three conditions are equivalent, as one sees by multiplying on the left or right by $S$ or $S^{-1}$.
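A quick numerical illustration of this equivalence, with invented matrices: pick any $A$ and any invertible $S$, define $B=SAS^{-1}$, and the other two conditions follow.

```python
import numpy as np

# Arbitrary illustrative choices: A any square matrix, S any invertible one.
A = np.array([[1.0, 2.0],
              [0.0, 3.0]])
S = np.array([[1.0, 1.0],
              [0.0, 1.0]])

B = S @ A @ np.linalg.inv(S)   # S A S^{-1} = B by construction

# The other two formulations of similarity then hold as well.
assert np.allclose(S @ A, B @ S)
assert np.allclose(A, np.linalg.inv(S) @ B @ S)
```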

Solving cubic equations