Eigenvalues and Eigenvectors

Basic definition and theorems

Definition. Let $V$ be a vector space and $A:V\to V$ a linear transformation. If $v\in V$ and $\lambda\in\F$ satisfy

$$Av=\lambda v, \quad v\neq 0,$$

then $v$ is called an eigenvector of $A$ and $\lambda$ is called an eigenvalue of $A$.

Theorem. If $A:V\to V$ is a linear transformation, then

  1. $v\in V$ is an eigenvector of $A$ with eigenvalue $\lambda$ if and only if $v\neq 0$ and $v\in N(A-\lambda I)$.
  2. $\lambda\in\F$ is an eigenvalue of $A$ if and only if $N(A-\lambda I)\neq\{0\}$.
  3. $\lambda\in\F$ is an eigenvalue of $A$ if and only if $\det(A-\lambda I)=0$.

This theorem gives a method for finding all eigenvalues and eigenvectors: first solve $\det(A-\lambda I)=0$ to find the eigenvalues $\lambda$, and then, for each eigenvalue, find the nonzero vectors in $N(A-\lambda I)$; these are the corresponding eigenvectors.
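As a rough numerical illustration of this two-step procedure, here is a short sketch using numpy (the matrix below is an invented example, not one discussed in these notes; `numpy.linalg.eig` carries out both steps at once):

```python
import numpy as np

# Hypothetical example matrix (not from the notes).
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# Step 1: the eigenvalues are the roots of det(A - lambda*I) = 0.
# Step 2: for each eigenvalue lambda, the eigenvectors are the
#         nonzero vectors in N(A - lambda*I).
# numpy.linalg.eig returns both at once: the eigenvalues, and a
# matrix whose columns are corresponding eigenvectors.
lams, V = np.linalg.eig(A)

for lam, v in zip(lams, V.T):
    # Check the defining equation A v = lambda v.
    assert np.allclose(A @ v, lam * v)
    print(f"lambda = {lam:.4f}, eigenvector = {v}")
```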

Independence of Eigenvectors Theorem. If $v_1, \dots, v_m\in V$ are eigenvectors of $A:V\to V$ with eigenvalues $\lambda_1, \dots, \lambda_m$, and if the eigenvalues are distinct (i.e. $\lambda_i\neq \lambda_j$ for all $i\neq j$), then $\{v_1, \dots, v_m\}$ is linearly independent.

Proof. Suppose $\{v_1, \dots, v_m\}$ is dependent, i.e. there are numbers $c_1, \dots, c_m\in\F$ for which $c_1v_1+\cdots +c_mv_m=0$, while not all $c_i$ vanish. After renumbering the terms we may assume that $c_1\neq 0$.

Consider the linear transformation

$$B = (A-\lambda_2 I)(A-\lambda_3 I)\cdots(A-\lambda_m I).$$

Then $Bv_i=0$ for all $i=2, 3, \dots, m$, while

$$Bv_1 = (\lambda_1-\lambda_2)(\lambda_1-\lambda_3)\cdots(\lambda_1-\lambda_m)\, v_1 \neq 0.$$

Applying $B$ to both sides of the equation $c_1v_1+\cdots +c_mv_m=0$, we get

$$\begin{aligned}
0 &= B\left(c_1v_1+c_2v_2+\cdots +c_mv_m\right)\\
  &= c_1Bv_1+c_2Bv_2+\cdots +c_mBv_m\\
  &= c_1(\lambda_1-\lambda_2)(\lambda_1-\lambda_3)\cdots(\lambda_1-\lambda_m)\, v_1.
\end{aligned}$$

Since $v_1\neq 0$, and since $\lambda_1-\lambda_i\neq 0$ for all $i\geq 2$, it follows that $c_1=0$. This contradicts our assumption that $c_1\neq 0$, so $\{v_1, \dots, v_m\}$ cannot be dependent.

The use of eigenvalues and eigenvectors

If a vector $v\in V$ is known as a linear combination of eigenvectors of $A$, then it is easy to compute $Av$, $A^2v$, etc. If

$$v=a_1v_1+\cdots+a_kv_k,$$

where $Av_i=\lambda_iv_i$, then

$$\begin{gathered}
Av = \lambda_1a_1v_1 + \cdots + \lambda_k a_kv_k\\
A^2v = \lambda_1^2a_1v_1 + \cdots + \lambda_k^2 a_kv_k\\
\vdots\\
A^\ell v = \lambda_1^\ell a_1v_1 + \cdots + \lambda_k^\ell a_kv_k
\end{gathered}$$

for any $\ell\in\N$. If none of the eigenvalues $\lambda_i$ vanishes, then one solution of $Aw=v$ is given by

$$w = \frac{a_1}{\lambda_1}v_1 + \cdots + \frac{a_k}{\lambda_k}v_k.$$

For this to be useful we need to find as many eigenvectors as we can.
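A small numpy sketch of these formulas, with an invented matrix and vector (the coefficients $a_i$ are found by solving a linear system, since the eigenvectors form a basis in this example):

```python
import numpy as np

# Illustrative symmetric matrix, so it has a basis of eigenvectors,
# and an arbitrary vector v to expand in that basis.
A = np.array([[3.0, 1.0],
              [1.0, 3.0]])
v = np.array([1.0, 2.0])

lams, V = np.linalg.eig(A)   # columns of V are eigenvectors v_1, ..., v_k
a = np.linalg.solve(V, v)    # coefficients with v = a_1 v_1 + ... + a_k v_k

# A^l v = lambda_1^l a_1 v_1 + ... + lambda_k^l a_k v_k
ell = 5
assert np.allclose(V @ (lams**ell * a),
                   np.linalg.matrix_power(A, ell) @ v)

# If no eigenvalue vanishes, w = (a_1/lambda_1) v_1 + ... + (a_k/lambda_k) v_k
# solves A w = v.
w = V @ (a / lams)
assert np.allclose(A @ w, v)
```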

Finding eigenvalues/vectors for $A:\F^n\to\F^n$

The characteristic polynomial of $A:\F^n\to \F^n$ is, by definition,

$$\det(A-\lambda I) = \begin{vmatrix}
a_{11}-\lambda & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22}-\lambda & \cdots & a_{2n} \\
\vdots & & \ddots & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn}-\lambda
\end{vmatrix}$$

After expanding, this turns out to be a polynomial in $\lambda$ of degree $n$:

$$\det(A-\lambda I) = (-\lambda)^n + c_1 (-\lambda)^{n-1} + c_2 (-\lambda)^{n-2} + \cdots + c_{n-1}(-\lambda) + c_n.$$

Here

$$c_1 = a_{11}+a_{22}+\cdots +a_{nn}$$

is called the trace of the matrix $A$, and

$$c_n = \det A$$

is its determinant.
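These two coefficients are easy to check numerically. The sketch below uses `numpy.poly`, which returns the coefficients of the monic polynomial $\det(\lambda I - A)$; comparing with the expansion above gives the sign conversions noted in the comments (the matrix is an invented example):

```python
import numpy as np

# Hypothetical 3x3 example matrix.
A = np.array([[2.0, 1.0, 0.0],
              [0.0, 3.0, 1.0],
              [1.0, 0.0, 4.0]])
n = A.shape[0]

# np.poly(A) gives det(lambda*I - A) = lambda^n + p_1 lambda^{n-1} + ... + p_n.
# Since det(A - lambda*I) = (-1)^n det(lambda*I - A), comparing coefficients
# with the expansion above yields p_1 = -c_1 and p_n = (-1)^n c_n.
p = np.poly(A)

c1 = -p[1]              # should equal the trace of A
cn = (-1)**n * p[-1]    # should equal det(A)

assert np.isclose(c1, np.trace(A))
assert np.isclose(cn, np.linalg.det(A))
```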

Examples

Some $2\times 2$ matrix: $A = \begin{pmatrix} 2 & -2 \\ 2 & 7 \end{pmatrix}$
The Fibonacci matrix: $F = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}$
Rotation by $90^\circ$: $R = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$
Reflection: $S = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$
Projection: $P = \begin{pmatrix} 1/2 & 1/2 \\ 1/2 & 1/2 \end{pmatrix}$
3D rotation: $R = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{pmatrix}$
Shear: $Z = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$
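The eigenvalues of several of these matrices can be checked numerically; here is a minimal sketch for the first three, where the rotation matrix illustrates that the eigenvalues of a real matrix need not be real:

```python
import numpy as np

examples = {
    "A (2x2 example)":    np.array([[2, -2], [2, 7]]),
    "F (Fibonacci)":      np.array([[1, 1], [1, 0]]),
    "R (rotation by 90)": np.array([[0, -1], [1, 0]]),
}

for name, M in examples.items():
    # np.linalg.eigvals returns the roots of det(M - lambda*I) = 0;
    # for the rotation these are the complex numbers +i and -i.
    print(name, np.linalg.eigvals(M))
```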

Diagonalization

Since the characteristic polynomial $\det(A-\lambda I)$ is a polynomial of degree $n$, it has at most $n$ zeroes. If it has $n$ distinct zeroes $\lambda_1, \dots, \lambda_n$, then we choose an eigenvector $v_i$ for each eigenvalue $\lambda_i$. By the Independence of Eigenvectors Theorem the vectors $v_1, \dots, v_n$ are linearly independent in $\F^n$, and therefore they form a basis for $\F^n$.

Diagonalization Theorem (version 1). If a linear transformation $A:V\to V$ has a basis $v_1, \dots, v_n\in V$ of eigenvectors, with corresponding eigenvalues $\lambda_1$, … , $\lambda_n$, then the matrix of $A$ with respect to this basis is

$$[A]_{v_1, \dots, v_n}= \begin{pmatrix}
\lambda_1 & 0 & 0 & \dots & 0\\
0 & \lambda_2 & 0 & \dots & 0\\
0 & 0 & \lambda_3 & \dots & 0\\
 & & & \ddots & \\
0 & 0 & 0 & \dots & \lambda_n
\end{pmatrix}$$

This follows directly from the fact that $Av_i=\lambda_i v_i$ for $i=1,2,\dots, n$.

Diagonalization Theorem (version 2). If an $n\times n$ matrix $A$ has a basis $v_1, \dots, v_n\in\F^n$ of eigenvectors, with corresponding eigenvalues $\lambda_1$, … , $\lambda_n$, then we have

$$S^{-1}AS = D,$$

where

$$S=\bigl[\, v_1 \;\; v_2 \;\; \cdots \;\; v_n \,\bigr]$$

is the matrix whose columns are the eigenvectors, and $D$ is the diagonal matrix

$$D = \begin{pmatrix}
\lambda_1 & 0 & 0 & \dots & 0\\
0 & \lambda_2 & 0 & \dots & 0\\
0 & 0 & \lambda_3 & \dots & 0\\
 & & & \ddots & \\
0 & 0 & 0 & \dots & \lambda_n
\end{pmatrix}$$
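A numerical sketch of version 2, using the $2\times 2$ matrix $A$ from the examples above (its eigenvalues, 3 and 6, are distinct, so its eigenvectors form a basis):

```python
import numpy as np

# The 2x2 example matrix from above; its eigenvalues 3 and 6 are distinct.
A = np.array([[2.0, -2.0],
              [2.0,  7.0]])

lams, S = np.linalg.eig(A)   # columns of S are the eigenvectors v_1, ..., v_n
D = np.diag(lams)            # diagonal matrix of the eigenvalues

# Version 2 of the Diagonalization Theorem: S^{-1} A S = D.
assert np.allclose(np.linalg.inv(S) @ A @ S, D)

# Equivalently A = S D S^{-1}, which makes powers of A easy to compute.
assert np.allclose(np.linalg.matrix_power(A, 4),
                   S @ np.diag(lams**4) @ np.linalg.inv(S))
```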

Similarity of linear transformations

Definition. Two linear transformations $A:V\to V$ and $B:W\to W$ are called similar if there is an invertible linear transformation $S:V\to W$ such that

$$SA=BS, \text{ or } A=S^{-1}BS, \text{ or } SAS^{-1} = B.$$

The three conditions are equivalent, as one sees by multiplying on the left or right by $S$ or $S^{-1}$.
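A quick numerical illustration of this equivalence, with invented matrices: pick any $A$ and any invertible $S$, define $B=SAS^{-1}$, and the other two conditions follow.

```python
import numpy as np

# Arbitrary illustrative choices: A any square matrix, S any invertible one.
A = np.array([[1.0, 2.0],
              [0.0, 3.0]])
S = np.array([[1.0, 1.0],
              [0.0, 1.0]])

B = S @ A @ np.linalg.inv(S)   # S A S^{-1} = B by construction

# The other two formulations of similarity then hold as well.
assert np.allclose(S @ A, B @ S)
assert np.allclose(A, np.linalg.inv(S) @ B @ S)
```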

Solving cubic equations