In the theory of inner product spaces we assume that the number field F is
either the real numbers or the complex numbers. The theories for real and
complex inner products are very similar.
In this chapter we always assume
F=R or F=C
Inner Products
Definition. A real inner product on a real vector space V is a real-valued function on V×V, usually written (x,y) or ⟨x,y⟩, that satisfies the following properties:
⟨x,y⟩=⟨y,x⟩
⟨ax,y⟩=a⟨x,y⟩
⟨x+y,z⟩=⟨x,z⟩+⟨y,z⟩
⟨x,x⟩>0 if x≠0
for all x,y,z∈V and a∈R.
Definition. A complex inner product on a complex vector space V is a complex-valued function on V×V, usually written (x,y) or ⟨x,y⟩, that satisfies the following properties:
⟨x,y⟩ = \overline{⟨y,x⟩}, where z̄ is the complex conjugate of z∈C
⟨ax,y⟩=a⟨x,y⟩
⟨x+y,z⟩=⟨x,z⟩+⟨y,z⟩
⟨x,x⟩>0 if x≠0
for all x,y,z∈V and a∈C.
Examples
Rn and the dot product
On Rn we have the dot-product from vector calculus, i.e.
⟨x,y⟩ = x⋅y := x1y1 + ⋯ + xnyn
for any two vectors x=(x1,…,xn), y=(y1,…,yn).
A variation on this example is the weighted dot product given by
⟨x,y⟩_w := w1x1y1 + ⋯ + wnxnyn
for all x,y∈Rn, and where w1,…,wn>0 are given constants, called the weights in the inner product.
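To make this concrete, here is a minimal NumPy sketch of the dot product and a weighted dot product (the function names, test vectors, and weights are our own, chosen for illustration):

    # Dot product and weighted dot product on R^n; weights must be positive.
    import numpy as np

    def dot(x, y):
        return float(np.sum(x * y))

    def weighted_dot(x, y, w):
        # <x,y>_w = w1*x1*y1 + ... + wn*xn*yn
        return float(np.sum(w * x * y))

    x = np.array([1.0, 2.0, 3.0])
    y = np.array([4.0, -1.0, 0.5])
    w = np.array([2.0, 1.0, 0.5])          # illustrative positive weights

    assert np.isclose(dot(x, y), dot(y, x))                  # symmetry
    assert np.isclose(weighted_dot(3 * x, y, w),
                      3 * weighted_dot(x, y, w))             # homogeneity
    assert weighted_dot(x, x, w) > 0                         # positivity, x ≠ 0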
Cn and the complex dot product
On Cn we have the complex dot product, i.e.
⟨x,y⟩ = x⋅y := x1ȳ1 + ⋯ + xnȳn
for any two vectors x=(x1,…,xn), y=(y1,…,yn)∈Cn.
In the complex case one can also consider weighted dot products given by
⟨x,y⟩_w := w1x1ȳ1 + ⋯ + wnxnȳn
for all x,y∈Cn, and where the weights w1,…,wn>0 are again given constants.
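The conjugate on the second argument is easy to get wrong in computations. The following sketch (our own illustration, in NumPy) checks conjugate symmetry and positivity for the complex weighted dot product:

    # Complex weighted dot product on C^n; note the conjugate on y.
    import numpy as np

    def cdot(x, y, w=None):
        w = np.ones(len(x)) if w is None else w
        return complex(np.sum(w * x * np.conj(y)))

    x = np.array([1 + 1j, 2j])
    y = np.array([3 - 1j, 1 + 2j])
    w = np.array([2.0, 0.5])               # illustrative positive weights

    assert np.isclose(cdot(x, y, w), np.conj(cdot(y, x, w)))  # <x,y> = conj(<y,x>)
    assert np.isclose(cdot(x, x, w).imag, 0.0)                # <x,x> is real
    assert cdot(x, x, w).real > 0                             # ... and positive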
Inner product on a Function Space
Let H be the space of 2π-periodic continuous functions f:R→C. Then
⟨f,g⟩ := (1/2π) ∫₀^{2π} f(t) ḡ(t) dt
defines an inner product on H.
This inner product plays a large role in Quantum Mechanics, and in the theory of Fourier series.
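For 2π-periodic functions this inner product can be approximated by averaging over one period. A sketch (the sample count and the test functions are arbitrary choices of ours):

    # Approximate <f,g> = (1/2π) ∫ f(t) conj(g(t)) dt by the mean over one period.
    import numpy as np

    def ip(f, g, n=4096):
        t = np.linspace(0.0, 2 * np.pi, n, endpoint=False)
        return np.mean(f(t) * np.conj(g(t)))

    f = lambda t: np.exp(1j * t)      # e^{it}
    g = lambda t: np.exp(2j * t)      # e^{2it}

    print(ip(f, f))                   # ≈ 1
    print(ip(f, g))                   # ≈ 0: distinct exponentials are orthogonal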
Inner Product when you have a Basis
If {v1,…,vn} is a basis for a complex vector space V, and if x,y∈V satisfy
x = x1v1+⋯+xnvn, y = y1v1+⋯+ynvn,
then
⟨x,y⟩ = ∑_{i=1}^n ∑_{j=1}^n gij xi ȳj, where gij := ⟨vi,vj⟩.
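In other words, once a basis is fixed, the Gram matrix (gij) determines the inner product. A small numerical check in C², with a basis of our own choosing:

    # Inner product via the Gram matrix g_ij = <v_i, v_j> on C^2.
    import numpy as np

    V = np.array([[1.0, 1j],
                  [0.0, 2.0]])               # rows are the basis vectors v1, v2
    G = V @ V.conj().T                       # Gram matrix for the standard inner product

    xc = np.array([1 + 1j, 2.0])             # coordinates of x = x1 v1 + x2 v2
    yc = np.array([3.0, -1j])                # coordinates of y

    via_gram = xc @ G @ np.conj(yc)          # sum_ij g_ij x_i conj(y_j)
    x_vec, y_vec = xc @ V, yc @ V
    direct = np.sum(x_vec * np.conj(y_vec))  # <x,y> computed directly
    assert np.isclose(via_gram, direct)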
Norms, Distances, Angles, and Inequalities
Definitions and properties
Definition. Let V be a real or complex inner product space with inner product ⟨x,y⟩. Then the norm (or length) of a vector x∈V is
∥x∥ = √⟨x,x⟩.
Note that ⟨x,x⟩ is never negative, so the square root is always defined.
Theorem.
∥x∥>0 for all x∈V with x≠0
∥ax∥=∣a∣∥x∥ for all x∈V and a∈F
∣⟨x,y⟩∣≤∥x∥∥y∥ (the Cauchy-Schwarz inequality)
∥x+y∥≤∥x∥+∥y∥ (the triangle inequality)
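These inequalities are easy to test numerically. A quick sanity check (not a proof) on random vectors in R⁵:

    # Check Cauchy-Schwarz and the triangle inequality on random vectors.
    import numpy as np

    rng = np.random.default_rng(0)
    for _ in range(1000):
        x = rng.standard_normal(5)
        y = rng.standard_normal(5)
        assert abs(x @ y) <= np.linalg.norm(x) * np.linalg.norm(y) + 1e-12
        assert np.linalg.norm(x + y) <= np.linalg.norm(x) + np.linalg.norm(y) + 1e-12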
Definition. The distance between two vectors x,y∈V is d(x,y) := ∥x−y∥.
Definition. Two vectors x,y∈V are said to be orthogonal if ⟨x,y⟩=0.
Notation: x⊥y means x,y are orthogonal.
In particular, the zero vector is orthogonal to every other vector, because ⟨x,0⟩=0 for all x∈V.
More generally, if V is a real inner product space, then the angle between two non-zero vectors x,y∈V is defined to be
∠(x,y) := arccos( ⟨x,y⟩ / (∥x∥∥y∥) )
The Cauchy-Schwarz inequality implies −1 ≤ ⟨x,y⟩/(∥x∥∥y∥) ≤ 1, so the inverse cosine is always defined.
Example in R4
Find the lengths of and the angle between the vectors u=(1,0,2,3) and v=(−1,2,1,3) with respect to the standard inner product on R4. Also find the distance between u and v.
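A sketch of the computation (the values follow directly from the formulas above: ∥u∥=√14, ∥v∥=√15, ⟨u,v⟩=10, and d(u,v)=3):

    # Lengths, angle, and distance for u, v in R^4.
    import numpy as np

    u = np.array([1.0, 0.0, 2.0, 3.0])
    v = np.array([-1.0, 2.0, 1.0, 3.0])

    norm_u = np.linalg.norm(u)                       # √14 ≈ 3.742
    norm_v = np.linalg.norm(v)                       # √15 ≈ 3.873
    angle = np.arccos((u @ v) / (norm_u * norm_v))   # arccos(10/√210) ≈ 0.81 rad
    dist = np.linalg.norm(u - v)                     # ∥(2,−2,1,0)∥ = 3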
Definition. A set of vectors {v1,…,vk}⊂V is orthogonal if
vi≠0 for all i
vi⊥vj for all i≠j
An orthogonal set {v1,…,vk}⊂V is called orthonormal if ∥vi∥=1 for i=1,…,k.
Theorem. If {v1,…,vn}⊂V is orthogonal, then {v1,…,vn} is linearly independent.
If, in addition, n=dimV, then {v1,…,vn} is a basis for V.
In this case {v1,…,vn} is called an orthogonal basis.
For vectors expressed in terms of an orthogonal basis one has the following formulas for the inner product and norm: if x=x1v1+⋯+xnvn, y=y1v1+⋯+ynvn, then
⟨x,y⟩ = ∑_{i=1}^n xi ȳi ∥vi∥² and ∥x∥² = ∑_{i=1}^n |xi|² ∥vi∥².
Every basis of a finite dimensional inner product space can be turned into an orthonormal basis by the following procedure. Given a basis {v1,…,vn} of V, set w1=v1, u1=w1/∥w1∥, and then, for j=2,…,n,
wj = vj − ⟨vj,u1⟩u1 − ⋯ − ⟨vj,uj−1⟩uj−1, uj = wj/∥wj∥.
Then {u1,…,un} is an orthonormal basis of V. More precisely, by induction on j one shows that
wj⊥{v1,…,vj−1} and hence uj⊥{v1,…,vj−1}
span{v1,…,vj}=span{u1,…,uj}
The above procedure that creates the orthonormal basis {u1,…,un} from the given basis {v1,…,vn} is called Gram-Schmidt orthogonalization.
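A direct translation of the procedure into NumPy (a sketch for the standard inner product on C^n; the test vectors are our own):

    # Gram-Schmidt: turn a basis into an orthonormal basis.
    import numpy as np

    def gram_schmidt(vs):
        us = []
        for v in vs:
            w = v - sum(np.sum(v * np.conj(u)) * u for u in us)  # w_j
            us.append(w / np.linalg.norm(w))                     # u_j = w_j/∥w_j∥
        return us

    vs = [np.array([1.0, 1.0, 0.0]),
          np.array([1.0, 0.0, 1.0]),
          np.array([0.0, 1.0, 1.0])]
    us = gram_schmidt(vs)
    # check: <u_i, u_j> = 1 if i = j, and 0 otherwise
    for i, ui in enumerate(us):
        for j, uj in enumerate(us):
            assert np.isclose(np.sum(ui * np.conj(uj)), float(i == j))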
Examples
in the plane
The vectors (1,1)∈R2 and (−1,1)∈R2 are orthogonal. Therefore they are linearly independent, and, since there are two of them, they form a basis of R2. Since they are orthogonal, {(1,1),(−1,1)} is an orthogonal basis for R2.
On the other hand, ∥(1,1)∥ = ∥(−1,1)∥ = √2 ≠ 1, so {(1,1),(−1,1)} is not an orthonormal basis for R2.
in Rn and Cn
The standard basis {e1,…,en} is orthonormal and hence is an orthonormal basis for Rn and also for Cn.
in the function space H
Take F=R.
We defined H to be the set of continuous functions f:R→R that are 2π-periodic.
H is a real vector space, and one defines an inner product on H by setting ⟨f,g⟩ = (1/2π) ∫₀^{2π} f(t)g(t) dt.
The set of functions {cos t, cos 2t, cos 3t, …} is orthogonal. To prove this, compute the integrals
(1/2π) ∫₀^{2π} cos nt cos mt dt = { 1/2 if n=m, 0 if n≠m }
for n,m≥1. In particular ∥cos nt∥ = 1/√2 for every n, so this set is orthogonal but not orthonormal; rescaling gives the orthonormal set {√2 cos t, √2 cos 2t, √2 cos 3t, …}.
We can add more functions to this set and still have an orthogonal set: if
β = {1, cos t, sin t, cos 2t, sin 2t, cos 3t, sin 3t, …}
then β is an orthogonal set in H.
The theory of Fourier series says that β is a “basis” for H, in the sense that every function f∈H can be written as
f(t) = a0 + a1 cos t + b1 sin t + a2 cos 2t + b2 sin 2t + ⋯
for suitable a0,a1,b1,a2,b2,⋯∈R.
This statement does not quite fit in the linear algebra from this course because the sum above contains infinitely many terms.
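The orthogonality relations above are, however, easy to verify numerically; here is a sketch that checks a few members of β (the grid size is an arbitrary choice):

    # Check that distinct members of β are orthogonal for <f,g> = (1/2π)∫ f g dt.
    import numpy as np

    t = np.linspace(0.0, 2 * np.pi, 4096, endpoint=False)
    beta = [np.ones_like(t)] + [f(k * t) for k in (1, 2, 3)
                                for f in (np.cos, np.sin)]

    ip = lambda f, g: np.mean(f * g)     # ≈ (1/2π) ∫ f g dt on the grid
    for i, fi in enumerate(beta):
        for j, gj in enumerate(beta):
            if i != j:
                assert np.isclose(ip(fi, gj), 0.0, atol=1e-9)
    print([round(float(ip(fi, fi)), 3) for fi in beta])  # squared norms: 1.0, then 0.5's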
The Adjoint of an Operator
Definition. If V is a real or complex inner product space, and A:V→V is a linear operator, then A∗:V→V is called an adjoint of A if for all x,y∈V one has
⟨Ax,y⟩=⟨x,A∗y⟩.
Definition. An operator A:V→V is called self-adjoint if ⟨x,Ay⟩=⟨Ax,y⟩ for all x,y∈V.
Theorem. If B,C:V→V are both adjoints of A:V→V, then B=C.
Because of this theorem we can speak of the adjoint instead of an adjoint.
Proof
We know that for all x,y∈V one has
⟨x,By⟩=⟨Ax,y⟩=⟨x,Cy⟩, and thus
⟨x,By−Cy⟩=⟨x,By⟩−⟨x,Cy⟩=0.
Given y∈V we now choose x=By−Cy and conclude
0=⟨By−Cy,By−Cy⟩=∥By−Cy∥².
This implies By−Cy=0 and therefore By=Cy.
We have shown that By=Cy for all y∈V. This implies B=C.
Theorem. If V=Rn or Cn with the standard inner product, and if A:V→V is given by matrix multiplication with
A = ⎛ a11 … a1n ⎞
    ⎜  ⋮     ⋮  ⎟
    ⎝ an1 … ann ⎠,
then the adjoint of A is given by matrix multiplication with the complex conjugate of the transpose of A, i.e. with
A∗ = Ā⊤ = ⎛ ā11 … ān1 ⎞
          ⎜  ⋮     ⋮  ⎟
          ⎝ ā1n … ānn ⎠.
In the real case there is no need to take the complex conjugate, and for real matrices one has A∗=A⊤.
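A numerical check that the conjugate transpose really is the adjoint for the standard inner product on C^n (the random matrix and vectors are our own test setup):

    # Verify <Ax, y> = <x, A*y> with A* = conj(A).T.
    import numpy as np

    rng = np.random.default_rng(1)
    A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
    A_star = A.conj().T

    x = rng.standard_normal(3) + 1j * rng.standard_normal(3)
    y = rng.standard_normal(3) + 1j * rng.standard_normal(3)

    ip = lambda a, b: np.sum(a * np.conj(b))  # standard inner product on C^n
    assert np.isclose(ip(A @ x, y), ip(x, A_star @ y))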
Definition.
A real n×n matrix is called symmetric if A=A⊤.
A complex n×n matrix is called hermitian if A=Ā⊤.
In both cases the matrix, and the corresponding linear operator A:Fn→Fn are called self-adjoint.
Theorem. All eigenvalues of a self-adjoint operator are real.
Proof
If Av=λv, then
λ∥v∥² = ⟨Av,v⟩ = ⟨v,A∗v⟩ = ⟨v,Av⟩ = ⟨v,λv⟩ = λ̄⟨v,v⟩ = λ̄∥v∥².
Since ∥v∥≠0, this implies λ=λ̄, i.e. λ is real.
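Numerically, this is why eigenvalue routines for hermitian matrices return real numbers. A quick illustration (the random matrix is our own test case):

    # Eigenvalues of B + B* are (numerically) real.
    import numpy as np

    rng = np.random.default_rng(2)
    B = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
    A = B + B.conj().T                   # hermitian by construction

    eigs = np.linalg.eigvals(A)
    assert np.allclose(eigs.imag, 0.0)   # all eigenvalues are real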
Theorem. If A:V→V is a self-adjoint operator and if v,w are eigenvectors corresponding to different eigenvalues λ≠μ, then v⊥w.
Proof
Since A is self-adjoint, all its eigenvalues are real, and thus λ,μ∈R.
We have Av=λv and Aw=μw, and therefore
λ⟨v,w⟩ = ⟨Av,w⟩ = ⟨v,Aw⟩ = ⟨v,μw⟩ = μ̄⟨v,w⟩ = μ⟨v,w⟩
because μ∈R.
We find
(λ−μ)⟨v,w⟩=0.
The eigenvalues λ and μ are different, so λ−μ≠0. Therefore ⟨v,w⟩=0.
The Spectral Theorem. Let V be a finite dimensional inner product space, and let A:V→V be self-adjoint. Then V has an orthonormal basis consisting of eigenvectors of A.
Lemma. Let v be an eigenvector of a self-adjoint operator A:V→V and consider the set L={x∈V∣x⊥v}. Then
L is a linear subspace of V
if dimV<∞ then dimL=dimV−1
L is invariant under A, i.e. for all x∈L one has Ax∈L
Proof of the Lemma
Exercise!
Proof of the spectral theorem
We use induction on n=dimV.
A is self-adjoint, so all its eigenvalues are real. Let v be an eigenvector with eigenvalue λ∈R: Av=λv and v≠0.
We may assume ∥v∥=1.
Define L={x∈V∣x⊥v}. The Lemma implies that L is invariant under A. The restriction of A to L is then also self-adjoint and, since dimL<dimV, the induction hypothesis provides an orthonormal basis {v1,…,vn−1} of L consisting of eigenvectors of A.
Since v⊥{v1,…,vn−1}, the set {v1,…,vn−1,v} is an orthonormal, and therefore linearly independent, set of vectors in V. Moreover, the set {v1,…,vn−1,v} contains exactly dimV vectors, so that {v1,…,vn−1,v} is a basis of V. Every vector in this basis is an eigenvector of A, so it is the required orthonormal basis, and the induction is complete.
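In practice the spectral theorem is available as a library routine: np.linalg.eigh returns the eigenvalues of a hermitian matrix together with an orthonormal basis of eigenvectors. A sketch (the random hermitian matrix is our own test case):

    # Spectral theorem numerically: A = Q diag(λ) Q* with Q unitary.
    import numpy as np

    rng = np.random.default_rng(3)
    B = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
    A = (B + B.conj().T) / 2                       # hermitian

    lam, Q = np.linalg.eigh(A)                     # columns of Q: orthonormal eigenvectors
    assert np.allclose(Q.conj().T @ Q, np.eye(4))  # orthonormal basis
    assert np.allclose(A @ Q, Q @ np.diag(lam))    # A v_i = λ_i v_i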