Inner Product Spaces

In the theory of inner product spaces we assume that the scalar field $\mathbb{F}$ is either the real numbers or the complex numbers. The theories for real and complex inner products are very similar.

In this chapter we always assume

$$\mathbb{F}=\mathbb{R} \quad\text{or}\quad \mathbb{F}=\mathbb{C}$$

Inner Products

Definition. A real inner product on a real vector space $V$ is a real-valued function on $V\times V$, usually written as $(x,y)$ or $\langle x, y\rangle$, that satisfies the following properties:

1. $\langle x, y\rangle = \langle y, x\rangle$ (symmetry),
2. $\langle ax+y, z\rangle = a\langle x, z\rangle + \langle y, z\rangle$ (linearity),
3. $\langle x, x\rangle \geq 0$, with equality only for $x=0$ (positive definiteness),

for all $x,y,z\in V$ and $a\in\mathbb{R}$.

Definition. A complex inner product on a complex vector space $V$ is a complex-valued function on $V\times V$, usually written as $(x,y)$ or $\langle x, y\rangle$, that satisfies the following properties:

1. $\langle x, y\rangle = \overline{\langle y, x\rangle}$ (conjugate symmetry),
2. $\langle ax+y, z\rangle = a\langle x, z\rangle + \langle y, z\rangle$ (linearity in the first argument),
3. $\langle x, x\rangle \geq 0$, with equality only for $x=0$ (positive definiteness),

for all $x,y,z\in V$ and $a\in\mathbb{C}$.

Examples

$\mathbb{R}^n$ and the dot product

On $\mathbb{R}^n$ we have the dot product from vector calculus, i.e.

$$\langle x, y\rangle = x\cdot y \stackrel{\rm def}{=} x_1y_1+\cdots+x_ny_n$$

for any two vectors $x=\begin{pmatrix} x_1\\ \vdots\\ x_n\end{pmatrix}$ and $y=\begin{pmatrix} y_1\\ \vdots\\ y_n\end{pmatrix}$.

A variation on this example is the weighted dot product given by

$$\langle x, y\rangle_w \stackrel{\rm def}{=} w_1x_1y_1+\cdots+w_nx_ny_n$$

for all $x,y\in\mathbb{R}^n$, and where $w_1, \dots, w_n>0$ are given constants, called the weights in the inner product.
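
A quick numerical illustration (a minimal Python/NumPy sketch; the vectors and weights are made-up example data):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, -1.0, 0.5])
w = np.array([2.0, 1.0, 0.5])   # the weights w_1, ..., w_n > 0

dot = x @ y                     # standard dot product x_1 y_1 + ... + x_n y_n
weighted = np.sum(w * x * y)    # weighted dot product <x, y>_w

print(dot, weighted)
```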

$\mathbb{C}^n$ and the complex dot product

On $\mathbb{C}^n$ the complex dot product is defined by

$$\langle x, y\rangle = x\cdot y \stackrel{\rm def}{=} x_1\overline{y_1}+\cdots+x_n\overline{y_n}$$

for any two vectors $x=\begin{pmatrix} x_1\\ \vdots\\ x_n\end{pmatrix}$, $y=\begin{pmatrix} y_1\\ \vdots\\ y_n\end{pmatrix}\in\mathbb{C}^n$.

In the complex case one can also consider weighted dot products given by

$$\langle x, y\rangle_w \stackrel{\rm def}{=} w_1x_1\overline{y_1}+\cdots+w_nx_n\overline{y_n}$$

for all $x,y\in\mathbb{C}^n$, and where the weights $w_1, \dots, w_n>0$ are again given constants.
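
The same computation in the complex case, again as a NumPy sketch with made-up data. Note that NumPy's `np.vdot` conjugates its first argument, so in the convention above one has $\langle x, y\rangle =$ `np.vdot(y, x)`:

```python
import numpy as np

x = np.array([1 + 2j, 3 - 1j])
y = np.array([2j, 1 + 1j])
w = np.array([2.0, 0.5])              # positive weights

inner = np.sum(x * np.conj(y))        # x_1 conj(y_1) + ... + x_n conj(y_n)
inner_w = np.sum(w * x * np.conj(y))  # the weighted version <x, y>_w

assert np.isclose(inner, np.vdot(y, x))  # vdot conjugates its first argument
print(inner, inner_w)
```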

Inner product on a Function Space

Let $\mathcal{H}$ be the space of $2\pi$-periodic continuous functions $f:\mathbb{R}\to\mathbb{C}$. Then

$$\langle f,g\rangle \stackrel{\rm def}{=} \frac{1}{2\pi}\int_0^{2\pi} f(t)\overline{g(t)} \, dt$$

defines an inner product on $\mathcal{H}$.

This inner product plays a large role in Quantum Mechanics, and in the theory of Fourier series.
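
The integral can be approximated numerically; here is an illustrative sketch (the helper `inner` and the exponential test functions $e_k(t)=e^{ikt}$, which do belong to $\mathcal{H}$, are my own choices, not part of the text):

```python
import numpy as np

def inner(f, g, n=4096):
    """Riemann-sum approximation of (1/(2*pi)) * integral over [0, 2*pi] of f(t)*conj(g(t))."""
    t = np.linspace(0.0, 2 * np.pi, n, endpoint=False)
    return np.mean(f(t) * np.conj(g(t)))

def e(k):
    return lambda t: np.exp(1j * k * t)   # e_k(t) = exp(i k t)

print(inner(e(2), e(2)))   # approx 1:  <e_k, e_k> = 1
print(inner(e(2), e(5)))   # approx 0:  <e_k, e_m> = 0 for k != m
```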

Inner Product when you have a Basis

If $\{v_1, \dots, v_n\}$ is a basis for a complex vector space $V$, and if $x,y\in V$ satisfy

$$x=x_1v_1+\cdots+x_nv_n,\qquad y=y_1v_1+\cdots+y_nv_n,$$

then

$$\langle x, y\rangle = \sum_{i=1}^n \sum_{j=1}^n g_{ij}\, x_i\overline{y_j} \qquad\text{where}\qquad g_{ij}\stackrel{\rm def}{=} \langle v_i, v_j\rangle.$$
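
A NumPy sketch checking this formula against the direct computation; the basis (stored as the columns of `B`) and the coordinate vectors are made-up example data:

```python
import numpy as np

B = np.array([[1, 1],
              [0, 1j]])       # columns are the basis vectors v_1, v_2 of C^2
xc = np.array([2, 1j])        # coordinates of x = 2 v_1 + i v_2
yc = np.array([1, -1])        # coordinates of y = v_1 - v_2

G = B.T @ B.conj()            # Gram matrix: G[i, j] = <v_i, v_j>
lhs = xc @ G @ yc.conj()      # sum over i, j of g_ij * x_i * conj(y_j)

x, y = B @ xc, B @ yc         # the actual vectors in C^2
rhs = np.sum(x * np.conj(y))  # <x, y> computed directly

assert np.isclose(lhs, rhs)
print(lhs)
```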

Norms, Distances, Angles, and Inequalities

Definitions and properties

Definition. Let $V$ be a real or complex inner product space with inner product $\langle x, y\rangle$. Then the norm (or length) of a vector $x\in V$ is

$$\|x\| = \sqrt{\langle x, x\rangle}\,.$$

Note that $\langle x, x\rangle$ is never negative, so the square root is always defined.

Theorem. For all $x, y\in V$ and all scalars $a$:

1. $\|x\|\geq 0$, and $\|x\|=0$ only for $x=0$;
2. $\|ax\| = |a|\,\|x\|$;
3. $|\langle x, y\rangle| \leq \|x\|\,\|y\|$ (the Cauchy–Schwarz inequality);
4. $\|x+y\| \leq \|x\| + \|y\|$ (the triangle inequality).

Definition. The distance between two vectors $x, y\in V$ is $d(x, y) \stackrel{\rm def}{=} \|x-y\|$.

Definition. Two vectors $x,y\in V$ are said to be orthogonal if $\langle x, y\rangle =0$.

Notation: $x\perp y$ means that $x$ and $y$ are orthogonal.

In particular, the zero vector is orthogonal to every vector, because $\langle x, 0\rangle =0$ for all $x\in V$.

More generally, if $V$ is a real inner product space, then the angle between two non-zero vectors $x,y\in V$ is defined to be

$$\angle(x, y) \stackrel{\rm def}{=} \arccos \frac{\langle x, y\rangle}{\|x\|\, \|y\|}$$

The Cauchy–Schwarz inequality implies $-1\leq \frac{\langle x, y\rangle}{\|x\|\,\|y\|}\leq 1$, so the inverse cosine is always defined.

Example in $\mathbb{R}^4$

Find the lengths and the angle between the vectors $u = \begin{pmatrix} 1 \\ 0 \\ 2 \\ 3 \end{pmatrix}$ and $v = \begin{pmatrix} -1\\2\\1\\3\end{pmatrix}$ with respect to the standard inner product on $\mathbb{R}^4$. Also find the distance between $u$ and $v$.

We have

$$\begin{aligned} \|u\| &= \sqrt{1^2+0^2+2^2+3^2}=\sqrt{14},\\ \|v\| &= \sqrt{(-1)^2+2^2+1^2+3^2}=\sqrt{15},\\ \langle u,v\rangle &= 1\cdot(-1)+0\cdot2+2\cdot1+3\cdot3 = 10\,. \end{aligned}$$

Therefore

$$\angle(u, v)=\arccos\frac{10}{\sqrt{14}\,\sqrt{15}} =\arccos\frac{10}{\sqrt{210}} \approx 46.3647\dots^\circ$$

Finally, the distance between $u$ and $v$ is

$$\|u-v\|=\sqrt{(1-(-1))^2+(0-2)^2+(2-1)^2+(3-3)^2} =\sqrt{10}\,.$$
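
These numbers are easy to verify numerically; a minimal NumPy sketch:

```python
import numpy as np

u = np.array([1.0, 0.0, 2.0, 3.0])
v = np.array([-1.0, 2.0, 1.0, 3.0])

norm_u = np.linalg.norm(u)        # sqrt(14)
norm_v = np.linalg.norm(v)        # sqrt(15)
angle = np.degrees(np.arccos(u @ v / (norm_u * norm_v)))  # approx 46.36 degrees
dist = np.linalg.norm(u - v)      # sqrt(10)

print(norm_u, norm_v, angle, dist)
```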

Example in a function space

Find the lengths and the angle between the functions $f(t)=1$ and $g(t)=t^2$ in the real function space $V=C([0,1])$ with inner product $\langle f, g\rangle=\int_0^1 f(t)g(t)\,dt$.

By definition we have

$$\begin{aligned} \|f\| &= \sqrt{\int_0^1 1^2\, dt} = \sqrt{1} = 1,\\ \|g\| &= \sqrt{\int_0^1 \left(t^2\right)^2 dt} = \sqrt{\int_0^1 t^4\, dt} = \sqrt{\tfrac15} = \tfrac15\sqrt5,\\ \langle f, g\rangle &= \int_0^1 1\cdot t^2\, dt = \tfrac13,\\ \cos\angle(f,g) &= \frac{1/3}{1\cdot\frac15\sqrt5} = \tfrac13\sqrt5 \implies \angle(f,g) = \arccos\frac{\sqrt5}{3} \approx 41.81\dots^\circ \end{aligned}$$
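
A numerical check of this example, approximating the integrals over $[0,1]$ by a midpoint rule (the helper `inner` is illustrative):

```python
import numpy as np

def inner(f, g, n=100_000):
    """Midpoint-rule approximation of the integral over [0, 1] of f(t)*g(t)."""
    t = (np.arange(n) + 0.5) / n
    return np.mean(f(t) * g(t))

f = lambda t: np.ones_like(t)
g = lambda t: t**2

cos_angle = inner(f, g) / np.sqrt(inner(f, f) * inner(g, g))
print(np.degrees(np.arccos(cos_angle)))   # approx 41.81 degrees
```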

Orthogonal sets of vectors

Let $V$ be a real or complex inner product space.

Definition. A set of vectors $\{v_1, \dots, v_k\}\subset V$ is orthogonal if $v_i\neq 0$ for all $i$, and $\langle v_i, v_j\rangle = 0$ for all $i\neq j$.

An orthogonal set $\{v_1, \dots, v_k\}\subset V$ is called orthonormal if $\|v_i\|=1$ for $i=1, \dots, k$.

Theorem. If $\{v_1, \dots, v_n\}\subset V$ is orthogonal, then $\{v_1, \dots, v_n\}$ is linearly independent.

If, in addition, $n=\dim V$, then $\{v_1, \dots, v_n\}$ is a basis for $V$. In this case $\{v_1, \dots, v_n\}$ is called an orthogonal basis.

For vectors expressed in terms of an orthogonal basis one has the following formulas for the inner product and norm: if $x=x_1v_1+\cdots+x_nv_n$ and $y=y_1v_1+\cdots+y_nv_n$, then

$$\langle x, y\rangle = w_1x_1\overline{y_1}+\cdots+w_nx_n\overline{y_n}, \qquad \|x\|^2=w_1|x_1|^2+\cdots+w_n|x_n|^2$$

where the weights $w_i$ are given by $w_i = \|v_i\|^2$.

If the basis $\{v_1, \dots, v_n\}$ is orthonormal, then $w_i=\|v_i\|^2=1$ for all $i$, and thus

$$\langle x, y\rangle = x_1\overline{y_1}+\cdots+x_n\overline{y_n}, \qquad \|x\|^2=|x_1|^2+\cdots+|x_n|^2$$
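
A small NumPy sketch of the weighted formula, using the orthogonal basis $\left\{\binom11, \binom1{-1}\right\}$ of $\mathbb{R}^2$ from the examples below; the coordinates are made up:

```python
import numpy as np

v1 = np.array([1.0, 1.0])
v2 = np.array([1.0, -1.0])        # v1 and v2 are orthogonal, not orthonormal
w1, w2 = v1 @ v1, v2 @ v2         # weights w_i = ||v_i||^2 (both equal 2)

x1, x2 = 3.0, -0.5                # coordinates: x = x1 v1 + x2 v2
x = x1 * v1 + x2 * v2

# ||x||^2 via the weighted formula vs. the direct computation.
assert np.isclose(w1 * x1**2 + w2 * x2**2, x @ x)
print(x @ x)                      # 18.5 either way
```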

Theorem. Every finite-dimensional inner product space has an orthonormal basis.
Proof using the Gram–Schmidt procedure

Let $\{v_1, \dots, v_n\}$ be a basis of $V$.
Define

$$\begin{aligned} w_1&=v_1, & u_1 &= \frac{v_1}{\|v_1\|},\\ w_2&= v_2 - \langle v_2, u_1\rangle u_1, & u_2&=\frac{w_2}{\|w_2\|},\\ w_3&= v_3 - \langle v_3, u_1\rangle u_1- \langle v_3, u_2\rangle u_2, & u_3&=\frac{w_3}{\|w_3\|}, \end{aligned}$$

and in general,

$$\begin{aligned} w_j&=v_j-\langle v_j, u_1\rangle u_1- \cdots- \langle v_j, u_{j-1}\rangle u_{j-1}, \\ u_j&= \frac{w_j}{\|w_j\|}\,. \end{aligned}$$

Then $\{u_1, \dots, u_n\}$ is an orthonormal basis of $V$. More precisely, by induction on $j$ one shows that $w_j\neq 0$ (so that $u_j$ is well defined), that $u_j\perp u_i$ for all $i<j$, and that $\operatorname{span}\{u_1, \dots, u_j\}=\operatorname{span}\{v_1, \dots, v_j\}$.

The above procedure, which creates the orthonormal basis $\{u_1, \dots, u_n\}$ from the given basis $\{v_1, \dots, v_n\}$, is called Gram–Schmidt orthogonalization.
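
For concreteness, a sketch of the procedure in Python/NumPy (the function name `gram_schmidt` and the dependence tolerance are my own choices; `np.vdot(u, v)` computes $\langle v, u\rangle$ in the convention used here):

```python
import numpy as np

def gram_schmidt(vectors):
    """Orthonormalize a list of linearly independent vectors (real or complex).

    Implements w_j = v_j - sum over i < j of <v_j, u_i> u_i and u_j = w_j/||w_j||,
    where <x, y> = sum_k x_k conj(y_k).
    """
    us = []
    for v in vectors:
        w = v.astype(complex)
        for u in us:
            w = w - np.vdot(u, v) * u   # np.vdot(u, v) = sum_k conj(u_k) v_k = <v, u>
        norm = np.linalg.norm(w)
        if norm < 1e-12:                # tolerance is an arbitrary choice
            raise ValueError("the vectors are linearly dependent")
        us.append(w / norm)
    return us

us = gram_schmidt([np.array([1.0, 1.0, 0.0]),
                   np.array([1.0, 0.0, 1.0]),
                   np.array([0.0, 1.0, 1.0])])
gram = np.array([[np.vdot(a, b) for b in us] for a in us])
print(np.allclose(gram, np.eye(3)))     # True: the u_j are orthonormal
```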

Examples

in the plane

The vectors $\binom11\in\mathbb{R}^2$ and $\binom{1}{-1}\in\mathbb{R}^2$ are orthogonal. Therefore they are linearly independent, and, since there are two of them, they form a basis of $\mathbb{R}^2$. Since they are orthogonal, $\left\{\binom11, \binom{1}{-1}\right\}$ is an orthogonal basis for $\mathbb{R}^2$.

On the other hand, $\left\|\binom11\right\|=\left\|\binom1{-1}\right\|=\sqrt{2}\neq 1$, so $\left\{\binom11, \binom{1}{-1}\right\}$ is not an orthonormal basis for $\mathbb{R}^2$.

in $\mathbb{R}^n$ and $\mathbb{C}^n$

The standard basis $\{e_1, \dots, e_n\}$ is orthonormal, and hence is an orthonormal basis for $\mathbb{R}^n$ and also for $\mathbb{C}^n$.

in the function space $\mathcal{H}$

Take $\mathbb{F}=\mathbb{R}$, so that $\mathcal{H}$ is now the space of continuous $2\pi$-periodic functions $f:\mathbb{R}\to\mathbb{R}$.

$\mathcal{H}$ is a real vector space, and one defines an inner product on $\mathcal{H}$ by setting $\langle f, g\rangle = \frac{1}{2\pi} \int_0^{2\pi} f(t)g(t)\,dt$.

The set of functions $\{\cos t, \cos 2t, \cos 3t, \dots \}$ is orthogonal. To prove this, compute the integrals

$$\frac{1}{2\pi}\int_0^{2\pi} \cos nt\,\cos mt\, dt = \begin{cases} \tfrac12 & \text{if }n=m \\ 0 & \text{if }n\neq m \end{cases} \qquad (n, m\geq 1).$$

In particular $\|\cos nt\|^2=\tfrac12$, so the set is not orthonormal; the rescaled functions $\sqrt2\cos nt$ do form an orthonormal set.

We can add more functions to this set and still have an orthogonal set: if

$$\beta = \{1, \cos t, \sin t, \cos 2t, \sin 2t, \cos 3t, \sin 3t, \dots \}$$

then $\beta$ is an orthogonal set in $\mathcal{H}$.
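
A numerical sanity check of these orthogonality relations, reusing the Riemann-sum approximation of $\langle f, g\rangle$ from earlier (illustrative helper code, not from the text):

```python
import numpy as np

def inner(f, g, n=4096):
    t = np.linspace(0.0, 2 * np.pi, n, endpoint=False)
    return np.mean(f(t) * g(t))

one = lambda t: np.ones_like(t)
cos_k = lambda k: (lambda t: np.cos(k * t))
sin_k = lambda k: (lambda t: np.sin(k * t))

print(inner(cos_k(2), cos_k(2)))  # approx 1/2
print(inner(cos_k(2), cos_k(3)))  # approx 0
print(inner(cos_k(2), sin_k(2)))  # approx 0
print(inner(one, one))            # exactly 1
```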

The theory of Fourier series says that $\beta$ is a “basis” for $\mathcal{H}$, in the sense that every function $f\in \mathcal{H}$ can be written as

$$f(t) = a_0 + a_1\cos t+b_1\sin t+ a_2\cos 2t+b_2\sin 2t+ a_3\cos 3t+b_3\sin 3t + \cdots$$

for suitable $a_0, a_1, b_1, \dots \in\mathbb{R}$. This statement does not quite fit in the linear algebra from this course, because the sum above contains infinitely many terms.
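
Since $\|1\|^2=1$ and $\|\cos nt\|^2=\|\sin nt\|^2=\tfrac12$, orthogonality gives the coefficient formulas $a_0 = \langle f, 1\rangle$, $a_n = 2\langle f, \cos nt\rangle$ and $b_n = 2\langle f, \sin nt\rangle$. A sketch that recovers the coefficients of a known trigonometric polynomial (the test function is made up):

```python
import numpy as np

def inner(f, g, n=4096):
    t = np.linspace(0.0, 2 * np.pi, n, endpoint=False)
    return np.mean(f(t) * g(t))

f = lambda t: 3 + np.cos(t) - 2 * np.sin(3 * t)   # known coefficients

a0 = inner(f, lambda t: np.ones_like(t))
a = [2 * inner(f, lambda t, k=k: np.cos(k * t)) for k in range(1, 5)]
b = [2 * inner(f, lambda t, k=k: np.sin(k * t)) for k in range(1, 5)]

print(np.round(a0, 6))  # 3.0
print(np.round(a, 6))   # [1, 0, 0, 0]
print(np.round(b, 6))   # [0, 0, -2, 0]
```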

The Adjoint of an Operator

Definition. If $V$ is a real or complex inner product space, and $A:V\to V$ is a linear operator, then $A^*:V\to V$ is called an adjoint of $A$ if for all $x, y\in V$ one has

$$\langle Ax, y\rangle = \langle x, A^*y\rangle.$$

Definition. An operator $A:V\to V$ is called self-adjoint if $\langle x, Ay\rangle = \langle Ax, y\rangle$ for all $x, y\in V$.

Theorem. If $B, C:V\to V$ are both adjoints of $A:V\to V$, then $B=C$.

Because of this theorem we can speak of the adjoint instead of an adjoint.

Proof

We know that for all $x, y\in V$ one has $\langle x, By\rangle = \langle Ax, y\rangle = \langle x, Cy\rangle$, and thus

$$\langle x, By-Cy\rangle = \langle x, By\rangle - \langle x, Cy\rangle = 0.$$

Given $y\in V$ we now choose $x=By-Cy$ and conclude

$$0=\bigl\langle By-Cy,\, By-Cy\bigr\rangle = \left\|By-Cy\right\|^2.$$

This implies $By-Cy=0$, and therefore $By=Cy$.

We have shown that $By=Cy$ for all $y\in V$. This implies $B=C$.

Theorem. If $V=\mathbb{R}^n$ or $\mathbb{C}^n$ with the standard inner product, and if $A:V\to V$ is given by matrix multiplication with

$$A= \begin{pmatrix} a_{11} & \dots & a_{1n} \\ \vdots & \ddots & \vdots \\ a_{n1} & \dots & a_{nn} \end{pmatrix},$$

then the adjoint of $A$ is given by matrix multiplication with the complex conjugate of the transpose of $A$, i.e. with

$$A^* = \overline{A^\top}= \begin{pmatrix} \overline{a_{11}} & \dots & \overline{a_{n1}} \\ \vdots & \ddots & \vdots \\ \overline{a_{1n}} & \dots & \overline{a_{nn}} \end{pmatrix}.$$

In the real case there is no need to take the complex conjugate, and for real matrices one has $A^*=A^\top$.
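
A NumPy sketch verifying $\langle Ax, y\rangle = \langle x, A^*y\rangle$ for a random complex matrix (random data with a fixed seed, purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
x = rng.normal(size=3) + 1j * rng.normal(size=3)
y = rng.normal(size=3) + 1j * rng.normal(size=3)

A_star = A.conj().T                    # complex conjugate of the transpose
lhs = np.sum((A @ x) * np.conj(y))     # <Ax, y>
rhs = np.sum(x * np.conj(A_star @ y))  # <x, A* y>

assert np.isclose(lhs, rhs)
print(lhs)
```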

Definition. A real $n\times n$ matrix $A$ is called symmetric if $A^\top = A$, and a complex $n\times n$ matrix $A$ is called Hermitian if $\overline{A^\top} = A$.

In both cases the matrix, and the corresponding linear operator $A:\mathbb{F}^n\to\mathbb{F}^n$, are called self-adjoint.

Theorem. All eigenvalues of a self-adjoint operator are real.

Proof

If $Av=\lambda v$ with $v\neq 0$, then

$$\lambda \|v\|^2 = \langle Av, v\rangle = \langle v, A^*v\rangle = \langle v, Av\rangle = \langle v, \lambda v\rangle = \overline{\lambda}\, \langle v, v\rangle = \overline{\lambda}\, \|v\|^2.$$

Since $\|v\|\neq 0$, this implies $\lambda=\overline{\lambda}$, i.e. $\lambda$ is real.
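
Numerically, feeding a self-adjoint matrix to a generic eigenvalue solver shows that the imaginary parts of the eigenvalues vanish up to round-off (an illustrative NumPy sketch):

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
A = M + M.conj().T                   # A equals its own adjoint by construction

eigs = np.linalg.eigvals(A)          # generic solver, no symmetry assumed
print(np.max(np.abs(eigs.imag)))     # ~1e-15: the eigenvalues are real
```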

Theorem. If $A:V\to V$ is a self-adjoint operator, and if $v, w$ are eigenvectors corresponding to different eigenvalues $\lambda\neq \mu$, then $v\perp w$.

Proof

Since $A$ is self-adjoint, all its eigenvalues are real, and thus $\lambda, \mu\in\mathbb{R}$.

We have $Av=\lambda v$ and $Aw=\mu w$, and therefore

$$\lambda \langle v, w\rangle = \langle Av, w\rangle = \langle v, Aw\rangle = \langle v, \mu w\rangle = \overline{\mu}\,\langle v, w\rangle = \mu\langle v, w\rangle$$

because $\mu\in\mathbb{R}$. We find

$$(\lambda-\mu)\langle v, w\rangle =0.$$

The eigenvalues $\lambda$ and $\mu$ are different, so $\lambda-\mu\neq0$. Therefore $\langle v, w\rangle =0$.

The Spectral Theorem. Let $V$ be a finite-dimensional inner product space, and let $A:V\to V$ be self-adjoint. Then $V$ has an orthonormal basis consisting of eigenvectors of $A$.
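
For matrices, NumPy exposes the spectral theorem through `np.linalg.eigh`, which returns real eigenvalues and an orthonormal basis of eigenvectors for a self-adjoint input; a short illustrative check:

```python
import numpy as np

rng = np.random.default_rng(2)
M = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
A = M + M.conj().T                               # self-adjoint (Hermitian)

lam, U = np.linalg.eigh(A)                       # eigh assumes a Hermitian input
print(np.allclose(U.conj().T @ U, np.eye(4)))    # True: columns are orthonormal
print(np.allclose(A @ U, U @ np.diag(lam)))      # True: columns are eigenvectors
```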

Lemma. Let $v$ be an eigenvector of a self-adjoint operator $A:V\to V$, and consider the set $L=\{x\in V \mid x\perp v\}$. Then $L$ is a linear subspace of $V$ with $\dim L = \dim V - 1$, and $L$ is invariant under $A$, i.e. $x\in L$ implies $Ax\in L$.

Proof of the Lemma. Exercise!
Proof of the spectral theorem

We use induction on $n=\dim V$. If $n=1$, then $Av=\lambda v$ holds automatically for every $v\in V$, so any unit vector forms an orthonormal basis of eigenvectors. Assume now that $n>1$ and that the theorem holds in all lower dimensions.

$A$ is self-adjoint, so all its eigenvalues are real. Let $v$ be an eigenvector with eigenvalue $\lambda\in\mathbb{R}$: $Av=\lambda v$ and $v\neq 0$.

We may assume $\|v\|=1$.

Define $L=\{x\in V \mid x\perp v\}$. The Lemma implies that $L$ is invariant under $A$. Then $A:L\to L$ is also self-adjoint and, since $\dim L<\dim V$, the induction hypothesis gives an orthonormal basis $\{v_1, \dots, v_{n-1}\}$ of $L$ consisting of eigenvectors of $A$.

Since $v\perp v_i$ for $i=1, \dots, n-1$, the set $\{v_1, \dots, v_{n-1}, v\}$ is an orthonormal, and therefore linearly independent, set of vectors in $V$. Moreover, this set contains exactly $\dim V$ vectors, so $\{v_1, \dots, v_{n-1}, v\}$ is an orthonormal basis of $V$ consisting of eigenvectors of $A$. This completes the induction.