Inner Product Spaces

In the theory of inner product spaces we assume that the scalar field $\mathbb{F}$ is either the real numbers or the complex numbers. The theories for real and complex inner products are very similar.

In this chapter we always assume

$$\mathbb{F}=\mathbb{R} \quad\text{or}\quad \mathbb{F}=\mathbb{C}$$

Inner Products

Definition. A real inner product on a real vector space $V$ is a real-valued function on $V\times V$, usually written as $(x,y)$ or $\langle x, y\rangle$, that satisfies the following properties:

1. $\langle x, y\rangle = \langle y, x\rangle$ (symmetry),
2. $\langle ax+y, z\rangle = a\langle x, z\rangle + \langle y, z\rangle$ (linearity),
3. $\langle x, x\rangle \geq 0$, with equality only for $x=0$ (positive definiteness),

for all $x,y,z\in V$ and $a\in\mathbb{R}$.

Definition. A complex inner product on a complex vector space $V$ is a complex-valued function on $V\times V$, usually written as $(x,y)$ or $\langle x, y\rangle$, that satisfies the following properties:

1. $\langle x, y\rangle = \overline{\langle y, x\rangle}$ (conjugate symmetry),
2. $\langle ax+y, z\rangle = a\langle x, z\rangle + \langle y, z\rangle$ (linearity in the first argument),
3. $\langle x, x\rangle \geq 0$, with equality only for $x=0$ (positive definiteness),

for all $x,y,z\in V$ and $a\in\mathbb{C}$.

Examples

$\mathbb{R}^n$ and the dot product

On $\mathbb{R}^n$ we have the dot product from vector calculus, i.e.

$$\langle x, y\rangle = x\cdot y \stackrel{\rm def}{=} x_1y_1+\cdots+x_ny_n$$

for any two vectors $x=\begin{pmatrix} x_1\\ \vdots\\ x_n\end{pmatrix}$ and $y=\begin{pmatrix} y_1\\ \vdots\\ y_n\end{pmatrix}$.

A variation on this example is the weighted dot product given by

$$\langle x, y\rangle_w \stackrel{\rm def}{=} w_1x_1y_1+\cdots+w_nx_ny_n$$

for all $x,y\in\mathbb{R}^n$, and where $w_1, \dots, w_n>0$ are given constants, called the weights in the inner product.
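
A quick numerical illustration (a minimal Python/NumPy sketch; the vectors and weights are made-up example data):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, -1.0, 0.5])
w = np.array([2.0, 1.0, 0.5])   # the weights w_1, ..., w_n > 0

dot = x @ y                     # standard dot product x_1 y_1 + ... + x_n y_n
weighted = np.sum(w * x * y)    # weighted dot product <x, y>_w

print(dot, weighted)
```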

$\mathbb{C}^n$ and the complex dot product

On $\mathbb{C}^n$ the complex dot product is defined by

$$\langle x, y\rangle = x\cdot y \stackrel{\rm def}{=} x_1\overline{y_1}+\cdots+x_n\overline{y_n}$$

for any two vectors $x=\begin{pmatrix} x_1\\ \vdots\\ x_n\end{pmatrix}$, $y=\begin{pmatrix} y_1\\ \vdots\\ y_n\end{pmatrix}\in\mathbb{C}^n$.

In the complex case one can also consider weighted dot products given by

$$\langle x, y\rangle_w \stackrel{\rm def}{=} w_1x_1\overline{y_1}+\cdots+w_nx_n\overline{y_n}$$

for all $x,y\in\mathbb{C}^n$, and where the weights $w_1, \dots, w_n>0$ are again given constants.
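
The same computation in the complex case, again as a NumPy sketch with made-up data. Note that NumPy's `np.vdot` conjugates its first argument, so in the convention above one has $\langle x, y\rangle =$ `np.vdot(y, x)`:

```python
import numpy as np

x = np.array([1 + 2j, 3 - 1j])
y = np.array([2j, 1 + 1j])
w = np.array([2.0, 0.5])              # positive weights

inner = np.sum(x * np.conj(y))        # x_1 conj(y_1) + ... + x_n conj(y_n)
inner_w = np.sum(w * x * np.conj(y))  # the weighted version <x, y>_w

assert np.isclose(inner, np.vdot(y, x))  # vdot conjugates its first argument
print(inner, inner_w)
```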

Inner product on a Function Space

Let $\mathcal{H}$ be the space of $2\pi$-periodic continuous functions $f:\mathbb{R}\to\mathbb{C}$. Then

$$\langle f,g\rangle \stackrel{\rm def}{=} \frac{1}{2\pi}\int_0^{2\pi} f(t)\overline{g(t)} \, dt$$

defines an inner product on $\mathcal{H}$.

This inner product plays a large role in Quantum Mechanics, and in the theory of Fourier series.
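
The integral can be approximated numerically; here is an illustrative sketch (the helper `inner` and the exponential test functions $e_k(t)=e^{ikt}$, which do belong to $\mathcal{H}$, are my own choices, not part of the text):

```python
import numpy as np

def inner(f, g, n=4096):
    """Riemann-sum approximation of (1/(2*pi)) * integral over [0, 2*pi] of f(t)*conj(g(t))."""
    t = np.linspace(0.0, 2 * np.pi, n, endpoint=False)
    return np.mean(f(t) * np.conj(g(t)))

def e(k):
    return lambda t: np.exp(1j * k * t)   # e_k(t) = exp(i k t)

print(inner(e(2), e(2)))   # approx 1:  <e_k, e_k> = 1
print(inner(e(2), e(5)))   # approx 0:  <e_k, e_m> = 0 for k != m
```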

Inner Product when you have a Basis

If $\{v_1, \dots, v_n\}$ is a basis for a complex vector space $V$, and if $x,y\in V$ satisfy

$$x=x_1v_1+\cdots+x_nv_n,\qquad y=y_1v_1+\cdots+y_nv_n,$$

then

$$\langle x, y\rangle = \sum_{i=1}^n \sum_{j=1}^n g_{ij}\, x_i\overline{y_j} \qquad\text{where}\qquad g_{ij}\stackrel{\rm def}{=} \langle v_i, v_j\rangle.$$
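
A NumPy sketch checking this formula against the direct computation; the basis (stored as the columns of `B`) and the coordinate vectors are made-up example data:

```python
import numpy as np

B = np.array([[1, 1],
              [0, 1j]])       # columns are the basis vectors v_1, v_2 of C^2
xc = np.array([2, 1j])        # coordinates of x = 2 v_1 + i v_2
yc = np.array([1, -1])        # coordinates of y = v_1 - v_2

G = B.T @ B.conj()            # Gram matrix: G[i, j] = <v_i, v_j>
lhs = xc @ G @ yc.conj()      # sum over i, j of g_ij * x_i * conj(y_j)

x, y = B @ xc, B @ yc         # the actual vectors in C^2
rhs = np.sum(x * np.conj(y))  # <x, y> computed directly

assert np.isclose(lhs, rhs)
print(lhs)
```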

Norms, Distances, Angles, and Inequalities

Definitions and properties

Definition. Let $V$ be a real or complex inner product space with inner product $\langle x, y\rangle$. Then the norm (or length) of a vector $x\in V$ is

$$\|x\| = \sqrt{\langle x, x\rangle}\,.$$

Note that $\langle x, x\rangle$ is never negative, so the square root is always defined.

Theorem. For all $x, y\in V$ and all scalars $a$:

1. $\|x\|\geq 0$, and $\|x\|=0$ only for $x=0$;
2. $\|ax\| = |a|\,\|x\|$;
3. $|\langle x, y\rangle| \leq \|x\|\,\|y\|$ (the Cauchy–Schwarz inequality);
4. $\|x+y\| \leq \|x\| + \|y\|$ (the triangle inequality).

Definition. The distance between two vectors $x, y\in V$ is $d(x, y) \stackrel{\rm def}{=} \|x-y\|$.

Definition. Two vectors $x,y\in V$ are said to be orthogonal if $\langle x, y\rangle =0$.

Notation: $x\perp y$ means that $x$ and $y$ are orthogonal.

In particular, the zero vector is orthogonal to every vector, because $\langle x, 0\rangle =0$ for all $x\in V$.

More generally, if $V$ is a real inner product space, then the angle between two non-zero vectors $x,y\in V$ is defined to be

$$\angle(x, y) \stackrel{\rm def}{=} \arccos \frac{\langle x, y\rangle}{\|x\|\, \|y\|}$$

The Cauchy–Schwarz inequality implies $-1\leq \frac{\langle x, y\rangle}{\|x\|\,\|y\|}\leq 1$, so the inverse cosine is always defined.

Example in $\mathbb{R}^4$

Find the lengths and the angle between the vectors $u = \begin{pmatrix} 1 \\ 0 \\ 2 \\ 3 \end{pmatrix}$ and $v = \begin{pmatrix} -1\\2\\1\\3\end{pmatrix}$ with respect to the standard inner product on $\mathbb{R}^4$. Also find the distance between $u$ and $v$.

We have

$$\begin{aligned} \|u\| &= \sqrt{1^2+0^2+2^2+3^2}=\sqrt{14},\\ \|v\| &= \sqrt{(-1)^2+2^2+1^2+3^2}=\sqrt{15},\\ \langle u,v\rangle &= 1\cdot(-1)+0\cdot2+2\cdot1+3\cdot3 = 10\,. \end{aligned}$$

Therefore

$$\angle(u, v)=\arccos\frac{10}{\sqrt{14}\,\sqrt{15}} =\arccos\frac{10}{\sqrt{210}} \approx 46.3647\dots^\circ$$

Finally, the distance between $u$ and $v$ is

$$\|u-v\|=\sqrt{(1-(-1))^2+(0-2)^2+(2-1)^2+(3-3)^2} =\sqrt{10}\,.$$
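
These numbers are easy to verify numerically; a minimal NumPy sketch:

```python
import numpy as np

u = np.array([1.0, 0.0, 2.0, 3.0])
v = np.array([-1.0, 2.0, 1.0, 3.0])

norm_u = np.linalg.norm(u)        # sqrt(14)
norm_v = np.linalg.norm(v)        # sqrt(15)
angle = np.degrees(np.arccos(u @ v / (norm_u * norm_v)))  # approx 46.36 degrees
dist = np.linalg.norm(u - v)      # sqrt(10)

print(norm_u, norm_v, angle, dist)
```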

Example in a function space

Find the lengths and the angle between the functions $f(t)=1$ and $g(t)=t^2$ in the real function space $V=C([0,1])$ with inner product $\langle f, g\rangle=\int_0^1 f(t)g(t)\,dt$.

By definition we have

$$\begin{aligned} \|f\| &= \sqrt{\int_0^1 1^2\, dt} = \sqrt{1} = 1,\\ \|g\| &= \sqrt{\int_0^1 \left(t^2\right)^2 dt} = \sqrt{\int_0^1 t^4\, dt} = \sqrt{\tfrac15} = \tfrac15\sqrt5,\\ \langle f, g\rangle &= \int_0^1 1\cdot t^2\, dt = \tfrac13,\\ \cos\angle(f,g) &= \frac{1/3}{1\cdot\frac15\sqrt5} = \tfrac13\sqrt5 \implies \angle(f,g) = \arccos\frac{\sqrt5}{3} \approx 41.81\dots^\circ \end{aligned}$$
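
A numerical check of this example, approximating the integrals over $[0,1]$ by a midpoint rule (the helper `inner` is illustrative):

```python
import numpy as np

def inner(f, g, n=100_000):
    """Midpoint-rule approximation of the integral over [0, 1] of f(t)*g(t)."""
    t = (np.arange(n) + 0.5) / n
    return np.mean(f(t) * g(t))

f = lambda t: np.ones_like(t)
g = lambda t: t**2

cos_angle = inner(f, g) / np.sqrt(inner(f, f) * inner(g, g))
print(np.degrees(np.arccos(cos_angle)))   # approx 41.81 degrees
```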

Orthogonal sets of vectors

Let $V$ be a real or complex inner product space.

Definition. A set of vectors $\{v_1, \dots, v_k\}\subset V$ is orthogonal if $v_i\neq 0$ for all $i$, and $\langle v_i, v_j\rangle = 0$ for all $i\neq j$.

An orthogonal set $\{v_1, \dots, v_k\}\subset V$ is called orthonormal if $\|v_i\|=1$ for $i=1, \dots, k$.

Theorem. If $\{v_1, \dots, v_n\}\subset V$ is orthogonal, then $\{v_1, \dots, v_n\}$ is linearly independent.

If, in addition, $n=\dim V$, then $\{v_1, \dots, v_n\}$ is a basis for $V$. In this case $\{v_1, \dots, v_n\}$ is called an orthogonal basis.

For vectors expressed in terms of an orthogonal basis one has the following formulas for the inner product and norm: if $x=x_1v_1+\cdots+x_nv_n$ and $y=y_1v_1+\cdots+y_nv_n$, then

$$\langle x, y\rangle = w_1x_1\overline{y_1}+\cdots+w_nx_n\overline{y_n}, \qquad \|x\|^2=w_1|x_1|^2+\cdots+w_n|x_n|^2$$

where the weights $w_i$ are given by $w_i = \|v_i\|^2$.

If the basis $\{v_1, \dots, v_n\}$ is orthonormal, then $w_i=\|v_i\|^2=1$ for all $i$, and thus

$$\langle x, y\rangle = x_1\overline{y_1}+\cdots+x_n\overline{y_n}, \qquad \|x\|^2=|x_1|^2+\cdots+|x_n|^2$$
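
A small NumPy sketch of the weighted formula, using the orthogonal basis $\left\{\binom11, \binom1{-1}\right\}$ of $\mathbb{R}^2$ from the examples below; the coordinates are made up:

```python
import numpy as np

v1 = np.array([1.0, 1.0])
v2 = np.array([1.0, -1.0])        # v1 and v2 are orthogonal, not orthonormal
w1, w2 = v1 @ v1, v2 @ v2         # weights w_i = ||v_i||^2 (both equal 2)

x1, x2 = 3.0, -0.5                # coordinates: x = x1 v1 + x2 v2
x = x1 * v1 + x2 * v2

# ||x||^2 via the weighted formula vs. the direct computation.
assert np.isclose(w1 * x1**2 + w2 * x2**2, x @ x)
print(x @ x)                      # 18.5 either way
```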

Theorem. Every finite-dimensional inner product space has an orthonormal basis.
Proof using the Gram–Schmidt procedure

Let $\{v_1, \dots, v_n\}$ be a basis of $V$.
Define

$$\begin{aligned} w_1&=v_1, & u_1 &= \frac{v_1}{\|v_1\|},\\ w_2&= v_2 - \langle v_2, u_1\rangle u_1, & u_2&=\frac{w_2}{\|w_2\|},\\ w_3&= v_3 - \langle v_3, u_1\rangle u_1- \langle v_3, u_2\rangle u_2, & u_3&=\frac{w_3}{\|w_3\|}, \end{aligned}$$

and in general,

$$\begin{aligned} w_j&=v_j-\langle v_j, u_1\rangle u_1- \cdots- \langle v_j, u_{j-1}\rangle u_{j-1}, \\ u_j&= \frac{w_j}{\|w_j\|}\,. \end{aligned}$$

Then $\{u_1, \dots, u_n\}$ is an orthonormal basis of $V$. More precisely, by induction on $j$ one shows that $w_j\neq 0$ (so that $u_j$ is well defined), that $u_j\perp u_i$ for all $i<j$, and that $\operatorname{span}\{u_1, \dots, u_j\}=\operatorname{span}\{v_1, \dots, v_j\}$.

The above procedure, which creates the orthonormal basis $\{u_1, \dots, u_n\}$ from the given basis $\{v_1, \dots, v_n\}$, is called Gram–Schmidt orthogonalization.
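
For concreteness, a sketch of the procedure in Python/NumPy (the function name `gram_schmidt` and the dependence tolerance are my own choices; `np.vdot(u, v)` computes $\langle v, u\rangle$ in the convention used here):

```python
import numpy as np

def gram_schmidt(vectors):
    """Orthonormalize a list of linearly independent vectors (real or complex).

    Implements w_j = v_j - sum over i < j of <v_j, u_i> u_i and u_j = w_j/||w_j||,
    where <x, y> = sum_k x_k conj(y_k).
    """
    us = []
    for v in vectors:
        w = v.astype(complex)
        for u in us:
            w = w - np.vdot(u, v) * u   # np.vdot(u, v) = sum_k conj(u_k) v_k = <v, u>
        norm = np.linalg.norm(w)
        if norm < 1e-12:                # tolerance is an arbitrary choice
            raise ValueError("the vectors are linearly dependent")
        us.append(w / norm)
    return us

us = gram_schmidt([np.array([1.0, 1.0, 0.0]),
                   np.array([1.0, 0.0, 1.0]),
                   np.array([0.0, 1.0, 1.0])])
gram = np.array([[np.vdot(a, b) for b in us] for a in us])
print(np.allclose(gram, np.eye(3)))     # True: the u_j are orthonormal
```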

Examples

in the plane

The vectors $\binom11\in\mathbb{R}^2$ and $\binom{1}{-1}\in\mathbb{R}^2$ are orthogonal. Therefore they are linearly independent, and, since there are two of them, they form a basis of $\mathbb{R}^2$. Since they are orthogonal, $\left\{\binom11, \binom{1}{-1}\right\}$ is an orthogonal basis for $\mathbb{R}^2$.

On the other hand, $\left\|\binom11\right\|=\left\|\binom1{-1}\right\|=\sqrt{2}\neq 1$, so $\left\{\binom11, \binom{1}{-1}\right\}$ is not an orthonormal basis for $\mathbb{R}^2$.

in $\mathbb{R}^n$ and $\mathbb{C}^n$

The standard basis $\{e_1, \dots, e_n\}$ is orthonormal, and hence is an orthonormal basis for $\mathbb{R}^n$ and also for $\mathbb{C}^n$.

in the function space $\mathcal{H}$

Take $\mathbb{F}=\mathbb{R}$, so that $\mathcal{H}$ is now the space of continuous $2\pi$-periodic functions $f:\mathbb{R}\to\mathbb{R}$.

$\mathcal{H}$ is a real vector space, and one defines an inner product on $\mathcal{H}$ by setting $\langle f, g\rangle = \frac{1}{2\pi} \int_0^{2\pi} f(t)g(t)\,dt$.

The set of functions $\{\cos t, \cos 2t, \cos 3t, \dots \}$ is orthogonal. To prove this, compute the integrals

$$\frac{1}{2\pi}\int_0^{2\pi} \cos nt\,\cos mt\, dt = \begin{cases} \tfrac12 & \text{if }n=m \\ 0 & \text{if }n\neq m \end{cases} \qquad (n, m\geq 1).$$

In particular $\|\cos nt\|^2=\tfrac12$, so the set is not orthonormal; the rescaled functions $\sqrt2\cos nt$ do form an orthonormal set.

We can add more functions to this set and still have an orthogonal set: if

$$\beta = \{1, \cos t, \sin t, \cos 2t, \sin 2t, \cos 3t, \sin 3t, \dots \}$$

then $\beta$ is an orthogonal set in $\mathcal{H}$.
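
A numerical sanity check of these orthogonality relations, reusing the Riemann-sum approximation of $\langle f, g\rangle$ from earlier (illustrative helper code, not from the text):

```python
import numpy as np

def inner(f, g, n=4096):
    t = np.linspace(0.0, 2 * np.pi, n, endpoint=False)
    return np.mean(f(t) * g(t))

one = lambda t: np.ones_like(t)
cos_k = lambda k: (lambda t: np.cos(k * t))
sin_k = lambda k: (lambda t: np.sin(k * t))

print(inner(cos_k(2), cos_k(2)))  # approx 1/2
print(inner(cos_k(2), cos_k(3)))  # approx 0
print(inner(cos_k(2), sin_k(2)))  # approx 0
print(inner(one, one))            # exactly 1
```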

The theory of Fourier series says that $\beta$ is a “basis” for $\mathcal{H}$, in the sense that every function $f\in \mathcal{H}$ can be written as

$$f(t) = a_0 + a_1\cos t+b_1\sin t+ a_2\cos 2t+b_2\sin 2t+ a_3\cos 3t+b_3\sin 3t + \cdots$$

for suitable $a_0, a_1, b_1, \dots \in\mathbb{R}$. This statement does not quite fit in the linear algebra from this course, because the sum above contains infinitely many terms.
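
Since $\|1\|^2=1$ and $\|\cos nt\|^2=\|\sin nt\|^2=\tfrac12$, orthogonality gives the coefficient formulas $a_0 = \langle f, 1\rangle$, $a_n = 2\langle f, \cos nt\rangle$ and $b_n = 2\langle f, \sin nt\rangle$. A sketch that recovers the coefficients of a known trigonometric polynomial (the test function is made up):

```python
import numpy as np

def inner(f, g, n=4096):
    t = np.linspace(0.0, 2 * np.pi, n, endpoint=False)
    return np.mean(f(t) * g(t))

f = lambda t: 3 + np.cos(t) - 2 * np.sin(3 * t)   # known coefficients

a0 = inner(f, lambda t: np.ones_like(t))
a = [2 * inner(f, lambda t, k=k: np.cos(k * t)) for k in range(1, 5)]
b = [2 * inner(f, lambda t, k=k: np.sin(k * t)) for k in range(1, 5)]

print(np.round(a0, 6))  # 3.0
print(np.round(a, 6))   # [1, 0, 0, 0]
print(np.round(b, 6))   # [0, 0, -2, 0]
```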

The Adjoint of an Operator

Definition. If $V$ is a real or complex inner product space, and $A:V\to V$ is a linear operator, then $A^*:V\to V$ is called an adjoint of $A$ if for all $x, y\in V$ one has

$$\langle Ax, y\rangle = \langle x, A^*y\rangle.$$

Definition. An operator $A:V\to V$ is called self-adjoint if $\langle x, Ay\rangle = \langle Ax, y\rangle$ for all $x, y\in V$.

Theorem. If $B, C:V\to V$ are both adjoints of $A:V\to V$, then $B=C$.

Because of this theorem we can speak of the adjoint instead of an adjoint.

Proof

We know that for all $x, y\in V$ one has $\langle x, By\rangle = \langle Ax, y\rangle = \langle x, Cy\rangle$, and thus

$$\langle x, By-Cy\rangle = \langle x, By\rangle - \langle x, Cy\rangle = 0.$$

Given $y\in V$ we now choose $x=By-Cy$ and conclude

$$0=\bigl\langle By-Cy,\, By-Cy\bigr\rangle = \left\|By-Cy\right\|^2.$$

This implies $By-Cy=0$, and therefore $By=Cy$.

We have shown that $By=Cy$ for all $y\in V$. This implies $B=C$.

Theorem. If $V=\mathbb{R}^n$ or $\mathbb{C}^n$ with the standard inner product, and if $A:V\to V$ is given by matrix multiplication with

$$A= \begin{pmatrix} a_{11} & \dots & a_{1n} \\ \vdots & \ddots & \vdots \\ a_{n1} & \dots & a_{nn} \end{pmatrix},$$

then the adjoint of $A$ is given by matrix multiplication with the complex conjugate of the transpose of $A$, i.e. with

$$A^* = \overline{A^\top}= \begin{pmatrix} \overline{a_{11}} & \dots & \overline{a_{n1}} \\ \vdots & \ddots & \vdots \\ \overline{a_{1n}} & \dots & \overline{a_{nn}} \end{pmatrix}.$$

In the real case there is no need to take the complex conjugate, and for real matrices one has $A^*=A^\top$.
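
A NumPy sketch verifying $\langle Ax, y\rangle = \langle x, A^*y\rangle$ for a random complex matrix (random data with a fixed seed, purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
x = rng.normal(size=3) + 1j * rng.normal(size=3)
y = rng.normal(size=3) + 1j * rng.normal(size=3)

A_star = A.conj().T                    # complex conjugate of the transpose
lhs = np.sum((A @ x) * np.conj(y))     # <Ax, y>
rhs = np.sum(x * np.conj(A_star @ y))  # <x, A* y>

assert np.isclose(lhs, rhs)
print(lhs)
```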

Definition. A real $n\times n$ matrix $A$ is called symmetric if $A^\top = A$, and a complex $n\times n$ matrix $A$ is called Hermitian if $\overline{A^\top} = A$.

In both cases the matrix, and the corresponding linear operator $A:\mathbb{F}^n\to\mathbb{F}^n$, are called self-adjoint.

Theorem. All eigenvalues of a self-adjoint operator are real.

Proof

If $Av=\lambda v$ with $v\neq 0$, then

$$\lambda \|v\|^2 = \langle Av, v\rangle = \langle v, A^*v\rangle = \langle v, Av\rangle = \langle v, \lambda v\rangle = \overline{\lambda}\, \langle v, v\rangle = \overline{\lambda}\, \|v\|^2.$$

Since $\|v\|\neq 0$, this implies $\lambda=\overline{\lambda}$, i.e. $\lambda$ is real.
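
Numerically, feeding a self-adjoint matrix to a generic eigenvalue solver shows that the imaginary parts of the eigenvalues vanish up to round-off (an illustrative NumPy sketch):

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
A = M + M.conj().T                   # A equals its own adjoint by construction

eigs = np.linalg.eigvals(A)          # generic solver, no symmetry assumed
print(np.max(np.abs(eigs.imag)))     # ~1e-15: the eigenvalues are real
```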

Theorem. If $A:V\to V$ is a self-adjoint operator, and if $v, w$ are eigenvectors corresponding to different eigenvalues $\lambda\neq \mu$, then $v\perp w$.

Proof

Since $A$ is self-adjoint, all its eigenvalues are real, and thus $\lambda, \mu\in\mathbb{R}$.

We have $Av=\lambda v$ and $Aw=\mu w$, and therefore

$$\lambda \langle v, w\rangle = \langle Av, w\rangle = \langle v, Aw\rangle = \langle v, \mu w\rangle = \overline{\mu}\,\langle v, w\rangle = \mu\langle v, w\rangle$$

because $\mu\in\mathbb{R}$. We find

$$(\lambda-\mu)\langle v, w\rangle =0.$$

The eigenvalues $\lambda$ and $\mu$ are different, so $\lambda-\mu\neq0$. Therefore $\langle v, w\rangle =0$.

The Spectral Theorem. Let $V$ be a finite-dimensional inner product space, and let $A:V\to V$ be self-adjoint. Then $V$ has an orthonormal basis consisting of eigenvectors of $A$.
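
For matrices, NumPy exposes the spectral theorem through `np.linalg.eigh`, which returns real eigenvalues and an orthonormal basis of eigenvectors for a self-adjoint input; a short illustrative check:

```python
import numpy as np

rng = np.random.default_rng(2)
M = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
A = M + M.conj().T                               # self-adjoint (Hermitian)

lam, U = np.linalg.eigh(A)                       # eigh assumes a Hermitian input
print(np.allclose(U.conj().T @ U, np.eye(4)))    # True: columns are orthonormal
print(np.allclose(A @ U, U @ np.diag(lam)))      # True: columns are eigenvectors
```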

Lemma. Let $v$ be an eigenvector of a self-adjoint operator $A:V\to V$, and consider the set $L=\{x\in V \mid x\perp v\}$. Then $L$ is a linear subspace of $V$ with $\dim L = \dim V - 1$, and $L$ is invariant under $A$, i.e. $x\in L$ implies $Ax\in L$.

Proof of the Lemma. Exercise!
Proof of the spectral theorem

We use induction on $n=\dim V$. If $n=1$, then $Av=\lambda v$ holds automatically for every $v\in V$, so any unit vector forms an orthonormal basis of eigenvectors. Assume now that $n>1$ and that the theorem holds in all lower dimensions.

$A$ is self-adjoint, so all its eigenvalues are real. Let $v$ be an eigenvector with eigenvalue $\lambda\in\mathbb{R}$: $Av=\lambda v$ and $v\neq 0$.

We may assume $\|v\|=1$.

Define $L=\{x\in V \mid x\perp v\}$. The Lemma implies that $L$ is invariant under $A$. Then $A:L\to L$ is also self-adjoint and, since $\dim L<\dim V$, the induction hypothesis gives an orthonormal basis $\{v_1, \dots, v_{n-1}\}$ of $L$ consisting of eigenvectors of $A$.

Since $v\perp v_i$ for $i=1, \dots, n-1$, the set $\{v_1, \dots, v_{n-1}, v\}$ is an orthonormal, and therefore linearly independent, set of vectors in $V$. Moreover, this set contains exactly $\dim V$ vectors, so $\{v_1, \dots, v_{n-1}, v\}$ is an orthonormal basis of $V$ consisting of eigenvectors of $A$. This completes the induction.