Linear transformations

Linear Transformations

Definition. A map $T:V\to W$ from one vector space $V$ to another $W$ is called linear if for all $x, y\in V$ and all $a, b\in\mathbb{F}$ one has $T(ax+by) = aT(x)+bT(y)$.

Synonym: a linear map is also called a linear transformation or a linear operator.

In this definition it is assumed that $V$ and $W$ are vector spaces over the same field $\mathbb{F}$. If the two vector spaces are defined over different fields then the concept of linear map from $V$ to $W$ is not defined.

Sometimes it is more convenient to split the task of verifying that a certain map $T:V\to W$ is linear into two steps:

  1. show that $T(x+y) = T(x)+T(y)$ for all $x, y\in V$ (additivity), and
  2. show that $T(ax) = aT(x)$ for all $x\in V$ and all $a\in\mathbb{F}$ (homogeneity).

Problem. Show that the definition implies that if $T:V\to W$ is linear, then $T(0_V)=0_W$.
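
One possible line of argument (a sketch, using the definition with $x=y=0_V$ and $a=b=0$):

$$T(0_V) = T(0\cdot 0_V + 0\cdot 0_V) = 0\cdot T(0_V) + 0\cdot T(0_V) = 0_W.$$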

Examples from geometry

Rotations in the plane. Let $V=W=\mathbb{R}^2$ and $\mathbb{F}=\mathbb{R}$, and consider the map $R:\mathbb{R}^2\to\mathbb{R}^2$ defined by counterclockwise rotation by an angle of $\theta$ radians. Show that $R$ is linear. Find a formula for $R\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$.
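
For reference, the formula one should arrive at (it anticipates the rotation matrix that reappears in the section on composition and matrix multiplication) is

$$R\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} x_1\cos\theta - x_2\sin\theta \\ x_1\sin\theta + x_2\cos\theta \end{pmatrix}.$$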

Reflections in the plane. Again consider $V=W=\mathbb{R}^2$, and let $\ell\subset\mathbb{R}^2$ be some line through the origin. Define $S(x)$ to be the reflection of $x$ in the line $\ell$.

Show that $S:\mathbb{R}^2\to\mathbb{R}^2$ is linear, and find a formula for $S\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$ in the case where $\ell$ is the diagonal $x_1=x_2$.
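
A sketch of the answer for the diagonal: reflecting in the line $x_1=x_2$ simply interchanges the two coordinates, so the formula one should find is

$$S\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} x_2 \\ x_1 \end{pmatrix}.$$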

Projection onto a line. Let $V=\mathbb{R}^2$ and let $W$ be a line through the origin in $\mathbb{R}^2$. Consider the map $P:V\to W$ for which $Px$ is the orthogonal projection of $x$ onto $W$. Show that $P$ is a linear transformation.
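
One convenient way to write the projection (introducing a unit vector $u$, which is not part of the problem statement): if $u\in\mathbb{R}^2$ is a unit vector spanning the line $W$, then

$$Px = (x\cdot u)\,u,$$

and linearity of $P$ follows because the dot product $x\cdot u$ depends linearly on $x$.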

Rigid rotation in $\mathbb{R}^3$. In this example, let $V=W=\mathbb{R}^3$, and let $T:\mathbb{R}^3\to\mathbb{R}^3$ be the map defined by first rotating around the $z$-axis over $30^\circ$ and then rotating around the $x$-axis over $45^\circ$. Show that $T:\mathbb{R}^3\to\mathbb{R}^3$ is linear.

We postpone finding a formula for $T\left(\begin{smallmatrix} x_1 \\ x_2 \\x_3 \end{smallmatrix}\right)$ to the next chapter on matrices.

Examples from algebra

Let $a,b,c,d\in\mathbb{F}$ be given numbers in the field $\mathbb{F}$ and consider the map $T:\mathbb{F}^2 \to \mathbb{F}^2$ given by

$$T\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} =\begin{pmatrix} ax_1+bx_2 \\ cx_1+dx_2 \end{pmatrix}$$

for all $\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}\in\mathbb{F}^2$. Show that $T$ is linear. Visualize the map $T$ you get if $a=2$, $b=c=0$, $d=\frac12$.
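
A sketch of the linearity check (using scalars $\lambda,\mu$ to avoid a clash with the given numbers $a,b,c,d$):

$$T\left(\lambda\begin{pmatrix} x_1\\x_2\end{pmatrix}+\mu\begin{pmatrix} y_1\\y_2\end{pmatrix}\right) = \begin{pmatrix} a(\lambda x_1+\mu y_1)+b(\lambda x_2+\mu y_2) \\ c(\lambda x_1+\mu y_1)+d(\lambda x_2+\mu y_2) \end{pmatrix} = \lambda\, T\begin{pmatrix} x_1\\x_2\end{pmatrix}+\mu\, T\begin{pmatrix} y_1\\y_2\end{pmatrix}.$$

For $a=2$, $b=c=0$, $d=\frac12$ the map stretches vectors by a factor $2$ in the $x_1$-direction and compresses them by a factor $\frac12$ in the $x_2$-direction.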

Examples from differential equations

Let $W=\mathcal{F}(\mathbb{R},\mathbb{R})$, and let $V\subset W$ be the subspace of functions that are differentiable. Consider the map $D:V\to W$ given by

$$(Df)(x) = f'(x).$$

Thus $D(e^x) = e^x$, $D(\sin x) = \cos x$, $D(x^2)=2x$, etc.

Verify that $D:V\to W$ is linear.
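
A sketch of the verification, using the sum and constant-multiple rules for derivatives: for differentiable $f,g$ and $a,b\in\mathbb{R}$,

$$\bigl(D(af+bg)\bigr)(x) = (af+bg)'(x) = af'(x)+bg'(x) = a(Df)(x)+b(Dg)(x),$$

so $D(af+bg)=a\,Df+b\,Dg$.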

Null space and range; rank

Definition. If $T:V\to W$ is a linear map, then the null space of $T$ is the set $N(T)=\{x\in V : T(x)=0_W\}$, and the range of $T$ is the set $R(T)=\{T(x) : x\in V\}\subset W$.

The null space is sometimes also called the kernel of the map, and the notation $N(T) = \mathrm{ker}\,T$ is sometimes used.

Theorem. The null space of a linear transformation $T:V\to W$ is a linear subspace of $V$; the range of $T$ is a linear subspace of $W$.

Proof

$N(T)$ is a linear subspace:

Since $T(0_V)=0_W$ we have $0_V\in N(T)$. Moreover, if $x,y\in N(T)$ and $a,b\in\mathbb{F}$, then $T(ax+by)=aT(x)+bT(y)=0$, so $ax+by\in N(T)$. It follows that $N(T)$ is not empty, and closed under addition and scalar multiplication, so that $N(T)$ is a linear subspace.

$R(T)\subset W$ is a linear subspace:

We have $0_W=T(0_V)\in R(T)$. Moreover, if $y_1=T(x_1)$ and $y_2=T(x_2)$ belong to $R(T)$ and $a,b\in\mathbb{F}$, then $ay_1+by_2=T(ax_1+bx_2)\in R(T)$. It follows that $R(T)$ is not empty, and closed under addition and scalar multiplication, so that $R(T)$ is a linear subspace.

Definition. The rank of $T$ is the dimension of the range of $T$.

Injectivity Theorem. A linear map $T:V\to W$ is injective if and only if $N(T)=\{0\}$.

Proof

First we show that $N(T)=\{0\}$ implies that $T$ is injective.

Suppose $N(T)=\{0\}$. To prove that $T$ is injective, we have to show for all $x,y\in V$ that $Tx=Ty$ implies $x=y$.

So let $x,y\in V$ be given with $Tx=Ty$. Then $T(x-y) = Tx - Ty = 0$. Therefore $x-y\in N(T)$. Since $N(T)=\{0\}$ it follows that $x-y=0$, which implies that $x=y$.

Next we show that if $T$ is injective then $N(T)=\{0\}$.

Assume $T$ is injective. We must show that $N(T)$ only contains the zero vector. Suppose $x\in N(T)$. Then $Tx=0$. It is always true that $T(0)=0$. Since $T$ is injective and since $T(x) = T(0)$ we conclude that $x=0$. So $N(T)$ only contains the zero vector.

Rank+Nullity Theorem. If $T:V\to W$ is linear, and if $V$ is finite dimensional, then

$$\dim N(T) + \dim R(T) = \dim V .$$

Proof

Choose a basis $\{v_1, \dots, v_r\}$ of the null space $N(T)$. Then choose vectors $v_{r+1}, \dots, v_n\in V$ so that $\{v_1, \dots, v_r, v_{r+1}, \dots, v_n\}$ is a basis for $V$, and write $w_i = Tv_i$ for $i=r+1,\dots,n$. We will show that $\beta = \{w_{r+1}, \dots, w_n\}$ is a basis for $R(T)$. The rank+nullity formula then follows because we will have shown that $\dim V=n$, $\dim N(T)=r$, and $\dim R(T)=n-r$.

$\beta$ spans $R(T)$: if $y\in R(T)$ then there is an $x\in V$ with $y=Tx$. We can write $x=x_1v_1+\cdots+x_nv_n$. Since $Tv_1=\cdots=Tv_r=0$ we have

$$\begin{aligned} y=Tx&=T(x_1v_1+\cdots+x_nv_n)\\ &=x_{r+1}Tv_{r+1}+\cdots+x_nTv_n \\ &=x_{r+1}w_{r+1} + \cdots +x_nw_n\\ &\in\mathrm{span}(w_{r+1}, \dots, w_n). \end{aligned}$$

$\beta$ is linearly independent: Suppose $c_{r+1}w_{r+1} + \cdots + c_nw_n=0$ for certain $c_{r+1}, \dots, c_n\in\mathbb{F}$. Then

$$T\bigl(c_{r+1}v_{r+1} + \cdots + c_nv_n\bigr) = 0,$$

which implies $c_{r+1}v_{r+1} + \cdots + c_nv_n \in N(T)$. It follows that there are numbers $c_1, \dots, c_r$ such that

$$c_{r+1}v_{r+1} + \cdots + c_nv_n = c_1v_1+\cdots+c_rv_r.$$

Since $\{v_1, \dots, v_n\}$ is a basis for $V$ we conclude that $c_1=\cdots=c_n=0$; in particular $c_{r+1}=\cdots=c_n=0$. Hence $\beta$ is linearly independent.
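
A quick sanity check of the formula: for the projection $P:\mathbb{R}^2\to\mathbb{R}^2$ given by $P(x_1,x_2)=(x_1,0)$, the null space is the $x_2$-axis and the range is the $x_1$-axis, so

$$\dim N(P) + \dim R(P) = 1 + 1 = 2 = \dim \mathbb{R}^2.$$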

Bijectivity Theorem. If $V$ and $W$ are finite dimensional vector spaces with the same dimension, and if $T:V\to W$ is a linear transformation, then the following are equivalent:

  1. $T$ is injective (one-to-one)
  2. $N(T)=\{0\}$
  3. $\mathrm{rank}\,T = \dim V$
  4. $T$ is surjective (onto)

A very important special case is when $V=W$ and $V$ is finite dimensional.

Proof

$1\Longleftrightarrow 2$: this is what the Injectivity Theorem says.

$2\Longleftrightarrow 3$: If $N(T)=\{0\}$ then $\dim N(T)=0$ and the rank+nullity theorem says $\dim V = \dim R(T)+\dim N(T) = \dim R(T) = \mathrm{rank}\, T$. Conversely, if $\mathrm{rank}\,T=\dim V$ then the rank+nullity theorem says that $\dim N(T)=0$, i.e. $N(T)=\{0\}$.

$3\implies 4$: If $\mathrm{rank}\,T=\dim V$ then $R(T)$ is a subspace of $W$ with $\dim R(T)=\dim V=\dim W$. This implies $R(T)=W$, i.e. $T$ is onto.

$4\implies 3$: If $T$ is onto then $R(T)=W$, and thus $\mathrm{rank}\,T=\dim R(T)=\dim W=\dim V$.

Solving linear equations

Let $T:V\to W$ be a linear transformation between vector spaces, and consider the equation

$$Tx=y.$$

Here $y\in W$ is given and $x\in V$ is the unknown. The standard questions are: is there a solution, and if there is a solution, how many? Linear algebra provides the following answers.

Does $Tx=y$ have a solution?

$Tx=y$ has a solution if and only if $y\in R(T)$. This is just the definition of the range of a linear transformation. What does this tell us? It doesn't say we can always solve $Tx=y$, but the set of $y$ for which there are solutions has a nice property: $R(T)$ is a linear subspace of $W$. Therefore, if we can find a solution to $Tx_1=y_1$ and $Tx_2=y_2$, then we can also find a solution to $Tz=c_1y_1+c_2y_2$ (one solution is $z=c_1x_1+c_2x_2$; there might be others).

What is the form of the general solution to $Tx=y$?

Suppose that $x, x'\in V$ both are solutions, i.e. $T(x)=T(x')=y$. Then $T(x-x')=0$, so $x-x'\in N(T)$: the difference between any two solutions lies in the null space of $T$.

Conversely, suppose that $x$ is a solution, i.e. $T(x)=y$, and suppose $u\in N(T)$. Then $T(x+u)=T(x)+T(u)=y+0=y$, i.e. $x+u$ is also a solution. Hence, given a solution $x$ to $T(x)=y$, one can get another solution by adding any vector $u$ in the null space to $x$.

Particular solutions and the homogeneous equation. If $x_p\in V$ is a solution of $T(x)=y$, then every solution of $T(x)=y$ is given by

$$x = x_p+x_h\quad\text{with }x_h\in N(T).$$

In this context the following terminology is very often used: $x_p$ is called a particular solution of $T(x)=y$; the equation $T(x)=0$ is called the homogeneous equation; and any $x_h\in N(T)$ is called a solution of the homogeneous equation.

If $r\stackrel{\rm def}{=}\dim N(T)<\infty$, and if we know a basis $\{u_1, \dots, u_r\}$ for the null space $N(T)$, then every vector $x_h$ in the null space is given by

$$x_h = c_1u_1+\cdots+c_ru_r$$

for certain $c_1, \dots, c_r\in\mathbb{F}$. If we also know a particular solution $x_p$ of $T(x)=y$, then the general solution to the equation $T(x)=y$ (i.e. every solution) is given by

$$x = x_p+c_1u_1+\cdots+c_ru_r$$

where $c_1, \dots, c_r\in\mathbb{F}$ are arbitrary constants.
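
As a small illustration (with made-up numbers): let $T:\mathbb{R}^2\to\mathbb{R}^2$ be given by $T(x_1,x_2)=(x_1-x_2,\,2x_1-2x_2)$ and take $y=(1,2)$. Then $x_p=(1,0)$ is a particular solution, $N(T)$ is spanned by $u_1=(1,1)$, and the general solution is

$$x = \begin{pmatrix}1\\0\end{pmatrix} + c_1\begin{pmatrix}1\\1\end{pmatrix}, \qquad c_1\in\mathbb{R}.$$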

Systems of equations

Consider the linear transformation $A:\mathbb{F}^n\to\mathbb{F}^m$ given by

$$A\begin{bmatrix} x_1\\ \vdots \\x_n \end{bmatrix} \stackrel{\rm def}{=} \begin{bmatrix} a_{11}x_1+\cdots+a_{1n}x_n \\ \vdots \\ a_{m1}x_1+\cdots+a_{mn}x_n \end{bmatrix}$$

For any vector $y = \left[\begin{smallmatrix} y_1\\\vdots\\y_m \end{smallmatrix}\right] \in \mathbb{F}^m$ the equation $Ax=y$ is then equivalent to the system of linear equations for the unknowns $x_1, \dots, x_n$ given by

$$\begin{aligned} a_{11}x_1 + \cdots + a_{1n}x_n \,&= y_1 \\ \vdots\quad& \\ a_{m1}x_1 + \cdots + a_{mn}x_n &= y_m \end{aligned}$$

In other words, we are considering $m$ linear equations with $n$ unknowns.

In this setting the rank+nullity theorem says that $\dim N(A)+\dim R(A)=\dim V$, i.e. $\dim N(A) + \dim R(A) = n$.

More equations than unknowns, i.e. $m>n$

It follows from $\dim N(A) + \dim R(A) = n$ that $\dim R(A) = n-\dim N(A) \leq n < m$. So in this case $R(A)$ is always a proper subspace of $\mathbb{F}^m$: the equation $Ax=y$ does not have a solution for most $y$.

For those $y\in \mathbb{F}^m$ for which the system does have a solution, the general solution contains $r$ arbitrary constants, where $r=\dim N(A) = n-\dim R(A)$.

More unknowns than equations, i.e. $m<n$

Since $R(A)\subset \mathbb{F}^m$, we have $\dim R(A)\leq m$. By the rank+nullity theorem,

$$\dim N(A) = \dim V-\dim R(A) = n-\dim R(A)\geq n-m > 0.$$

So in this case the dimension of the null space is always positive, i.e. there are nonzero solutions to $Ax=0$. For any $y\in \mathbb{F}^m$ one of the following occurs: either $y\notin R(A)$ and the system has no solution at all, or $y\in R(A)$ and the general solution contains $\dim N(A)\geq n-m>0$ arbitrary constants.
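
For instance (one equation in two unknowns, so $m=1<n=2$, over $\mathbb{R}$): the equation $x_1+2x_2=y_1$ is solvable for every $y_1$, the null space of $A(x_1,x_2)=x_1+2x_2$ is spanned by $(-2,1)$, and the general solution is

$$\begin{pmatrix}x_1\\x_2\end{pmatrix} = \begin{pmatrix}y_1\\0\end{pmatrix} + c\begin{pmatrix}-2\\1\end{pmatrix},$$

which contains $1 = n-m$ arbitrary constant.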

As many equations as unknowns, i.e. $m = n$

We again have $\dim N(A) = n-\dim R(A)$.

If $A$ is injective then $N(A)=\{0\}$, and hence $\dim N(A)=0$, so that $\dim R(A) = n$. Since $R(A)\subset \mathbb{F}^n$, this implies that when $A$ is injective, $A$ is also surjective.

If on the other hand $A$ is not injective, then $\dim N(A)>0$ and thus $\dim R(A)<n$. In this situation $R(A)$ is a proper subspace of $\mathbb{F}^n$, and therefore $Ax=y$ does not have a solution for all $y$.

The components of a vector with respect to a basis

Definition. An ordered basis of a vector space $V$ is an ordered list of vectors $\beta = (v_1, \dots, v_n)$ such that $\{v_1, \dots, v_n\}$ is a basis of $V$.

If $\beta = (v_1, \dots, v_n)$ is an ordered basis of $V$ and $x\in V$ is any vector, then there exist $x_1, \dots, x_n\in \mathbb{F}$ such that

$$x = x_1v_1 + \cdots + x_nv_n \,.$$

The numbers $x_1, \dots, x_n$ are called the components of $x$ with respect to the basis $\beta$. These components determine a column vector. In the notation of the textbook:

$$[x]_\beta= \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} \in\mathbb{F}^n.$$

Instead of components, the $x_i$ are sometimes also called the coordinates of $x$.
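
For example, if $\beta=\left(\begin{pmatrix}1\\1\end{pmatrix},\begin{pmatrix}1\\-1\end{pmatrix}\right)$ is an ordered basis of $\mathbb{R}^2$ and $x=\begin{pmatrix}3\\1\end{pmatrix}$, then $x = 2\begin{pmatrix}1\\1\end{pmatrix} + 1\begin{pmatrix}1\\-1\end{pmatrix}$, so

$$[x]_\beta = \begin{pmatrix}2\\1\end{pmatrix}.$$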

Matrix representation of a linear transformation

Let $T:V\to W$ be a linear transformation, and let $\beta = \{v_1, \dots, v_n\}$ be an ordered basis for $V$ and $\gamma = \{w_1, \dots, w_m\}$ an ordered basis for $W$. Each vector $Tv_i$ can be written as a linear combination of $w_1, \dots, w_m$, i.e. there exist numbers $a_{ij}\in\mathbb{F}$ such that

$$Tv_i = a_{1i}w_1 + \cdots + a_{mi}w_m \qquad (i=1, 2, \dots, n).$$

Linearity of $T$ allows us to compute $T(x)$ if we know the coefficients $a_{ij}$ and the components $x_j$ of $x$ in the basis $v_1, \dots, v_n$. Namely, one has:

$$\begin{aligned} Tx &= T(x_1v_1+\cdots +x_nv_n) \\ &=x_1 T(v_1) + \cdots + x_n T(v_n) \\ &=x_1 \left(a_{11}w_1 + \cdots + a_{m1}w_m\right) + \cdots + x_n \left(a_{1n}w_1 + \cdots + a_{mn}w_m\right) \\ &\qquad\text{(rearrange terms)} \\ &=\left(a_{11}x_1+\cdots+a_{1n}x_n\right) w_1 + \cdots + \left(a_{m1}x_1+\cdots+a_{mn}x_n\right) w_m \end{aligned}$$

Thus

$$[Tx]_\gamma = \begin{pmatrix} a_{11}x_1+\cdots+a_{1n}x_n \\ \vdots \\ a_{m1}x_1+\cdots+a_{mn}x_n \end{pmatrix}$$

The coefficients $a_{ij}$ of the linear transformation $T$ with respect to the ordered bases $\beta$ and $\gamma$ form a matrix

$$[T]_\beta^\gamma = \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{pmatrix}$$

which is called the matrix representation of $T$ in the bases $\beta, \gamma$.

Since it turns out to be easy to confuse rows and columns, the following observation may be helpful: the first column $\left(\begin{smallmatrix} a_{11} \\ \vdots \\ a_{m1} \end{smallmatrix}\right)$ of the matrix $[T]_\beta^\gamma$ contains the components of $Tv_1$ expressed in the basis $\{w_1, \dots, w_m\}$.

Example. If $m=3$ and $n=2$, so that $W$ is three dimensional with basis $\gamma=\{w_1, w_2, w_3\}$ and $V$ is two dimensional with basis $\beta=\{v_1, v_2\}$, and if

$$Tv_1 = w_1-3w_2+5w_3, \qquad Tv_2=-w_1,$$

then the matrix of $T$ in these bases is

$$[T]_\beta^\gamma = \begin{pmatrix} 1 & -1 \\ -3 & 0 \\ 5 & 0 \end{pmatrix}.$$

Special case: $A:\mathbb{F}^n\to\mathbb{F}^m$

The vector space $\mathbb{F}^n$ has the standard basis $\{e_1, \dots, e_n\}$. If the matrix of a linear transformation $A:\mathbb{F}^n\to\mathbb{F}^m$ with respect to the standard bases is given by $(a_{ij})$, then $A$ is given by matrix multiplication:

$$Ax = \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots && \vdots \\ a_{m1} & \cdots & a_{mn} \end{pmatrix} \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}$$

Composition of transformations and matrix multiplication

If we have three vector spaces $U, V, W$ and linear transformations $A:V\to W$ and $B:U\to V$, then we can define the composition $AB:U\to W$ by

$$(AB)(x) \stackrel{\rm def}{=} A\bigl(B(x)\bigr) \text{ for all } x\in U.$$

Theorem. If $A:V\to W$ and $B:U\to V$ are linear transformations of vector spaces $U,V,W$, then the composition $AB:U\to W$ is also linear.

The proof is a homework problem.

If we have ordered bases $\alpha=\{u_1, \dots, u_l\}$ for $U$, $\beta=\{v_1,\dots, v_m\}$ for $V$, and $\gamma=\{w_1,\dots, w_n\}$ for $W$, then the matrices of $A$ and $B$ with respect to these bases are

$$[A]_\beta^\gamma = \begin{pmatrix} a_{11} & \cdots & a_{1m} \\ \vdots & & \vdots \\ a_{n1} & \cdots & a_{nm} \end{pmatrix}, \qquad [B]_\alpha^\beta = \begin{pmatrix} b_{11} & \cdots & b_{1l} \\ \vdots & & \vdots \\ b_{m1} & \cdots & b_{ml} \end{pmatrix}$$

where

$$Av_j = a_{1j}w_1+\cdots+a_{nj}w_n,\qquad Bu_k= b_{1k}v_1+\cdots+b_{mk}v_m.$$

To find the matrix of $AB$ with respect to the bases $\alpha, \gamma$ we express $AB(u_k)$ in terms of $\{w_1, \dots, w_n\}$:

$$\begin{aligned} AB(u_k) =&\; A\bigl(Bu_k\bigr) \\ =&\; A\left(b_{1k}v_1+\cdots+b_{mk}v_m \right) \\ =&\; b_{1k}Av_1+\cdots+b_{mk}Av_m \\ =&\; b_{1k}\left\{a_{11}w_1+\cdots+a_{n1}w_n\right\}+\\ &\; b_{2k}\left\{a_{12}w_1+\cdots+a_{n2}w_n\right\}+\\ &\;\quad\vdots\quad+\\ &\; b_{mk}\left\{a_{1m}w_1+\cdots+a_{nm}w_n\right\} \\ =&\; \left\{a_{11}b_{1k} + \cdots + a_{1m}b_{mk}\right\} w_1 + \cdots + \left\{a_{n1}b_{1k} + \cdots + a_{nm}b_{mk}\right\} w_n \end{aligned}$$

This shows that the $k^{\rm th}$ column of the matrix $[AB]_\alpha^\gamma$ is given by

$$\begin{pmatrix} a_{11}b_{1k} + \cdots + a_{1m}b_{mk} \\ \vdots \\ a_{n1}b_{1k} + \cdots + a_{nm}b_{mk} \end{pmatrix}$$

Definition. If $\mathcal A = (a_{ij})$ is an $n\times l$ matrix and $\mathcal B = (b_{jk})$ is an $l\times m$ matrix, then the matrix product $\mathcal A\mathcal B$ is defined to be the $n\times m$ matrix $\mathcal C=(c_{ik})$ whose entries are given by

$$c_{ik} = a_{i1}b_{1k}+\cdots+a_{il}b_{lk} = \sum_{j=1}^l a_{ij}b_{jk} .$$
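
A small numerical instance of this definition (with made-up entries, $n=2$, $l=3$, $m=2$):

$$\begin{pmatrix}1&2&0\\0&1&3\end{pmatrix}\begin{pmatrix}1&0\\2&1\\0&1\end{pmatrix} = \begin{pmatrix}1\cdot1+2\cdot2+0\cdot0 & 1\cdot0+2\cdot1+0\cdot1\\ 0\cdot1+1\cdot2+3\cdot0 & 0\cdot0+1\cdot1+3\cdot1\end{pmatrix} = \begin{pmatrix}5&2\\2&4\end{pmatrix}.$$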

With this definition we have just shown the following

Theorem. $[AB]_\alpha^\gamma = [A]_\beta^\gamma\, [B]_\alpha^\beta$

Example. Let $R(\theta):\mathbb{R}^2\to\mathbb{R}^2$ be rotation through an angle $\theta$. Then the matrix of $R(\theta)$ is given by

$$\mathcal{R} (\theta) = \begin{pmatrix} \cos \theta & -\sin \theta \\ \sin\theta & \cos \theta \end{pmatrix}$$

If we first rotate by $\theta$ and then by $\phi$ we achieve the same as rotating by $\theta+\phi$. This implies

$$\mathcal{R}(\theta+\phi) = \mathcal{R}(\theta)\mathcal{R}(\phi).$$

I.e.

$$\begin{aligned} &\begin{pmatrix} \cos (\theta+\phi) & -\sin (\theta+\phi) \\ \sin(\theta+\phi) & \cos (\theta+\phi) \end{pmatrix}\\ &\qquad= \begin{pmatrix} \cos \theta & -\sin \theta \\ \sin\theta & \cos \theta \end{pmatrix} \begin{pmatrix} \cos \phi & -\sin \phi \\ \sin\phi & \cos \phi \end{pmatrix}\\ &\qquad= \begin{pmatrix} \cos \theta \cos\phi - \sin\theta\sin\phi & -\sin \theta\cos\phi-\cos\theta\sin\phi \\ \sin\theta\cos\phi+\cos\theta\sin\phi & \cos \theta \cos\phi - \sin\theta\sin\phi \end{pmatrix} \end{aligned}$$

Thus we recover the addition formulas for sine and cosine:

$$\begin{aligned} \cos(\theta+\phi) &= \cos \theta \cos\phi - \sin\theta \sin\phi \\ \sin(\theta+\phi) &= \sin\theta\cos\phi+\cos\theta\sin\phi \end{aligned}$$

The vector space $\mathcal{L}(V,W)$

The set of all linear transformations from one vector space $V$ to another $W$ is itself a vector space over the same field $\mathbb{F}$. Addition is defined by saying that for any two linear maps $T,S:V\to W$ one has

$$(T+S)(x) \stackrel{\rm def}{=} T(x)+S(x) \text{ for all } x\in V,$$

and for any linear map $T:V\to W$ and any number $a\in\mathbb{F}$ one has

$$(aT)(x) \stackrel{\rm def}{=} a\bigl(T(x)\bigr) \text{ for all } x\in V.$$

Notation. $\mathcal{L}(V,W)=\bigl\{T \mid T:V\to W \text{ is a linear transformation}\bigr\}$

If $V=W$ then one writes $\mathcal{L}(V)$ instead of $\mathcal{L}(V, W)$.

Theorem. If $T,S:V\to W$ are linear and if $a\in\mathbb{F}$, then the maps $T+S:V\to W$ and $aT:V\to W$ are linear. The set $\mathcal{L}(V, W)$ of linear maps $T:V\to W$ is a vector space.

Proof

To show that $T+S$ is linear, consider

$$\begin{aligned} (T+S)(x+y) &= T(x+y)+S(x+y) \\ &= Tx + Ty+Sx+Sy\\ &= Tx + Sx+ Ty+Sy\\ &= (T+S)(x) + (T+S)(y). \end{aligned}$$

A similar computation shows that $(T+S)(tx) = t(T+S)(x)$ for all $x\in V$ and $t\in\mathbb{F}$. This proves that $T+S$ is linear, i.e. $T+S\in\mathcal{L}(V,W)$.

The same kind of computation shows that $aT\in\mathcal{L}(V,W)$.

Yet more routine computations prove that $\mathcal{L}(V,W)$ satisfies the vector space axioms.

The case in which $V=W$ is special because if $T,S:V\to V$ then not only are $T+S$ and $aT$ linear transformations $V\to V$, but the compositions $ST$ and $TS$ are also linear transformations from $V$ to itself.

Inverses and other powers of a linear transformation

Definition. A linear transformation $T:V\to W$ is called invertible if $T$ is both injective and surjective.

Theorem. If $T:V\to W$ is linear and invertible, then $T^{-1}:W\to V$ is also linear.

Proof

By definition $T^{-1}(y)=x \iff y=T(x)$ for all $x\in V$, $y\in W$. To show that $T^{-1}$ is additive, let $y_1, y_2\in W$ be given, and define $x_1,x_2\in V$ by $x_1=T^{-1}y_1$, $x_2=T^{-1}y_2$. Then

$$\begin{aligned} T(x_1+x_2) = Tx_1 + Tx_2 = y_1+y_2 &\implies x_1+x_2 = T^{-1}(y_1+y_2) \\ &\implies T^{-1}y_1+T^{-1}y_2 = T^{-1}(y_1+y_2). \end{aligned}$$

So $T^{-1}$ is indeed additive.

Similar arguments show that $T^{-1}(ay)=aT^{-1}(y)$ for all $a\in \mathbb{F}$, $y\in W$.

Theorem. If $T:V\to W$ is invertible and $\dim V< \infty$, then $\dim W=\dim V$.

Definition. If $T:V\to V$ is linear, then $T^k$, the $k^{\rm th}$ power of $T$, is defined by

$$T^k = \overbrace{T\cdot T\cdot T\cdots T}^{k\text{ factors}}$$

if $k$ is a positive integer. If $T$ is invertible, then one also defines $T^{-k} = \bigl(T^{-1}\bigr)^k$.

Theorem. $T^{k+l} = T^k T^l$ for all $k,l\in\mathbb{N}$ and all $T\in \mathcal{L}(V)$.

Solving linear equations

If $A:V\to W$ is linear and invertible, then the equation

$$Ax=y$$

has a unique solution $x\in V$ for each $y\in W$. The solution is

$$x=A^{-1}y.$$

Example: compute the matrix of the inverse

Let $A:\mathbb{R}^2\to\mathbb{R}^2$ be given by

$$A\begin{pmatrix} x_1\\x_2\end{pmatrix} = \begin{pmatrix} 1 & 2\\ 2& 5 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$$

Is $A$ invertible, and if it is, compute the matrix of $A^{-1}$.
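
A sketch of one possible solution: solving $Ax=y$ by elimination gives $x_2 = y_2-2y_1$ and then $x_1 = y_1-2x_2 = 5y_1-2y_2$, so $A$ is invertible with

$$A^{-1} = \begin{pmatrix} 5 & -2 \\ -2 & 1\end{pmatrix},$$

which can be checked by verifying that $\begin{pmatrix}1&2\\2&5\end{pmatrix}\begin{pmatrix}5&-2\\-2&1\end{pmatrix} = \begin{pmatrix}1&0\\0&1\end{pmatrix}$.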