Definition. A map T:V→W from one vector space V to another W is called linear if
for all x,y∈V and all a,b∈F one has T(ax+by)=aT(x)+bT(y).
Synonym: a linear map is also called a linear transformation or a linear operator.
In this definition it is assumed that V and W are vector spaces over the
same field F. If the two vector spaces are defined over different
fields then the concept of linear map from V to W is not defined.
Sometimes it is more convenient to split the task of verifying that a certain
map T:V→W is linear into two steps:
Verify that T(x+y)=T(x)+T(y) for all x,y∈V.
Verify that T(ax)=aT(x) for all x∈V and a∈F.
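For concreteness, here is a minimal numerical sketch of the two-step check in Python; the map T below is a made-up example (not one from these notes), and a handful of random tests is evidence rather than a proof.

```python
import numpy as np

# Made-up example map T(x1, x2) = (3*x1 + x2, -x2); we spot-check the two
# linearity conditions on random inputs.
def T(x):
    return np.array([3 * x[0] + x[1], -x[1]])

rng = np.random.default_rng(0)
x, y = rng.standard_normal(2), rng.standard_normal(2)
a = rng.standard_normal()

print(np.allclose(T(x + y), T(x) + T(y)))  # step 1: additivity
print(np.allclose(T(a * x), a * T(x)))     # step 2: homogeneity
```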
Problem. Show that the definition implies that if T:V→W is linear, then T(0V)=0W.
Examples from geometry
Rotations in the plane. Let V=W=R2 and F=R and consider the
map R:R2→R2 defined by counterclockwise rotation by an angle of
θ radians. Show that R is linear. Find a formula for
R(x1,x2).
Reflections in the plane. Again consider V=W=R2, and let
ℓ⊂R2 be some line through the origin. Define S(x) to be the
reflection of x in the line ℓ.
Show that S:R2→R2 is linear, and find a formula for S(x1,x2) in the case where ℓ is the diagonal x1=x2.
Projection onto a line. Let V=R2 and W be a line through the origin in
R2. Consider the map P:V→W for which Px is the orthogonal projection
onto W. Show that P is a linear transformation.
Rigid rotation in R3. In this example, let V=W=R3, and let
T:R3→R3 be the map defined by first rotating around the z-axis over
30° and then rotating around the x-axis over 45°.
Show that T:R3→R3 is linear.
We postpone finding a formula for T(x1,x2,x3) to the next chapter on matrices.
Examples from algebra
Let a,b,c,d∈F be given numbers in the field F and
consider the map T:F2→F2 given by
T\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} ax_1 + bx_2 \\ cx_1 + dx_2 \end{pmatrix}
for all (x1,x2)∈F2. Show that T is linear.
Visualize the map T you get if a=2, b=c=0, d=1/2.
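Assuming the intended value is d=1/2, a short sketch of what this T does: it stretches the x1-direction by a factor 2 and compresses the x2-direction by a factor 1/2, as applying it to a few sample points shows.

```python
import numpy as np

# The map T with a=2, b=c=0, d=1/2, written as a matrix.
T = np.array([[2.0, 0.0],
              [0.0, 0.5]])

for p in ([1.0, 0.0], [0.0, 1.0], [1.0, 1.0]):
    print(p, "->", T @ np.array(p))   # e.g. [1, 1] -> [2.0, 0.5]
```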
Examples from differential equations
Let W=F(R,R), and let V⊂W be the subspace of functions
that are differentiable. Consider the map D:V→W given by
(Df)(x)=f′(x).
Thus D(e^x)=e^x, D(sin x)=cos x, D(x^2)=2x, etc.
Verify that D:V→W is linear.
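One way to see D concretely is to restrict it to polynomials of degree at most 3; this restriction is a map between finite dimensional spaces and is given by a matrix with respect to the basis (1, x, x^2, x^3). The matrix below is my own illustration of that special case, not a description of D on all of V.

```python
import numpy as np

# D restricted to cubics: acts on the coefficient vector (c0, c1, c2, c3)
# of c0 + c1*x + c2*x^2 + c3*x^3 with respect to the basis (1, x, x^2, x^3).
D = np.array([[0., 1., 0., 0.],
              [0., 0., 2., 0.],
              [0., 0., 0., 3.],
              [0., 0., 0., 0.]])

p = np.array([5., 0., 1., 2.])  # coefficients of 5 + x^2 + 2x^3
print(D @ p)                    # [0. 2. 6. 0.], i.e. 2x + 6x^2
```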
Null space and range; rank
Definition. If T:V→W is a linear map, then
the Null space of T is N(T)={x∈V∣T(x)=0}
the Range of T is R(T)={T(x)∣x∈V}={y∈W∣∃x∈V:y=Tx}.
The null space is sometimes also called the kernel of the map and the notation N(T)=kerT is sometimes used.
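For matrix maps both sets can be computed numerically. A sketch with a made-up 2×3 matrix, using scipy.linalg.null_space for N(A) (the range is spanned by the columns of A):

```python
import numpy as np
from scipy.linalg import null_space

# The second row is twice the first, so the columns span a line
# (dim R(A) = 1) and the null space is a plane (dim N(A) = 2).
A = np.array([[1., 2., 3.],
              [2., 4., 6.]])

print(null_space(A).shape[1])     # 2 = dim N(A)
print(np.linalg.matrix_rank(A))   # 1 = dim R(A)
```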
Theorem. The null space of a linear transformation T:V→W is a linear
subspace of V; the range of T is a linear subspace of W.
Proof
N(T) is a linear subspace:
0V∈N(T) because T(0V)=0W
If x,y∈N(T) then T(x+y)=T(x)+T(y)=0+0=0. Hence x+y∈N(T).
If a∈F and x∈N(T) then T(ax)=aT(x)=a⋅0W=0W. So ax∈N(T).
It follows that N(T) is not empty, and closed under addition and scalar multiplication, so that N(T) is a linear subspace.
R(T)⊂W is a linear subspace:
0W∈R(T) because T(0V)=0W
If x,y∈R(T) then there exist u,v∈V with x=T(u), y=T(v). It follows that x+y=T(u)+T(v)=T(u+v). Hence x+y∈R(T).
If a∈F and x∈R(T) then there is a u∈V with x=T(u). It follows that ax=aT(u)=T(au). So ax∈R(T).
It follows that R(T) is not empty, and closed under addition and scalar multiplication, so that R(T) is a linear subspace.
Definition. The rank of T is the dimension of the range of T.
Injectivity Theorem. A linear map T:V→W is injective if and only if N(T)={0}.
Proof
First we show that N(T)={0} implies that T is injective.
Suppose N(T)={0}. To prove that T is injective, we have to show for all
x,y∈V that Tx=Ty implies x=y.
So let x,y∈V be given with Tx=Ty. Then T(x−y)=Tx−Ty=0. Therefore
x−y∈N(T). Since N(T)={0} it follows that x−y=0, which implies that x=y.
Next we show that if T is injective then N(T)={0}.
Assume T is injective. We must show that N(T) only contains the zero
vector. Suppose x∈N(T). Then Tx=0. It is always true that T(0)=0.
Since T is injective and since T(x)=T(0) we conclude that x=0. So
N(T) only contains the zero vector.
Rank+Nullity Theorem. If T:V→W is linear, and if V is finite dimensional, then
dimN(T)+dimR(T)=dimV.
Proof
Choose a basis {v1,…,vr} of the null space N(T). Then choose
vectors vr+1,…,vn∈V so that {v1,…,vr,vr+1,…,vn} is a basis for V. We will show that β={Tvr+1,…,Tvn} is a basis for R(T). The rank+nullity formula then follows because
we will have shown that dimV=n, dimN(T)=r, and dimR(T)=n−r.
β spans R(T): if y∈R(T) then there is an x∈V with y=Tx. We can write x=x1v1+⋯+xnvn. Since Tv1=⋯=Tvr=0 we have
y = Tx = x1Tv1+⋯+xnTvn = xr+1Tvr+1+⋯+xnTvn,
so y is a linear combination of the vectors in β.
β is linearly independent: Suppose cr+1Tvr+1+⋯+cnTvn=0 for certain cr+1,…,cn∈F. Then
T(cr+1vr+1+⋯+cnvn)=0,
which implies cr+1vr+1+⋯+cnvn∈N(T). It follows that there are numbers c1,…,cr such that
cr+1vr+1+⋯+cnvn=c1v1+⋯+crvr.
Since {v1,…,vn} is a basis for V we conclude that c1=⋯=cn=0, and in particular cr+1=⋯=cn=0. Hence β is linearly independent.
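A quick numerical illustration of the formula for a random matrix map (a spot check, not a proof):

```python
import numpy as np
from scipy.linalg import null_space

# For a random A: F^6 -> F^4 we expect rank + nullity = 6.
rng = np.random.default_rng(1)
A = rng.standard_normal((4, 6))

rank = np.linalg.matrix_rank(A)
nullity = null_space(A).shape[1]
print(rank, nullity, rank + nullity == 6)   # typically: 4 2 True
```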
Bijectivity Theorem.
If V and W are finite dimensional vector spaces with the same dimension, and if T:V→W is a linear transformation then the following are equivalent:
T is injective (one-to-one)
N(T)={0}
rankT=dimV
T is surjective (onto)
A very important special case is when V=W and V is finite dimensional.
Proof
1⟺2: this is what the Injectivity Theorem says.
2⟺3: If N(T)={0} then dimN(T)=0 and the rank+nullity
theorem says dimV=dimR(T)+dimN(T)=dimR(T)=rankT. Conversely, if rankT=dimV then the rank+nullity theorem says that dimN(T)=0, i.e. N(T)={0}.
3⟹4: If rankT=dimV then R(T) is a subspace of W with the same dimension as W (because dimW=dimV). This implies R(T)=W, i.e. T is onto.
4⟹3: If T is onto then R(T)=W, and thus rankT=dimR(T)=dimW=dimV.
Solving linear equations
Let T:V→W be a linear transformation between vector spaces, and consider
the equation
Tx=y.
Here y∈W is given and x∈V is the unknown. The standard questions are
is there a solution? and if there is a solution, how many? Linear algebra
provides the following answers:
Does Tx=y have a solution?
Tx=y has a solution if and only if y∈R(T). This is the definition of the range of a linear transformation. What does this tell us? It doesn't say we can always solve Tx=y, but the set of y for which there are solutions has a nice property — R(T) is a linear subspace of W. Therefore, if we can find a solution to Tx1=y1 and Tx2=y2 then we can also find a solution to Tz=c1y1+c2y2 (one solution is z=c1x1+c2x2 — there might be others).
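For matrix maps the condition y∈R(A) can be tested numerically: y lies in the column space exactly when appending y as an extra column does not raise the rank. A sketch with a made-up matrix:

```python
import numpy as np

# The columns of A span a plane in R^3; y_good lies in that plane
# (y_good = 1*col1 + 2*col2), y_bad does not.
A = np.array([[1., 0.],
              [0., 1.],
              [1., 1.]])
y_good = np.array([1., 2., 3.])
y_bad  = np.array([1., 0., 0.])

for y in (y_good, y_bad):
    solvable = (np.linalg.matrix_rank(np.column_stack([A, y]))
                == np.linalg.matrix_rank(A))
    print(solvable)   # True, then False
```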
What is the form of the general solution to Tx=y?
Suppose that x,x′∈V both are solutions, i.e. T(x)=T(x′)=y. Then
T(x−x′)=0, so x−x′∈N(T): the difference between any two
solutions lies in the null space of T.
Conversely, suppose that x is a solution, i.e. T(x)=y, and suppose
u∈N(T). Then T(x+u)=T(x)+T(u)=y+0=y, i.e. x+u is also a solution. Hence, given a solution x to T(x)=y one can get another solution by adding any vector u in the null space to x.
Particular solutions and the homogeneous equation. If xp∈V is a
solution of T(x)=y, then every solution of T(x)=y is given by
x = xp + xh with xh∈N(T).
In this context the following terminology is very often used:
T(x)=y is the inhomogeneous equation
T(x)=0 is the homogeneous equation
xp is a particular solution
xh∈N(T) is the general solution of the homogeneous equation
If r=defdimN(T)<∞, and if we know a basis {u1,…,ur} for the null space N(T), then every vector xh in the null space is
given by
xh=c1u1+⋯+crur
for certain c1,…,cr∈F. If we also know a particular solution xp of T(x)=y then the general solution to the equation T(x)=y (i.e. every solution) is given by
x=xp+c1u1+⋯+crur
where c1,…,cr∈F are arbitrary constants.
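A computational sketch of this recipe for a made-up underdetermined system: scipy's lstsq supplies a particular solution, null_space supplies a basis of N(A), and any choice of the constants gives another solution.

```python
import numpy as np
from scipy.linalg import lstsq, null_space

A = np.array([[1., 1., 0.],
              [0., 0., 1.]])
y = np.array([2., 5.])

x_p = lstsq(A, y)[0]           # a particular solution x_p
U = null_space(A)              # columns u1, ..., ur span N(A); here r = 1
c = np.array([3.0])            # arbitrary constants c1, ..., cr

x = x_p + U @ c                # the general solution
print(np.allclose(A @ x, y))   # True: still a solution
```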
Systems of equations
Consider the linear transformation A:Fn→Fm given by an m×n matrix (aij), acting by matrix multiplication:

A(x) = \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{pmatrix} \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}.

For any vector y = \begin{pmatrix} y_1 \\ \vdots \\ y_m \end{pmatrix} ∈Fm the equation Ax=y is then
equivalent with the system of linear equations for the unknowns x1,…,xn given by

\begin{aligned} a_{11}x_1 + \cdots + a_{1n}x_n &= y_1 \\ &\;\;\vdots \\ a_{m1}x_1 + \cdots + a_{mn}x_n &= y_m \end{aligned}
In other words, we are considering m linear equations with n unknowns.
In this setting the rank+nullity theorem says that dimN(A)+dimR(A)=dimV, i.e. dimN(A)+dimR(A)=n.
More equations than unknowns, i.e. m>n
It follows from dimN(A)+dimR(A)=n that dimR(A)=n−dimN(A)≤n<m. So in this case R(A) is always a proper subspace of Fm: the
equation Ax=y does not have a solution for most y.
For those y∈Fm for which the system does have a solution the general solution
contains r constants, where r=dimN(A)=n−dimR(A).
More unknowns than equations, i.e. m<n
Since R(A)⊂Fm, we have dimR(A)≤m.
By the rank+nullity theorem,
dimN(A)=dimV−dimR(A)=n−dimR(A)≥n−m>0.
So in this case the dimension of the null space always is positive, i.e. there are nonzero solutions to Ax=0. For any y∈Fm one of the following occurs:
y∉R(A) and there is no solution
y∈R(A) and the general solution contains r free constants, where
r=dimN(A)>0.
As many equations as unknowns, i.e. m=n
We again have dimN(A)=n−dimR(A).
If A is injective then N(A)={0}, and hence dimN(A)=0, so that dimR(A)=n. Since R(A)⊂Fn, this implies that when A is injective, A also is surjective.
If on the other hand A is not injective then dimN(A)>0 and thus dimR(A)<n. In this situation R(A) is a proper subspace of Fn and therefore Ax=y does not have a solution for all y.
The components of a vector with respect to a basis
Definition. An ordered basis of a vector space V is an ordered list
of vectors β=(v1,…,vn) such that {v1,…,vn} is a
basis of V.
If β=(v1,…,vn) is an ordered basis of V and x∈V is any vector then there exist x1,…,xn∈F such that
x=x1v1+⋯+xnvn.
The numbers x1,…,xn are called the components of x with respect to the basis β. These components determine a column vector. In the notation of the textbook:

[x]_\beta = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} \in F^n.
Instead of components, the xi are sometimes also called the coordinates of x.
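In F^n the components can be computed by solving a linear system: if the basis vectors are stored as the columns of a matrix B, then [x]_β solves B[x]_β = x. A small made-up example:

```python
import numpy as np

# Basis beta = (v1, v2) of R^2 with v1 = (1, 1), v2 = (1, -1),
# stored as the columns of B.
B = np.array([[1.,  1.],
              [1., -1.]])
x = np.array([3., 1.])

print(np.linalg.solve(B, x))   # [2. 1.]: x = 2*v1 + 1*v2
```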
Matrix representation of a linear transformation
Let T:V→W be a linear transformation, and let β=(v1,…,vn)
be an ordered basis for V and γ=(w1,…,wm) an ordered basis
for W. Each vector Tvi can be written as a linear combination of w1,…,wm, i.e. there exist numbers aij∈F such that
Tvi = a1iw1+⋯+amiwm (i=1,2,…,n).
Linearity of T allows us to compute T(x) if we know the coefficients
aij and the components xj of x in the basis v1,…,vn. Namely
one has:

T(x) = T(x_1v_1 + \cdots + x_nv_n) = \sum_{j=1}^{n} x_j\,Tv_j = \sum_{i=1}^{m}\Bigl(\sum_{j=1}^{n} a_{ij}x_j\Bigr) w_i.
The coefficients aij of the linear transformation T with respect to the
ordered bases β and γ form a matrix
[T]_\beta^\gamma = \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{pmatrix}
which is called the matrix representation of T in the bases β,γ.
Since it turns out to be easy to confuse rows and columns the following
observation may be helpful: the first column
\begin{pmatrix} a_{11} \\ \vdots \\ a_{m1} \end{pmatrix}
of the matrix [T]_\beta^\gamma contains the components of Tv1
expressed in the basis {w1,…,wm}.
Example. If m=3 and n=2, so that W is three dimensional with basis
γ={w1,w2,w3} and V is two dimensional with basis β={v1,v2}, and if
Tv1 = w1 − 3w2 + 5w3,  Tv2 = −w1,
then the matrix of T in these bases is
[T]_\beta^\gamma = \begin{pmatrix} 1 & -1 \\ -3 & 0 \\ 5 & 0 \end{pmatrix}.
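As a sanity check one can apply this matrix to the β-components of the basis vectors themselves; the i-th column comes back, i.e. the γ-components of Tv_i:

```python
import numpy as np

# The matrix of T from the example; [v1]_beta = (1, 0), [v2]_beta = (0, 1).
T = np.array([[ 1., -1.],
              [-3.,  0.],
              [ 5.,  0.]])

print(T @ np.array([1., 0.]))   # [ 1. -3.  5.] = components of Tv1
print(T @ np.array([0., 1.]))   # [-1.  0.  0.] = components of Tv2
```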
Special case: A:Fn→Fm
The vector space Fn has the standard basis {e1,…,en}. If the
matrix of a linear transformation A:Fn→Fm with respect to the standard bases is given by (aij), then
A is given by matrix multiplication:

Ax = \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{pmatrix} \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}.
Composition of transformations and matrix multiplication
If we have three vector spaces U,V,W and linear transformations A:V→W and B:U→V then we can define the composition AB:U→W by
(AB)(x)=defA(B(x)) for all x∈U.
Theorem. If A:V→W and B:U→V are linear transformations of vector spaces U,V,W, then the composition AB:U→W is also linear.
The proof is a homework problem.
If we have ordered bases α=(u1,…,ul) for U, β=(v1,…,vm) for V, and γ=(w1,…,wn) for W then the matrices of A and B with respect to these bases are [A]_β^γ=(aij), an n×m matrix, and [B]_α^β=(bjk), an m×l matrix. Applying the composition to a basis vector uk, linearity gives

(AB)(u_k) = A\Bigl(\sum_{j=1}^{m} b_{jk} v_j\Bigr) = \sum_{j=1}^{m} b_{jk}\,A(v_j) = \sum_{i=1}^{n}\Bigl(\sum_{j=1}^{m} a_{ij} b_{jk}\Bigr) w_i,

so the k-th column of the matrix of AB has entries \sum_j a_{ij}b_{jk}. This motivates the following definition.
Definition. If A=(aij) is an n×l matrix and B=(bjk) is an l×m matrix then the matrix product AB is defined to be the n×m matrix C=(cij) whose entries are given by

c_{ij} = a_{i1}b_{1j} + \cdots + a_{il}b_{lj} = \sum_{k=1}^{l} a_{ik}b_{kj}.
With this definition we have just shown the following
Theorem. [AB]_α^γ = [A]_β^γ [B]_α^β
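A numerical spot-check of this theorem for matrix maps, with made-up random matrices: applying B and then A to a vector agrees with applying the single product matrix.

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 3))   # matrix of A : V -> W
B = rng.standard_normal((3, 5))   # matrix of B : U -> V
x = rng.standard_normal(5)

# Composing the maps and multiplying the matrices give the same result.
print(np.allclose(A @ (B @ x), (A @ B) @ x))   # True
```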
Example. Let R(θ):R2→R2 be rotation through an angle θ. Then the matrix of R(θ) with respect to the standard basis is given by

R(θ) = \begin{pmatrix} \cos θ & -\sin θ \\ \sin θ & \cos θ \end{pmatrix}
If we first rotate by θ and then by ϕ we achieve the same as rotating by θ+ϕ. This implies

R(ϕ)R(θ) = R(θ+ϕ).

Comparing the matrix entries on both sides yields the angle addition formulas

\cos(θ+ϕ) = \cos θ \cos ϕ − \sin θ \sin ϕ, \qquad \sin(θ+ϕ) = \sin θ \cos ϕ + \cos θ \sin ϕ.
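A quick numerical confirmation of R(ϕ)R(θ)=R(θ+ϕ) for sample angles:

```python
import numpy as np

def R(t):
    # Rotation matrix for the angle t (in radians).
    return np.array([[np.cos(t), -np.sin(t)],
                     [np.sin(t),  np.cos(t)]])

theta, phi = 0.3, 1.1
print(np.allclose(R(phi) @ R(theta), R(theta + phi)))   # True
```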
The vector space of linear transformations
The set of all linear transformations from one vector space V to another W is itself a vector space over the same field F. Addition is defined by saying that for any two linear maps T,S:V→W one has
(T+S)(x)=defT(x)+S(x), for all x∈V,
and for any linear map T:V→W and any number a∈F one has
(aT)(x)=defa(T(x)) for all x∈V.
Notation. L(V,W)={T∣T:V→W is a linear transformation}
If V=W then one writes L(V) instead of L(V,W).
Theorem. If T,S:V→W are linear and if a∈F then the maps T+S:V→W and aT:V→W are linear. Consequently the set L(V,W) of linear maps T:V→W is itself a vector space.
Proof
For any x,y∈V one has
(T+S)(x+y) = T(x+y)+S(x+y) = T(x)+T(y)+S(x)+S(y) = (T+S)(x)+(T+S)(y).
A similar computation shows that (T+S)(tx)=t(T+S)(x) for all x∈V and t∈F.
This proves that T+S is linear, i.e. T+S∈L(V,W).
The same kind of computations also show that aT∈L(V,W).
Yet more routine computations prove that L(V,W) satisfies the vector space axioms.
The case in which V=W is special because if T,S:V→V then not only are T+S and aT linear transformations V→V, but the compositions ST and TS also are linear transformations from V to itself.
Inverses and other powers of a linear transformation
Definition. A linear transformation T:V→W is called invertible if T is both injective and surjective.
Theorem. If T:V→W is linear and invertible, then T−1:W→V is also linear.
Proof
By definition T−1(y)=x⟺y=T(x) for all x∈V, y∈W. To show that T−1 is additive, let y1,y2∈W be given, and define x1,x2∈V by x1=T−1y1, x2=T−1y2. Then linearity of T gives T(x1+x2)=T(x1)+T(x2)=y1+y2, and therefore T−1(y1+y2)=x1+x2=T−1(y1)+T−1(y2). A similar computation shows that T−1(ay)=aT−1(y) for all y∈W and a∈F, so T−1 is linear.
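For a matrix map the inverse is given by the inverse matrix, so its linearity can also be checked numerically; a sketch with a made-up invertible 2×2 matrix:

```python
import numpy as np

T = np.array([[2., 1.],
              [1., 1.]])   # invertible since det T = 1
T_inv = np.linalg.inv(T)

y1, y2 = np.array([1., 0.]), np.array([0., 1.])
print(np.allclose(T_inv @ (y1 + y2), T_inv @ y1 + T_inv @ y2))  # additivity
print(np.allclose(T_inv @ (5. * y1), 5. * (T_inv @ y1)))        # homogeneity
```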