Linear transformations

In this section, the symbols \(\VV\) and \(\WW\) represent arbitrary vector spaces over a field \(\FF\). Unless specified otherwise, the two vector spaces are not related in any way.

The following results can be restated for more general situations where \(\VV\) and \(\WW\) are defined over different fields, but for simplicity of discourse we assume that they are defined over the same field \(\FF\).

Definition

We call a map \(\TT : \VV \to \WW\) a linear transformation from \(\VV\) to \(\WW\) if for all \(x, y \in \VV\) and \(\alpha \in \FF\), we have

  • \(\TT(x + y) = \TT(x) + \TT(y)\) and
  • \(\TT(\alpha x) = \alpha \TT(x)\)

A linear transformation is also known as a linear map or a linear operator. Usually, when the domain (\(\VV\)) and co-domain (\(\WW\)) of a linear transformation are the same, the term linear operator is used.
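
As a quick concrete check, the following sketch in Python with NumPy (the matrix \(A\), the test vectors, and the scalar are illustrative choices, not from the text) verifies both defining conditions for the map \(x \mapsto A x\):

```python
import numpy as np

# Illustrative matrix; any map of the form T(x) = A x is linear.
A = np.array([[1.0, 2.0, 0.0],
              [0.0, -1.0, 3.0]])

def T(x):
    return A @ x

rng = np.random.default_rng(0)
x, y = rng.standard_normal(3), rng.standard_normal(3)
alpha = 2.5

assert np.allclose(T(x + y), T(x) + T(y))        # additivity
assert np.allclose(T(alpha * x), alpha * T(x))   # homogeneity
```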

Remark
If \(\TT\) is linear then \(\TT(0) = 0\).

This is straightforward since

\[\TT(0 + 0) = \TT(0) + \TT(0) \implies \TT(0) = \TT(0) + \TT(0) \implies \TT(0) = 0.\]
Lemma
\(\TT\) is linear \(\iff \TT(\alpha x + y) = \alpha \TT(x) + \TT(y) \Forall x, y \in \VV, \alpha \in \FF\).
Proof

Assuming \(\TT\) to be linear we have

\[\TT(\alpha x + y) = \TT(\alpha x) + \TT(y) = \alpha \TT(x) + \TT(y).\]

Now for the converse, assume

\[\TT(\alpha x + y) = \alpha \TT(x) + \TT(y) \Forall x, y \in \VV, \alpha \in \FF.\]

Choosing \(x = y = 0\) and \(\alpha = 1\), we get

\[\TT(0) = \TT(0 + 0) = \TT(0) + \TT(0) \implies \TT(0) = 0.\]

Choosing \(y = 0\), we get

\[\TT(\alpha x) = \TT(\alpha x + 0) = \alpha \TT(x) + \TT(0) = \alpha \TT(x).\]

Choosing \(\alpha = 1\) we get

\[\TT(x + y) = \TT(x) + \TT(y).\]

Thus \(\TT\) is a linear transformation.

Remark
If \(\TT\) is linear then \(\TT(x - y) = \TT(x) - \TT(y)\), since
\[\TT(x - y) = \TT(x + (-1)y) = \TT(x) + \TT((-1)y) = \TT(x) +(-1)\TT(y) = \TT(x) - \TT(y).\]
Remark

\(\TT\) is linear \(\iff\) for \(x_1, \dots, x_n \in \VV\) and \(\alpha_1, \dots, \alpha_n \in \FF\),

\[\TT\left (\sum_{i=1}^{n} \alpha_i x_i \right ) = \sum_{i=1}^{n} \alpha_i \TT(x_i).\]

This can be proved by mathematical induction on \(n\), using the two defining properties for the base and inductive steps.
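
As a numerical illustration of the \(n\)-term identity (a sketch; the matrix, vectors, and scalars are arbitrary illustrative choices):

```python
import numpy as np

# T(x) = A x applied to a 5-term linear combination.
rng = np.random.default_rng(1)
A = rng.standard_normal((2, 4))
xs = [rng.standard_normal(4) for _ in range(5)]
alphas = rng.standard_normal(5)

lhs = A @ sum(a * x for a, x in zip(alphas, xs))    # T(sum_i alpha_i x_i)
rhs = sum(a * (A @ x) for a, x in zip(alphas, xs))  # sum_i alpha_i T(x_i)
assert np.allclose(lhs, rhs)
```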

A few special linear transformations deserve mention.

Definition

The identity transformation \(\mathrm{I}_{\VV} : \VV \to \VV\) is defined as

\[\mathrm{I}_{\VV}(x) = x, \Forall x \in \VV.\]
Definition

The zero transformation \(\mathrm{0} : \VV \to \WW\) is defined as

\[0(x) = 0, \Forall x \in \VV.\]

In this definition the symbol \(0\) is overloaded: on the left-hand side it denotes a linear transformation from \(\VV\) to \(\WW\) which maps every vector in \(\VV\) to the \(0\) vector in \(\WW\), while on the right-hand side it denotes that zero vector itself.

From the context it should usually be obvious whether we are talking about \(0 \in \FF\), \(0 \in \VV\), \(0 \in \WW\), or \(0\) as a linear transformation from \(\VV\) to \(\WW\).

Null space and range

Definition

The null space or kernel of a linear transformation \(\TT : \VV \to \WW\) denoted by \(\NullSpace(\TT)\) or \(\Kernel(\TT)\) is defined as

\[\Kernel(\TT) = \NullSpace(\TT) \triangleq \{ x \in \VV : \TT(x) = 0\}\]
Theorem
The null space of a linear transformation \(\TT : \VV \to \WW\) is a subspace of \(\VV\).
Proof

Since \(\TT(0) = 0\), we have \(0 \in \Kernel(\TT)\), so the kernel is nonempty. Let \(v_1, v_2 \in \Kernel(\TT)\) and \(\alpha \in \FF\). Then

\[\TT(\alpha v_1 + v_2) = \alpha \TT(v_1) + \TT(v_2) = \alpha 0 + 0 = 0.\]

Thus \(\alpha v_1 + v_2 \in \Kernel(\TT)\), and hence \(\Kernel(\TT)\) is a subspace of \(\VV\).
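
For maps of the form \(\TT(x) = A x\), a basis of the kernel can be computed numerically from the singular value decomposition. A minimal sketch with NumPy (the rank-1 matrix is an illustrative choice):

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])   # rank 1, so the kernel is 2-dimensional

# Rows of Vt beyond the numerical rank span the null space of A.
U, s, Vt = np.linalg.svd(A)
tol = max(A.shape) * np.finfo(float).eps * s.max()
rank = int(np.sum(s > tol))
kernel_basis = Vt[rank:].T          # columns form a basis of Kernel(T)

assert np.allclose(A @ kernel_basis, 0.0)
```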

Definition

The range or image of a linear transformation \(\TT : \VV \to \WW\) denoted by \(\Range(\TT)\) or \(\Image(\TT)\) is defined as

\[\Range(\TT) = \Image(\TT) \triangleq \{\TT(x) : x \in \VV \}.\]

We note that \(\Image(\TT) \subseteq \WW\).

Theorem
The image of a linear transformation \(\TT : \VV \to \WW\) is a subspace of \(\WW\).
Proof

Let \(w_1, w_2 \in \Image(\TT)\) and \(\alpha \in \FF\). Then there exist \(v_1, v_2 \in \VV\) such that

\[w_1 = \TT(v_1); w_2 = \TT(v_2).\]

Thus

\[\alpha w_1 + w_2 = \alpha \TT(v_1) + \TT(v_2) = \TT(\alpha v_1 + v_2).\]

Thus \(\alpha w_1 + w_2 \in \Image(\TT)\). Hence \(\Image(\TT)\) is a subspace of \(\WW\).

Theorem

Let \(\TT : \VV \to \WW\) be a linear transformation. Let \(\mathcal{B} = \{v_1, v_2, \dots, v_n\}\) be some basis of \(\VV\). Then

\[\Image(\TT) = \langle \TT(\mathcal{B}) \rangle = \langle\{\TT(v_1), \TT(v_2), \dots, \TT(v_n) \} \rangle.\]

i.e., the image of a basis of \(\VV\) under a linear transformation \(\TT\) spans the range of the transformation.

Proof

Let \(w\) be some arbitrary vector in \(\Image(\TT)\). Then there exists \(v \in \VV\) such that \(w = \TT(v)\). Now

\[v = \sum_{i=1}^n c_i v_i\]

since \(\mathcal{B}\) forms a basis for \(\VV\).

Thus

\[w = \TT(v) = \TT\left(\sum_{i=1}^n c_i v_i\right) = \sum_{i=1}^n c_i \TT(v_i).\]

This means that \(w \in \langle \TT(\mathcal{B}) \rangle\), so \(\Image(\TT) \subseteq \langle \TT(\mathcal{B}) \rangle\). Conversely, each \(\TT(v_i)\) lies in \(\Image(\TT)\), which is a subspace, hence \(\langle \TT(\mathcal{B}) \rangle \subseteq \Image(\TT)\).
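
For \(\TT(x) = A x\) with the standard basis, the vectors \(\TT(e_i)\) are exactly the columns of \(A\), so the theorem reduces to the familiar fact that every \(A v\) is a linear combination of the columns of \(A\). A small sketch (matrix and vector are illustrative):

```python
import numpy as np

A = np.array([[1.0, 0.0, 2.0],
              [0.0, 1.0, -1.0],
              [1.0, 1.0, 1.0]])

images = [A @ e for e in np.eye(3)]   # T(e_1), T(e_2), T(e_3): the columns of A
v = np.array([3.0, -2.0, 0.5])
w = A @ v                             # an arbitrary element of Image(T)

# w is recovered as the combination sum_i c_i T(e_i) with c = v.
assert np.allclose(sum(c * img for c, img in zip(v, images)), w)
```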

Definition

For vector spaces \(\VV\) and \(\WW\) and linear \(\TT : \VV \to \WW\), if \(\Kernel(\TT)\) is finite dimensional, then the nullity of \(\TT\) is defined as

\[\Nullity(\TT) = \dim \Kernel(\TT)\]

i.e. the dimension of the null space or kernel of \(\TT\).

Definition

For vector spaces \(\VV\) and \(\WW\) and linear \(\TT : \VV \to \WW\), if \(\Image(\TT)\) is finite dimensional, then the rank of \(\TT\) is defined as

\[\Rank(\TT) = \dim \Image(\TT)\]

i.e. the dimension of the range or image of \(\TT\).

Theorem

For vector spaces \(\VV\) and \(\WW\) and linear \(\TT : \VV \to \WW\), if \(\VV\) is finite dimensional, then

\[\dim \VV = \Nullity(\TT) + \Rank(\TT).\]

This is known as the dimension theorem (also called the rank–nullity theorem).
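
A numerical check of the dimension theorem for a matrix map (a sketch assuming NumPy and SciPy are available; the random low-rank matrix is illustrative):

```python
import numpy as np
from scipy.linalg import null_space

# T(x) = A x with dim V = 5; the product construction makes rank(A) <= 2.
rng = np.random.default_rng(2)
A = rng.standard_normal((4, 2)) @ rng.standard_normal((2, 5))

rank = np.linalg.matrix_rank(A)       # dim Image(T)
nullity = null_space(A).shape[1]      # dim Kernel(T)

assert rank + nullity == A.shape[1]   # dim V = 5
```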

Theorem
For vector spaces \(\VV\) and \(\WW\) and linear \(\TT : \VV \to \WW\), \(\TT\) is one-one if and only if \(\Kernel(\TT) = \{ 0\}\).
Proof

If \(\TT\) is one-one, then

\[v_1 \neq v_2 \implies \TT(v_1) \neq \TT(v_2).\]

Let \(v \neq 0\). Since \(\TT(0) = 0\) and \(\TT\) is one-one, we have \(\TT(v) \neq \TT(0) = 0\). Thus no nonzero vector belongs to the kernel, i.e. \(\Kernel(\TT) = \{ 0\}\).

For the converse, assume that \(\Kernel(\TT) = \{ 0\}\). Let \(v_1, v_2 \in \VV\) be such that

\[\begin{split}&\TT(v_1) = \TT(v_2) \\ \implies &\TT(v_1 - v_2) = 0 \\ \implies &v_1 - v_2 \in \Kernel(\TT)\\ \implies &v_1 - v_2 = 0 \\ \implies &v_1 = v_2.\end{split}\]

Thus \(\TT\) is one-one.
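
For a matrix map \(\TT(x) = A x\), this criterion is easy to test numerically: the kernel is trivial exactly when \(A\) has full column rank. A sketch with an illustrative matrix:

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])   # full column rank, so Kernel(T) = {0}

is_one_one = np.linalg.matrix_rank(A) == A.shape[1]
assert is_one_one
```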

Theorem

For vector spaces \(\VV\) and \(\WW\) of equal finite dimensions and linear \(\TT : \VV \to \WW\), the following are equivalent.

  1. \(\TT\) is one-one.
  2. \(\TT\) is onto.
  3. \(\Rank(\TT) = \dim (\VV)\).
Proof

From (1) to (2):

Let \(\mathcal{B} = \{v_1, v_2, \dots v_n \}\) be some basis of \(\VV\) with \(\dim \VV = n\).

Let us assume, for the sake of contradiction, that the vectors in \(\TT(\mathcal{B})\) are linearly dependent. Then there exists a linear relationship

\[\sum_{i=1}^{n}\alpha_i \TT(v_i) = 0\]

where \(\alpha_i\) are not all 0.

Now

\[\begin{split}&\sum_{i=1}^{n}\alpha_i \TT(v_i) = 0 \\ \implies &\TT\left(\sum_{i=1}^{n}\alpha_i v_i\right) = 0\\ \implies &\sum_{i=1}^{n}\alpha_i v_i \in \Kernel(\TT)\\ \implies &\sum_{i=1}^{n}\alpha_i v_i = 0\end{split}\]

since \(\TT\) being one-one implies \(\Kernel(\TT) = \{0\}\). But this would mean that the \(v_i\) are linearly dependent, contradicting our assumption that \(\mathcal{B}\) is a basis for \(\VV\).

Thus the vectors in \(\TT(\mathcal{B})\) are linearly independent.

Since \(\TT\) is one-one, the vectors in \(\TT(\mathcal{B})\) are all distinct, hence

\[| \TT(\mathcal{B}) | = n.\]

Since the vectors in \(\TT(\mathcal{B})\) span \(\Image(\TT)\) and are linearly independent, they form a basis of \(\Image(\TT)\). But

\[\dim \VV = \dim \WW = n\]

and \(\TT(\mathcal{B})\) is a set of \(n\) linearly independent vectors in \(\WW\).

Hence \(\TT(\mathcal{B})\) forms a basis of \(\WW\). Thus

\[\Image(\TT) = \langle \TT(\mathcal{B}) \rangle = \WW.\]

Thus \(\TT\) is onto.

From (2) to (3): \(\TT\) being onto means \(\Image(\TT) = \WW\), thus

\[\Rank(\TT) = \dim \WW = \dim \VV.\]

From (3) to (1): We know that

\[\dim \VV = \Rank(\TT) + \Nullity(\TT).\]

But it is given that \(\Rank(\TT) = \dim \VV\). Thus

\[\Nullity(\TT) = 0.\]

Thus \(\Kernel(\TT) = \{0\}\) and, by the previous theorem, \(\TT\) is one-one.

Bracket operator

Recall the definition of the coordinate vector of a vector relative to an ordered basis. Conversion of a given vector to its coordinate vector representation can be shown to be a linear transformation.

Definition

Let \(\VV\) be a finite dimensional vector space over a field \(\FF\) where \(\dim \VV = n\). Let \(\BBB = \{ v_1, \dots, v_n\}\) be an ordered basis in \(\VV\). We define a bracket operator from \(\VV\) to \(\FF^n\) as

\[\begin{split}\begin{aligned} \Bracket_{\BBB} : \; &\VV \to \FF^n\\ & x \mapsto [x]_{\BBB} \triangleq \begin{bmatrix} \alpha_1\\ \vdots\\ \alpha_n \end{bmatrix} \end{aligned}\end{split}\]

where

\[x = \sum_{i=1}^n \alpha_i v_i.\]

In other words, the bracket operator takes a vector \(x\) from a finite dimensional space \(\VV\) to its representation in \(\FF^n\) for a given ordered basis \(\BBB\).
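
When \(\VV = \FF^n\), computing \([x]_{\BBB}\) amounts to solving a linear system: stacking the basis vectors as the columns of a matrix \(B\), the coordinates solve \(B a = x\). A minimal sketch (the basis is an illustrative choice):

```python
import numpy as np

# Columns of B are the ordered basis vectors v_1, v_2, v_3 (illustrative).
B = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 1.0]])

def bracket(x):
    # Unique solution since the basis vectors are linearly independent.
    return np.linalg.solve(B, x)

x = np.array([2.0, 3.0, 4.0])
alphas = bracket(x)
assert np.allclose(B @ alphas, x)   # x = sum_i alpha_i v_i
```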

We now show that the bracket operator is linear.

Theorem

Let \(\VV\) be a finite dimensional vector space over a field \(\FF\) where \(\dim \VV = n\). Let \(\BBB = \{ v_1, \dots, v_n\}\) be an ordered basis in \(\VV\). The bracket operator \(\Bracket_{\BBB} : \VV \to \FF^n\) defined above is a linear transformation.

Moreover \(\Bracket_{\BBB}\) is a one-one and onto mapping.

Proof

Let \(x, y \in \VV\) be such that

\[x = \sum_{i=1}^n \alpha_i v_i\]

and

\[y = \sum_{i=1}^n \beta_i v_i.\]

Then, for any \(c \in \FF\),

\[c x + y = c \sum_{i=1}^n \alpha_i v_i + \sum_{i=1}^n \beta_i v_i = \sum_{i=1}^n (c \alpha_i + \beta_i ) v_i.\]

Thus

\[\begin{split}[c x + y]_{\BBB} = \begin{bmatrix} c \alpha_1 + \beta_1 \\ \vdots\\ c \alpha_n + \beta_n \end{bmatrix} = c \begin{bmatrix} \alpha_1 \\ \vdots\\ \alpha_n \end{bmatrix} + \begin{bmatrix} \beta_1 \\ \vdots\\ \beta_n \end{bmatrix} = c [x]_{\BBB} + [y]_{\BBB}.\end{split}\]

Thus \(\Bracket_{\BBB}\) is linear.

By definition, \(\Bracket_{\BBB}\) is one-one, since the representation of a vector in an ordered basis is unique. Moreover, since \(\dim \VV = n = \dim \FF^n\), \(\Bracket_{\BBB}\) is also onto by the equivalence theorem above.

Matrix representations

It is much easier to work with a matrix representation of a linear transformation. In this section we describe how matrix representations of a linear transformation are developed.

In order to develop a representation for the map \(\TT : \VV \to \WW\), we first need to choose representations for vectors in \(\VV\) and \(\WW\). This can easily be done by choosing a basis in \(\VV\) and another in \(\WW\). Once the bases are chosen, we can represent vectors as coordinate vectors.

Definition

Let \(\VV\) and \(\WW\) be finite dimensional vector spaces with ordered bases \(\BBB = \{v_1, \dots, v_n\}\) and \(\Gamma = \{w_1, \dots,w_m\}\) respectively. Let \(\TT : \VV \to \WW\) be a linear transformation. For each \(v_j \in \BBB\) we can find a unique representation for \(\TT(v_j)\) in \(\Gamma\) given by

\[\TT(v_j) = \sum_{i=1}^{m} a_{ij} w_i \Forall 1 \leq j \leq n.\]

The \(m\times n\) matrix \(A\) defined by \(A_{ij} = a_{ij}\) is the matrix representation of \(\TT\) in the ordered bases \(\BBB\) and \(\Gamma\), denoted as

\[A = [\TT]_{\BBB}^{\Gamma}.\]

If \(\VV = \WW\) and \(\BBB = \Gamma\) then we write

\[A = [\TT]_{\BBB}.\]

The \(j\)-th column of \(A\) is the representation of \(\TT(v_j)\) in \(\Gamma\).
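
As an illustration, take \(\TT\) to be differentiation on polynomials of degree less than 3, with \(\BBB = \Gamma = \{1, t, t^2\}\) (a sketch; the example and names are our own, working throughout in coordinate vectors):

```python
import numpy as np

# Differentiation in coordinates: a0 + a1*t + a2*t^2  ->  a1 + 2*a2*t,
# with vectors expressed as [a0, a1, a2] in the basis {1, t, t^2}.
def T(p):
    return np.array([p[1], 2.0 * p[2], 0.0])

# The j-th column of A is the representation of T(v_j) in Gamma.
A = np.column_stack([T(e) for e in np.eye(3)])
# A == [[0, 1, 0],
#       [0, 0, 2],
#       [0, 0, 0]]

# Multiplying by A agrees with applying T: here on 5 - t + 3 t^2.
p = np.array([5.0, -1.0, 3.0])
assert np.allclose(A @ p, T(p))   # derivative: -1 + 6 t
```

The final assertion anticipates the theorem below: applying \(\TT\) in coordinates is the same as multiplying by \(A\).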

In order to justify the matrix representation of \(\TT\), we need to show that applying \(\TT\) is the same as multiplying by \(A\). This is stated formally below.

Theorem
Let \(\VV\) and \(\WW\) be finite dimensional vector spaces with ordered bases \(\BBB\) and \(\Gamma\) respectively, and let \(\TT : \VV \to \WW\) be linear. Then

\[[\TT (v)]_{\Gamma} = [\TT]_{\BBB}^{\Gamma} [v]_{\BBB} \Forall v \in \VV.\]
Proof

Let

\[v = \sum_{j=1}^{n} c_j v_j.\]

Then

\[\begin{split}[v]_{\BBB} = \begin{bmatrix} c_1\\ \vdots\\ c_n \end{bmatrix}\end{split}\]

Now

\[\begin{split}\TT(v) &= \TT\left( \sum_{j=1}^{n} c_j v_j \right)\\ &= \sum_{j=1}^{n} c_j \TT(v_j)\\ &= \sum_{j=1}^{n} c_j \sum_{i=1}^{m} a_{ij} w_i\\ &= \sum_{i=1}^{m} \left ( \sum_{j=1}^{n} a_{ij} c_j \right ) w_i\end{split}\]

Thus

\[\begin{split}[\TT (v)]_{\Gamma} = \begin{bmatrix} \sum_{j=1}^{n} a_{1 j} c_j\\ \vdots\\ \sum_{j=1}^{n} a_{m j} c_j \end{bmatrix} = A \begin{bmatrix} c_1\\ \vdots\\ c_n \end{bmatrix} = [\TT]_{\BBB}^{\Gamma} [v]_{\BBB}.\end{split}\]

Vector space of linear transformations

If we consider the set of linear transformations from \(\VV\) to \(\WW\), we can impose some structure on it and take advantage of that structure.

First of all, we define the basic operations of addition and scalar multiplication on the general set of functions from a vector space \(\VV\) to another vector space \(\WW\).

Definition

Let \(\TT\) and \(\UU\) be arbitrary functions from vector space \(\VV\) to vector space \(\WW\) over the field \(\FF\). Then addition of functions is defined as

\[(\TT + \UU)(v) = \TT(v) + \UU(v) \Forall v \in \VV.\]

Scalar multiplication on a function is defined as

\[(\alpha \TT)(v) = \alpha (\TT (v)) \Forall \alpha \in \FF, v \in \VV.\]

With these definitions we have

\[(\alpha \TT + \UU)(v) = (\alpha \TT)(v) + \UU(v) = \alpha (\TT (v)) + \UU(v).\]
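
These pointwise definitions translate directly into code. A minimal sketch (the two matrix maps are illustrative):

```python
import numpy as np

# Two illustrative linear maps on R^2.
def T(v):
    return np.array([[1.0, 2.0], [0.0, 1.0]]) @ v

def U(v):
    return np.array([[0.0, -1.0], [1.0, 0.0]]) @ v

def add(f, g):
    return lambda v: f(v) + g(v)      # (f + g)(v) = f(v) + g(v)

def scale(alpha, f):
    return lambda v: alpha * f(v)     # (alpha f)(v) = alpha (f(v))

v = np.array([1.0, -2.0])
alpha = 3.0
assert np.allclose(add(scale(alpha, T), U)(v), alpha * T(v) + U(v))
```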

We are now ready to show that with the addition and scalar multiplication as defined above, the set of linear transformations from \(\VV\) to \(\WW\) actually forms a vector space.

Theorem

Let \(\VV\) and \(\WW\) be vector spaces over field \(\FF\). Let \(\TT\) and \(\UU\) be linear transformations from \(\VV\) to \(\WW\). Let addition and scalar multiplication of linear transformations be defined as above. Then \(\alpha \TT + \UU\), where \(\alpha \in \FF\), is a linear transformation.

Moreover the set of linear transformations from \(\VV\) to \(\WW\) forms a vector space.

Proof

We first show that \(\alpha \TT + \UU\) is linear.

Let \(x,y \in \VV\) and \(\beta \in \FF\). Then we need to show that

\[\begin{split}(\alpha \TT + \UU) (x + y) = (\alpha \TT + \UU) (x) + (\alpha \TT + \UU) (y)\\ (\alpha \TT + \UU) (\beta x) = \beta ((\alpha \TT + \UU) (x)).\end{split}\]

Starting with the first one:

\[\begin{split}(\alpha \TT + \UU)(x + y) &= (\alpha \TT)(x + y) + \UU(x + y)\\ &= \alpha ( \TT (x + y) ) + \UU(x) + \UU(y)\\ &= \alpha \TT (x) + \alpha \TT(y) + \UU(x) + \UU(y)\\ &= (\alpha \TT) (x) + \UU (x) + (\alpha \TT)(y) + \UU(y)\\ &= (\alpha \TT + \UU)(x) + (\alpha \TT + \UU)(y).\end{split}\]

Now the second one:

\[\begin{split}(\alpha \TT + \UU) (\beta x) &= (\alpha \TT ) (\beta x) + \UU (\beta x)\\ &= \alpha (\TT(\beta x)) + \beta (\UU (x))\\ &= \alpha (\beta (\TT (x))) + \beta (\UU (x))\\ &= \beta (\alpha (\TT (x))) + \beta (\UU(x))\\ &= \beta ((\alpha \TT)(x) + \UU(x))\\ &= \beta((\alpha \TT + \UU)(x)).\end{split}\]

We can now easily verify that the set of linear transformations from \(\VV\) to \(\WW\) satisfies all the requirements of a vector space. Hence it is a vector space (of linear transformations from \(\VV\) to \(\WW\)).

Definition

Let \(\VV\) and \(\WW\) be vector spaces over field \(\FF\). Then the vector space of linear transformations from \(\VV\) to \(\WW\) is denoted by \(\LinTSpace(\VV, \WW)\).

When \(\VV = \WW\) then it is simply denoted by \(\LinTSpace(\VV)\).

The addition and scalar multiplication defined above carry forward to matrix representations of linear transformations as well.

Theorem

Let \(\VV\) and \(\WW\) be finite dimensional vector spaces over field \(\FF\) with \(\BBB\) and \(\Gamma\) being their respective bases. Let \(\TT\) and \(\UU\) be some linear transformations from \(\VV\) to \(\WW\).

Then the following hold:

  • \([\TT + \UU]_{\BBB}^{\Gamma} = [\TT]_{\BBB}^{\Gamma} + [\UU]_{\BBB}^{\Gamma}\)
  • \([\alpha \TT]_{\BBB}^{\Gamma} = \alpha [\TT]_{\BBB}^{\Gamma} \Forall \alpha \in \FF\)
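
A quick numerical sketch of both identities for maps given by matrices in fixed bases (matrices, scalar, and vector are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
T_mat = rng.standard_normal((3, 4))   # [T] in some fixed bases B, Gamma
U_mat = rng.standard_normal((3, 4))   # [U] in the same bases
alpha = -1.5
v = rng.standard_normal(4)            # a coordinate vector [v]_B

# (alpha T + U)(v) computed pointwise vs. via the combined matrix.
assert np.allclose((alpha * T_mat + U_mat) @ v,
                   alpha * (T_mat @ v) + U_mat @ v)
```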