Vector Spaces¶
Algebraic structures¶
In mathematics, the term algebraic structure refers to an arbitrary set with one or more operations defined on it. Simpler algebraic structures include groups, rings, and fields. More complex algebraic structures like vector spaces are built on top of the simpler structures. We will develop the notion of vector spaces as a progression of these algebraic structures.
Groups¶
A group is a set with a single binary operation. It is one of the simplest algebraic structures.
Let \(G\) be a set and let \(*\) be a binary operation defined on \(G\) as:
\[* : G \times G \to G\]
such that the binary operation \(*\) satisfies the following requirements.
- [Closure] The set \(G\) is closed under the binary operation \(*\); i.e.,
  \[\forall g_1, g_2 \in G, \quad g_1 * g_2 \in G.\]
- [Associativity] For every \(g_1, g_2, g_3 \in G\),
  \[g_1 * (g_2 * g_3) = (g_1 * g_2) * g_3.\]
- [Identity element] There exists an element \(e \in G\) such that
  \[g * e = e * g = g \quad \forall g \in G.\]
- [Inverse element] For every \(g \in G\) there exists an element \(g^{-1} \in G\) such that
  \[g * g^{-1} = g^{-1} * g = e.\]
Then the set \(G\) together with the operator \(*\), denoted \((G, *)\), is known as a group.
The above requirements are known as the group axioms. Note that commutativity is not a requirement of a group.
In the sequel we will write \(g_1 * g_2\) as \(g_1 g_2\).
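The group axioms can be checked mechanically for small finite sets. Below is a minimal sketch in Python (our choice of illustration language, not part of the formal development) that brute-force verifies the four axioms for the integers modulo \(n\) under addition.

```python
from itertools import product

def is_group(G, op):
    """Brute-force check of the group axioms for a finite set G
    with binary operation op."""
    # Closure: g1 * g2 must stay in G.
    if any(op(g1, g2) not in G for g1, g2 in product(G, G)):
        return False
    # Associativity: g1 * (g2 * g3) == (g1 * g2) * g3.
    if any(op(g1, op(g2, g3)) != op(op(g1, g2), g3)
           for g1, g2, g3 in product(G, G, G)):
        return False
    # Identity: some e with g * e == e * g == g for all g.
    ids = [e for e in G if all(op(g, e) == op(e, g) == g for g in G)]
    if not ids:
        return False
    e = ids[0]
    # Inverse: every g has some h with g * h == h * g == e.
    return all(any(op(g, h) == op(h, g) == e for h in G) for g in G)

n = 5
Zn = set(range(n))
# Integers modulo 5 under addition form a group.
print(is_group(Zn, lambda a, b: (a + b) % n))         # True
# The non-zero residues modulo 5 under multiplication also form a group.
print(is_group(Zn - {0}, lambda a, b: (a * b) % n))   # True
```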
Commutative groups¶
A commutative group is a richer structure than a group. Its elements also satisfy the commutativity property.
Let \((G, *)\) be a group such that it satisfies
- [Commutativity] For every \(g_1, g_2 \in G\),
  \[g_1 * g_2 = g_2 * g_1.\]
Then \((G,*)\) is known as a commutative group or an Abelian group.
In the sequel we may simply write a group \((G, *)\) as \(G\) when the underlying operation \(*\) is clear from context.
Rings¶
A ring is a set with two binary operations defined on it, subject to the requirements described below.
Let \(R\) be a set with two binary operations \(+\) (addition) and \(\cdot\) (multiplication) defined over it as:
\[+ : R \times R \to R\]
\[\cdot : R \times R \to R\]
such that \((R, +, \cdot)\) satisfies the following requirements:
- \((R, +)\) is an Abelian group.
- \(R\) is closed under multiplication:
  \[r_1 \cdot r_2 \in R \quad \forall r_1, r_2 \in R.\]
- Multiplication is associative:
  \[r_1 \cdot (r_2 \cdot r_3) = (r_1 \cdot r_2) \cdot r_3 \quad \forall r_1, r_2, r_3 \in R.\]
- Multiplication distributes over addition:
  \[\begin{split}\begin{aligned} &r_1 \cdot (r_2 + r_3) = (r_1 \cdot r_2) + (r_1 \cdot r_3) \quad \forall r_1, r_2, r_3 \in R,\\ &(r_1 + r_2) \cdot r_3 = (r_1 \cdot r_3) + (r_2 \cdot r_3) \quad \forall r_1, r_2, r_3 \in R. \end{aligned}\end{split}\]
Then \((R, +, \cdot)\) is known as an associative ring.
We denote the identity element for \(+\) as \(0\) and call it the additive identity.
In the sequel we will write \(r_1 \cdot r_2\) as \(r_1 r_2\).
We may simply write a ring \((R, +, \cdot)\) as \(R\) when the underlying operations \(+,\cdot\) are clear from context.
There is a hierarchy of ring-like structures. In particular, we mention:
- Associative ring with identity
- Field
Let \((R, +, \cdot)\) be an associative ring such that it satisfies the following additional requirement:
There exists an element \(1 \in R\) (known as the multiplicative identity) such that
\[1 \cdot r = r \cdot 1 = r \quad \forall r \in R\]
Then \((R, +, \cdot)\) is known as an associative ring with identity.
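Continuing the same brute-force style, here is a sketch that checks the additional ring requirements for the integers modulo \(n\) (here \(n = 6\), a ring with identity even though it is not a field). The Abelian-group part for \(+\) can be verified with the `is_group` helper above.

```python
from itertools import product

def is_ring_with_identity(R, add, mul):
    """Check closure, associativity, distributivity, and the existence of a
    multiplicative identity (assumes (R, add) is already an Abelian group)."""
    pairs = list(product(R, R))
    triples = list(product(R, R, R))
    closed = all(mul(a, b) in R for a, b in pairs)
    assoc = all(mul(a, mul(b, c)) == mul(mul(a, b), c) for a, b, c in triples)
    distrib = all(mul(a, add(b, c)) == add(mul(a, b), mul(a, c)) and
                  mul(add(a, b), c) == add(mul(a, c), mul(b, c))
                  for a, b, c in triples)
    has_one = any(all(mul(e, r) == mul(r, e) == r for r in R) for e in R)
    return closed and assoc and distrib and has_one

n = 6
Zn = set(range(n))
print(is_ring_with_identity(Zn,
                            lambda a, b: (a + b) % n,
                            lambda a, b: (a * b) % n))   # True
```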
Fields¶
A field is the richest algebraic structure involving a single set with two operations.
Let \(F\) be a set with two binary operations \(+\) (addition) and \(\cdot\) (multiplication) defined over it as:
\[+ : F \times F \to F\]
\[\cdot : F \times F \to F\]
such that \((F, +, \cdot)\) satisfies the following requirements:
- \((F, +)\) is an Abelian group (with additive identity \(0 \in F\)).
- \((F \setminus \{0\}, \cdot)\) is an Abelian group (with multiplicative identity \(1 \in F\)).
- Multiplication distributes over addition:
\[\alpha \cdot (\beta + \gamma) = (\alpha \cdot \beta) + (\alpha \cdot \gamma) \quad \forall \alpha, \beta, \gamma \in F\]
Then \((F, +, \cdot)\) is known as a field.
- The set of real numbers \(\RR\) is a field.
- The set of complex numbers \(\CC\) is a field.
- The Galois field \(\mathrm{GF}(2)\) is the set \(\{ 0, 1 \}\) with modulo-2 addition and multiplication (verified computationally below).
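As promised, a small sketch verifying the \(\mathrm{GF}(2)\) structure directly: with the elements \(\{0, 1\}\), modulo-2 addition is the XOR operation and multiplication is AND.

```python
# In GF(2) = {0, 1}, addition is XOR and multiplication is AND.
add = lambda a, b: a ^ b
mul = lambda a, b: a & b

F = {0, 1}
# ({0, 1}, +) is an Abelian group with identity 0; each element is its own inverse.
print(all(add(a, b) == add(b, a) for a in F for b in F))   # True (commutative)
print(all(add(a, a) == 0 for a in F))                      # True (self-inverse)
# ({1}, .) is the (trivial) multiplicative group with identity 1.
print(mul(1, 1) == 1)                                      # True
# Distributivity: a . (b + c) == (a . b) + (a . c).
print(all(mul(a, add(b, c)) == add(mul(a, b), mul(a, c))
          for a in F for b in F for c in F))               # True
```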
Vector space¶
We are now ready to define a vector space. A vector space involves two sets. One set \(\VV\) contains the vectors. The other set \(\mathrm{F}\) (a field) contains scalars which are used to scale the vectors.
A set \(\VV\) is called a vector space over the field \(\mathrm{F}\) (or an \(\mathrm{F}\)-vector space) if there exist two mappings
\[+ : \VV \times \VV \to \VV \quad \text{(vector addition)}\]
\[\cdot : \mathrm{F} \times \VV \to \VV \quad \text{(scalar multiplication)}\]
which satisfy the following requirements:
- \((\VV, +)\) is an Abelian group.
- Scalar multiplication \(\cdot\) distributes over vector addition \(+\):
  \[\alpha (v_1 + v_2) = \alpha v_1 + \alpha v_2 \quad \forall \alpha \in \mathrm{F}; \forall v_1, v_2 \in \VV.\]
- Addition in \(\mathrm{F}\) distributes over scalar multiplication \(\cdot\):
  \[(\alpha + \beta) v = (\alpha v) + (\beta v) \quad \forall \alpha, \beta \in \mathrm{F}; \forall v \in \VV.\]
- Multiplication in \(\mathrm{F}\) commutes over scalar multiplication:
  \[(\alpha \beta) \cdot v = \alpha \cdot (\beta \cdot v) = \beta \cdot (\alpha \cdot v) = (\beta \alpha) \cdot v \quad \forall \alpha, \beta \in \mathrm{F}; \forall v \in \VV.\]
- Scalar multiplication by the multiplicative identity \(1 \in \mathrm{F}\) satisfies:
  \[1 v = v \quad \forall v \in \VV.\]
Some remarks are in order:
- \(\VV\) as defined above is also known as an \(\mathrm{F}\)-vector space.
- Elements of \(\VV\) are known as vectors.
- Elements of \(\mathrm{F}\) are known as scalars.
- There are two zeros involved: \(0 \in \mathrm{F}\) and \(0 \in \VV\). It should be clear from context which one is being referred to.
- \(0 \in \VV\) is known as the zero vector.
- All vectors in \(\VV \setminus \{0\}\) are non-zero vectors.
- We will typically denote elements of \(\mathrm{F}\) by \(\alpha, \beta, \dots\).
- We will typically denote elements of \(\VV\) by \(v_1, v_2, \dots\).
We quickly look at some vector spaces which will appear again and again in our discussions.
Let \(\mathrm{F}\) be some field.
The set of all \(N\)-tuples \((a_1, a_2, \dots, a_N)\) with \(a_1, a_2, \dots, a_N \in \mathrm{F}\) is denoted as \(\mathrm{F}^N\). This is a vector space with the operations of coordinate-wise addition and scalar multiplication.
Let \(u, v \in \mathrm{F}^N\) with
\[u = (u_1, u_2, \dots, u_N)\]
and
\[v = (v_1, v_2, \dots, v_N).\]
Addition is defined as
\[u + v = (u_1 + v_1, u_2 + v_2, \dots, u_N + v_N).\]
Let \(c \in \mathrm{F}\). Scalar multiplication is defined as
\[c u = (c u_1, c u_2, \dots, c u_N).\]
\(u, v\) are called equal if \(u_1 = v_1, \dots, u_N = v_N\).
In matrix notation, vectors in \(\mathrm{F}^N\) are also written as row vectors
\[u = \begin{bmatrix} u_1 & u_2 & \dots & u_N \end{bmatrix}\]
or column vectors
\[u = \begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_N \end{bmatrix}.\]
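As a concrete illustration, here is a minimal Python sketch of these coordinate-wise operations, modelling vectors in \(\RR^N\) as tuples (the field \(\RR\) is just our choice for the example).

```python
def vec_add(u, v):
    """Coordinate-wise addition of two N-tuples over the same field."""
    assert len(u) == len(v)
    return tuple(ui + vi for ui, vi in zip(u, v))

def vec_scale(c, u):
    """Scalar multiplication of an N-tuple by the scalar c."""
    return tuple(c * ui for ui in u)

u = (1.0, 2.0, 3.0)
v = (4.0, 5.0, 6.0)
print(vec_add(u, v))         # (5.0, 7.0, 9.0)
print(vec_scale(2.0, u))     # (2.0, 4.0, 6.0)
print(u == (1.0, 2.0, 3.0))  # equality is coordinate-wise
```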
Let \(\mathrm{F}\) be some field. A matrix is an array of the form
\[\begin{bmatrix} a_{11} & a_{12} & \dots & a_{1N} \\ a_{21} & a_{22} & \dots & a_{2N} \\ \vdots & \vdots & \ddots & \vdots \\ a_{M1} & a_{M2} & \dots & a_{MN} \end{bmatrix}\]
with \(M\) rows and \(N\) columns, where \(a_{ij} \in \mathrm{F}\).
The set of these matrices is denoted as \(\mathrm{F}^{M \times N}\), which is a vector space with the operations of matrix addition and scalar multiplication.
Let \(A, B \in \mathrm{F}^{M \times N}\). Matrix addition is defined by
\[(A + B)_{ij} = A_{ij} + B_{ij}.\]
Let \(c \in \mathrm{F}\). Scalar multiplication is defined by
\[(c A)_{ij} = c A_{ij}.\]
Let \(\mathrm{F}[x]\) denote the set of all polynomials with coefficients drawn from the field \(\mathrm{F}\); i.e., if \(f(x) \in \mathrm{F}[x]\), then it can be written as
\[f(x) = a_n x^n + a_{n-1} x^{n-1} + \dots + a_1 x + a_0\]
where \(a_i \in \mathrm{F}\).
The set \(\mathrm{F}[x]\) is a vector space with the usual operations of polynomial addition and scalar multiplication.
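These operations act coefficient-wise. A short sketch, representing a polynomial by its list of coefficients \([a_0, a_1, \dots, a_n]\) (the ascending ordering is merely our convention for this example):

```python
from itertools import zip_longest

# Represent f(x) = a_0 + a_1 x + ... + a_n x^n as the coefficient
# list [a_0, a_1, ..., a_n].
def poly_add(f, g):
    """Add two polynomials coefficient-wise (shorter one padded with 0)."""
    return [a + b for a, b in zip_longest(f, g, fillvalue=0)]

def poly_scale(c, f):
    """Multiply a polynomial by the scalar c."""
    return [c * a for a in f]

f = [1, 0, 2]            # 1 + 2x^2
g = [0, 3]               # 3x
print(poly_add(f, g))    # [1, 3, 2]  ->  1 + 3x + 2x^2
print(poly_scale(5, f))  # [5, 0, 10] ->  5 + 10x^2
```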
Some useful results are presented without proof.
If \(x + z = y + z\) for some \(x, y, z \in \VV\), then \(x = y\). This is known as the cancellation law of vector spaces.
In a vector space \(\VV\), the following statements are true:
- \(0x = 0 \Forall x \in \VV\).
- \((-a)x = - (ax) = a(-x) \Forall a \in \mathrm{F} \text{ and } x \in \VV\).
- \(a 0 = 0 \Forall a \in \mathrm{F}\).
Linear independence¶
A linear combination of two vectors \(v_1, v_2 \in \VV\) is defined as
\[\alpha v_1 + \beta v_2\]
where \(\alpha, \beta \in \mathrm{F}\).
A linear combination of \(p\) vectors \(v_1,\dots, v_p \in \VV\) is defined as
\[\alpha_1 v_1 + \alpha_2 v_2 + \dots + \alpha_p v_p\]
where \(\alpha_1, \dots, \alpha_p \in \mathrm{F}\).
Let \(\VV\) be a vector space and let \(S\) be a nonempty subset of \(\VV\). A vector \(v \in \VV\) is called a linear combination of vectors of \(S\) if there exist a finite number of vectors \(s_1, s_2, \dots, s_n \in S\) and scalars \(a_1, \dots, a_n\) in \(\mathrm{F}\) such that
\[v = a_1 s_1 + a_2 s_2 + \dots + a_n s_n.\]
We also say that \(v\) is a linear combination of \(s_1, s_2, \dots, s_n\) and \(a_1, a_2, \dots, a_n\) are the coefficients of linear combination.
Note that \(0\) is a trivial linear combination of any subset of \(\VV\).
Note that a linear combination may refer to the expression itself or its value; e.g., two different linear combinations may have the same value.
Note that a linear combination always consists of a finite number of vectors.
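Computationally, a linear combination is just a weighted sum of vectors. A quick sketch over \(\RR^3\) (NumPy is assumed only for illustration):

```python
import numpy as np

# Coefficients and vectors of a linear combination in R^3.
a = np.array([2.0, -1.0, 0.5])            # a_1, a_2, a_3
S = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [2.0, 2.0, 2.0]])           # rows are s_1, s_2, s_3
v = sum(ai * si for ai, si in zip(a, S))  # v = a_1 s_1 + a_2 s_2 + a_3 s_3
print(v)                                  # [3. 0. 1.]
```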
A finite set of non-zero vectors \(\{v_1, \cdots, v_p\} \subset \VV\) is called linearly dependent if there exist \(\alpha_1,\dots,\alpha_p \in \mathrm{F}\), not all \(0\), such that
\[\alpha_1 v_1 + \alpha_2 v_2 + \dots + \alpha_p v_p = 0.\]
A set \(S \subseteq \VV\) is called linearly dependent if there exist a finite number of distinct vectors \(u_1, u_2, \dots, u_n \in S\) and scalars \(a_1, a_2, \dots, a_n \in \mathrm{F}\), not all zero, such that
\[a_1 u_1 + a_2 u_2 + \dots + a_n u_n = 0.\]
More specifically, a finite set of non-zero vectors \(\{v_1, \cdots, v_n\} \subset \VV\) is called linearly independent if
\[\sum_{i=1}^n \alpha_i v_i = 0 \implies \alpha_i = 0 \quad \forall 1 \leq i \leq n.\]
- The empty set is linearly independent.
- A set containing a single non-zero vector \(\{v\}\) is always linearly independent. Prove!
- If two vectors are linearly dependent, we say that they are collinear.
- Conversely, if two vectors are linearly independent, we say that they are not collinear.
- If a set \(\{v_1, \cdots, v_p\}\) is linearly independent, then any subset of it will be linearly independent. Prove!
- Adding another vector \(v\) to the set may make it linearly dependent. When?
- It is possible for an infinite set to be linearly independent. Consider the set of polynomials \(\{1, x, x^2, x^3, \dots\}\). This set is infinite, yet linearly independent.
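Over \(\RR\), linear independence of finitely many vectors can be tested numerically: stacking the vectors as rows of a matrix, they are linearly independent iff the matrix rank equals the number of vectors. A sketch using NumPy:

```python
import numpy as np

def linearly_independent(vectors):
    """Vectors (rows of a matrix) over R are linearly independent
    iff the matrix rank equals the number of vectors."""
    A = np.array(vectors, dtype=float)
    return np.linalg.matrix_rank(A) == len(vectors)

print(linearly_independent([[1, 0, 0], [0, 1, 0], [0, 0, 1]]))  # True
print(linearly_independent([[1, 2, 3], [2, 4, 6]]))             # False (collinear)
```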
Span¶
Vectors can be combined to form other vectors. It makes sense to consider the set of all vectors which can be created by combining a given set of vectors.
Let \(S \subset \VV\) be a subset of vectors. The span of \(S\) denoted as \(\langle S \rangle\) or \(\Span(S)\) is the set of all possible linear combinations of vectors belonging to \(S\).
For convenience we define \(\Span(\EmptySet) = \{ 0 \}\).
The span of a finite set of vectors \(\{v_1, \cdots, v_p\}\) is denoted by \(\langle v_1, \cdots, v_p \rangle\).
We say that a set of vectors \(S \subseteq \VV\) spans \(\VV\) if \(\langle S \rangle = \VV\).
Let \(S \subset \VV\). We say that \(S\) spans (or generates) \(\VV\) if
\[\langle S \rangle = \VV.\]
In this case we also say that vectors of \(S\) span (or generate) \(\VV\).
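Over \(\RR\), membership in a span can also be tested with ranks: \(v \in \langle S \rangle\) iff appending \(v\) to the vectors of \(S\) does not increase the rank. A sketch:

```python
import numpy as np

def in_span(v, S):
    """Check whether v lies in the span of the vectors in S (over R):
    v is in <S> iff appending v to S does not increase the rank."""
    A = np.array(S, dtype=float)
    B = np.vstack([A, v])
    return np.linalg.matrix_rank(A) == np.linalg.matrix_rank(B)

S = [[1, 0, 0], [0, 1, 0]]     # spans the xy-plane in R^3
print(in_span([3, -2, 0], S))  # True
print(in_span([0, 0, 1], S))   # False
```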
Basis¶
A linearly independent set \(\mathcal{B} \subseteq \VV\) is called a basis of \(\VV\) if it spans \(\VV\); i.e., \(\langle \mathcal{B} \rangle = \VV\).
- Since \(\Span(\EmptySet) = \{ 0 \}\) and \(\EmptySet\) is linearly independent, \(\EmptySet\) is a basis for the zero vector space \(\{ 0 \}\).
- The basis \(\{ e_1, \dots, e_N\}\) with \(e_1 = (1, 0, \dots, 0)\), \(e_2 = (0, 1, \dots, 0)\), \(\dots\), \(e_N = (0, 0, \dots, 1)\), is called the standard basis for \(\mathrm{F}^N\).
- The set \(\{1, x, x^2, x^3, \dots\}\) is the standard basis for \(\mathrm{F}[x]\): indeed, an infinite basis. Note that though the basis itself is infinite, every polynomial \(p \in \mathrm{F}[x]\) is a linear combination of a finite number of elements from the basis.
We review some properties of bases.
Let \(\VV\) be a vector space and \(\mathcal{B} = \{ v_1, v_2, \dots, v_n\}\) be a subset of \(\VV\). Then \(\mathcal{B}\) is a basis for \(\VV\) if and only if each \(v \in \VV\) can be uniquely expressed as a linear combination of vectors of \(\mathcal{B}\):
\[v = a_1 v_1 + a_2 v_2 + \dots + a_n v_n\]
for unique scalars \(a_1, \dots, a_n\).
This theorem states that a basis \(\mathcal{B}\) provides a unique representation of each vector \(v \in \VV\), where the representation is defined as the \(n\)-tuple \((a_1, a_2, \dots, a_n)\).
If the basis is infinite, then the above theorem needs to be modified as follows:
Let \(\VV\) be a vector space and \(\mathcal{B}\) be a subset of \(\VV\). Then \(\mathcal{B}\) is a basis for \(\VV\) if and only if each \(v \in \VV\) can be uniquely expressed as a linear combination of vectors of \(\mathcal{B}\):
\[v = a_1 v_1 + a_2 v_2 + \dots + a_n v_n\]
for unique scalars \(a_1, \dots, a_n\) and unique vectors \(v_1, v_2, \dots, v_n \in \mathcal{B}\).
Let \(\VV\) be a vector space that is spanned by a set \(G\) containing exactly \(n\) vectors. Let \(L\) be a linearly independent subset of \(\VV\) containing exactly \(m\) vectors.
Then \(m \leq n\) and there exists a subset \(H\) of \(G\) containing exactly \(n-m\) vectors such that \(L \cup H\) spans \(\VV\).
A vector space \(\VV\) is called finite-dimensional if it has a basis consisting of a finite number of vectors. It can be shown that every basis of such a space contains the same number of vectors; this number is called the dimension or dimensionality of the vector space and is denoted as \(\dim \VV\). If a basis of \(\VV\) contains \(n\) vectors, we say \(\dim \VV = n\).
If \(\VV\) is not finite-dimensional, then we say that \(\VV\) is infinite-dimensional.
- Dimension of \(\mathrm{F}^N\) is \(N\).
- Dimension of \(\mathrm{F}^{M \times N}\) is \(MN\).
- The vector space of polynomials \(\mathrm{F}[x]\) is infinite dimensional.
Let \(\VV\) be a vector space with dimension \(n\).
- Any finite spanning set for \(\VV\) contains at least \(n\) vectors, and a spanning set that contains exactly \(n\) vectors is a basis for \(\VV\).
- Any linearly independent subset of \(\VV\) that contains exactly \(n\) vectors is a basis for \(\VV\).
- Every linearly independent subset of \(\VV\) can be extended to a basis for \(\VV\) (see the sketch below).
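The third property above suggests a simple greedy procedure over \(\RR^n\): keep adding standard basis vectors as long as they increase the rank. A sketch:

```python
import numpy as np

def extend_to_basis(L, n):
    """Extend a linearly independent set L (list of vectors in R^n)
    to a basis by greedily adding standard basis vectors that
    increase the rank."""
    basis = [np.array(v, dtype=float) for v in L]
    for i in range(n):
        e = np.zeros(n)
        e[i] = 1.0
        if np.linalg.matrix_rank(np.vstack(basis + [e])) > len(basis):
            basis.append(e)
    return basis

L = [[1, 1, 0]]                # one independent vector in R^3
for b in extend_to_basis(L, 3):
    print(b)                   # (1,1,0), e_1, e_3; e_2 would be redundant
```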
An ordered basis is a basis whose vectors are listed in a specific order. Typically we will write an ordered basis as \(\BBB = \{ v_1, v_2, \dots, v_n\}\) and assume that the basis vectors are ordered as they appear.
With the help of an ordered basis, we can define a coordinate vector.
Let \(\BBB = \{ v_1, \dots, v_n\}\) be an ordered basis for \(\VV\), and for \(x \in \VV\), let \(\alpha_1, \dots, \alpha_n\) be the unique scalars such that
\[x = \alpha_1 v_1 + \alpha_2 v_2 + \dots + \alpha_n v_n.\]
The coordinate vector of \(x\) relative to \(\BBB\) is defined as
\[[x]_{\BBB} = \begin{bmatrix} \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_n \end{bmatrix}.\]
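Over \(\RR^n\), finding \([x]_{\BBB}\) amounts to solving a linear system whose coefficient matrix has the basis vectors as columns. A NumPy sketch:

```python
import numpy as np

# Ordered basis B = {v1, v2} of R^2, stored as the columns of a matrix.
B = np.array([[1.0, 1.0],
              [0.0, 2.0]])     # v1 = (1, 0), v2 = (1, 2)
x = np.array([3.0, 4.0])
# Solve B @ alpha = x for the unique coordinates alpha.
alpha = np.linalg.solve(B, x)
print(alpha)                   # [1. 2.]  so  x = 1*v1 + 2*v2
```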
Subspace¶
Let \(W\) be a subset of \(\VV\). Then \(W\) is called a subspace if \(W\) is a vector space in its own right under the same vector addition \(+\) and scalar multiplication \(\cdot\) operations; i.e., the operations
\[+ : W \times W \to W\]
\[\cdot : \mathrm{F} \times W \to W\]
are defined by restricting \(+ : \VV \times \VV \to \VV\) and \(\cdot : \mathrm{F} \times \VV \to \VV\) to \(W\), and \(W\) is closed under these operations.
- \(\VV\) is a subspace of \(\VV\).
- \(\{0\}\) is a subspace of any \(\VV\).
A subset \(\WW \subseteq \VV\) is a subspace of \(\VV\) if and only if
- \(0 \in \WW\);
- \(x + y \in \WW\) whenever \(x, y \in \WW\);
- \(\alpha x \in \WW\) whenever \(\alpha \in \mathrm{F}\) and \(x \in \WW\).
A square matrix \(A \in \mathrm{F}^{N \times N}\) is symmetric if
\[A = A^T.\]
The set of symmetric matrices forms a subspace of the set of all \(N \times N\) matrices.
A matrix \(M\) is called diagonal if \(M_{ij} = 0\) whenever \(i \neq j\).
The set of diagonal matrices is a subspace of \(\mathrm{F}^{M \times N}\).
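The subspace criterion above is easy to check numerically for these matrix families; e.g., symmetry is preserved by addition and scalar multiplication. A sketch:

```python
import numpy as np

def is_symmetric(A):
    """A real square matrix is symmetric iff it equals its transpose."""
    return np.array_equal(A, A.T)

A = np.array([[1.0, 2.0], [2.0, 3.0]])
B = np.array([[0.0, 5.0], [5.0, 1.0]])
# The subspace criterion: closure under addition and scalar multiplication,
# plus membership of the zero vector.
print(is_symmetric(A + B))             # True
print(is_symmetric(2.5 * A))           # True
print(is_symmetric(np.zeros((2, 2))))  # True: the zero matrix is symmetric
```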
We note that a union of subspaces is not necessarily a subspace, since it is not closed under addition. For example, the union of the two coordinate axes in \(\RR^2\) contains \((1,0)\) and \((0,1)\) but not their sum \((1,1)\).
The span of a set \(S \subset \VV\) given by \(\langle S \rangle\) is a subspace of \(\VV\).
Moreover any subspace of \(\VV\) that contains \(S\) must also contain the span of \(S\).
This theorem is quite useful. It allows us to construct subspaces from a given basis.
Let \(\mathcal{B}\) be a basis of an \(n\)-dimensional space \(\VV\). There are \(n\) vectors in \(\mathcal{B}\). We can create \(2^n\) distinct subsets of \(\mathcal{B}\). Thus we can construct \(2^n\) distinct subspaces of \(\VV\).
Choosing some other basis lets us construct another set of subspaces.
Over an infinite field such as \(\RR\), an \(n\)-dimensional vector space has infinitely many bases. Correspondingly, there are infinitely many possible subspaces (whenever \(n \geq 2\)).
If \(W_1\) and \(W_2\) are two subspaces of \(\VV\), then we say that \(W_1\) is smaller than \(W_2\) if \(W_1 \subset W_2\).
Let \(\WW\) be the smallest subspace containing the vectors \(\{ v_1, \dots, v_p \}\). Then
\[\WW = \langle v_1, \dots, v_p \rangle;\]
i.e., \(\WW\) is the same as the span of \(\{ v_1, \dots, v_p \}\).
Let \(\WW\) be a subspace of a finite-dimensional vector space \(\VV\). Then \(\WW\) is finite-dimensional and
\[\dim \WW \leq \dim \VV.\]
Moreover, if
\[\dim \WW = \dim \VV,\]
then \(\WW = \VV\).
Let \(\VV\) be a finite-dimensional vector space and \(\WW\) be a subspace of \(\VV\). The codimension of \(\WW\) is defined as
\[\mathrm{codim} \, \WW = \dim \VV - \dim \WW.\]
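For subspaces of \(\RR^n\) given by spanning vectors, the codimension can be computed from a rank. A sketch:

```python
import numpy as np

# W = span of two vectors inside V = R^4.
W = np.array([[1.0, 0.0, 1.0, 0.0],
              [0.0, 1.0, 0.0, 1.0]])
dim_V = 4
dim_W = np.linalg.matrix_rank(W)  # dim W = number of independent spanners
print(dim_V - dim_W)              # codim W = 4 - 2 = 2
```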