Introduction

In this chapter we collect results from matrix algebra that are relevant to this book. We also cover some specific topics that are typically not found in standard texts.

The standard notation for this chapter is given here. Matrices are denoted by capital letters \(A\), \(B\), etc. They can be rectangular with \(m\) rows and \(n\) columns. Their elements or entries are referred to with lowercase letters \(a_{i j}\), \(b_{i j}\), etc., where \(i\) denotes the row and \(j\) denotes the column of the entry. Thus

\[\begin{split}A = \begin{bmatrix} a_{1 1} & a_{1 2} & \dots & a_{1 n}\\ a_{2 1} & a_{2 2} & \dots & a_{2 n}\\ \vdots & \vdots & \ddots & \vdots\\ a_{m 1} & a_{m 2} & \dots & a_{m n} \end{bmatrix}\end{split}\]

Mostly we consider complex matrices belonging to \(\CC^{m \times n}\). Sometimes we will restrict our attention to real matrices belonging to \(\RR^{m \times n}\).

Definition
An \(m \times n\) matrix is called a square matrix if \(m = n\).
Definition
An \(m \times n\) matrix is called a tall matrix if \(m > n\), i.e., the number of rows is greater than the number of columns.
Definition
An \(m \times n\) matrix is called a wide matrix if \(m < n\), i.e., the number of columns is greater than the number of rows.
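For example, a \(3 \times 2\) matrix is tall while a \(2 \times 3\) matrix is wide.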
Definition
Let \(A = [a_{i j}]\) be an \(m \times n\) matrix. The main diagonal consists of the entries \(a_{i j}\) with \(i = j\), i.e., the main diagonal is \(\{a_{11}, a_{22}, \dots, a_{k k} \}\) where \(k = \min(m, n)\). The main diagonal is also known as the leading diagonal, major diagonal, primary diagonal, or principal diagonal. The entries of \(A\) which are not on the main diagonal are known as off-diagonal entries.
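For example, if \(A\) is a \(4 \times 2\) matrix, then \(k = \min(4, 2) = 2\) and the main diagonal is \(\{a_{11}, a_{22}\}\).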
Definition

A diagonal matrix is a matrix (usually a square matrix) whose entries outside the main diagonal are zero.

Whenever we refer to a diagonal matrix which is not square, we will use the term rectangular diagonal matrix.

A square diagonal matrix \(A\) is also represented by \(\Diag(a_{11}, a_{22}, \dots, a_{n n})\), which lists only the entries on the main diagonal of \(A\).
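
For example:

\[\begin{split}\Diag(1, 2, 3) = \begin{bmatrix} 1 & 0 & 0\\ 0 & 2 & 0\\ 0 & 0 & 3 \end{bmatrix}.\end{split}\]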

The transpose of a matrix \(A\) is denoted by \(A^T\) while the Hermitian transpose is denoted by \(A^H\). For real matrices \(A^T = A^H\).

When matrices are square, the number of rows and columns are both equal to \(n\), and the matrices belong to \(\CC^{n \times n}\).

If not specified, square matrices will be of size \(n \times n\) and rectangular matrices will be of size \(m \times n\). Similarly, if not specified, vectors (column vectors) will be of size \(n \times 1\) and belong to either \(\RR^n\) or \(\CC^n\). The corresponding row vectors will be of size \(1 \times n\).

For statements which are valid for both real and complex matrices, we may sometimes say that matrices belong to \(\FF^{m \times n}\), while the scalars belong to \(\FF\) and the vectors belong to \(\FF^n\), where \(\FF\) refers to either the field of real numbers or the field of complex numbers. Note that this convention is not consistently followed at the moment: most results are written only for \(\CC^{m \times n}\) while still being applicable to \(\RR^{m \times n}\).

The identity matrix in \(\FF^{n \times n}\) is denoted by \(I_n\), or simply \(I\) whenever the size is clear from the context.

Sometimes we will write a matrix in terms of its column vectors. We will use the notation

\[A = \begin{bmatrix} a_1 & a_2 & \dots & a_n \end{bmatrix}\]

indicating \(n\) columns.

When we write a matrix in terms of its row vectors, we will use the notation

\[\begin{split}A = \begin{bmatrix} a_1^T \\ a_2^T \\ \vdots \\ a_m^T \end{bmatrix}\end{split}\]

indicating \(m\) rows with \(a_i\) being column vectors whose transposes form the rows of \(A\).

The rank of a matrix \(A\) is written as \(\Rank(A)\), while the determinant as \(\det(A)\) or \(|A|\).

We say that an \(m \times n\) matrix \(A\) is left-invertible if there exists an \(n \times m\) matrix \(B\) such that \(B A = I\). We say that an \(m \times n\) matrix \(A\) is right-invertible if there exists an \(n \times m\) matrix \(B\) such that \(A B = I\).
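
For example, the tall matrix \(A = \begin{bmatrix} 1 \\ 0 \end{bmatrix}\) is left-invertible: \(B = \begin{bmatrix} 1 & 0 \end{bmatrix}\) gives \(B A = I_1\). It is not right-invertible, since \(A B\) has rank at most \(1\) for every \(B\) and hence can never equal \(I_2\).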

We say that a square matrix \(A\) is invertible when there exists another square matrix \(B\) of the same size such that \(AB = BA = I\). A square matrix is invertible iff it is both left- and right-invertible. The inverse of a square invertible matrix is denoted by \(A^{-1}\).

A special left or right inverse is the pseudo-inverse, which is denoted by \(A^{\dag}\).
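
For instance, when \(A\) has full column rank, the Moore-Penrose pseudo-inverse takes the form \(A^{\dag} = (A^H A)^{-1} A^H\) and is a left inverse of \(A\); when \(A\) has full row rank, \(A^{\dag} = A^H (A A^H)^{-1}\) and is a right inverse of \(A\).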

The column space of a matrix is denoted by \(\ColSpace(A)\), the null space by \(\NullSpace(A)\), and the row space by \(\RowSpace(A)\).

We say that a matrix is symmetric when \(A = A^T\), and conjugate symmetric or Hermitian when \(A^H = A\).

When a square matrix is not invertible, we say that it is singular. A non-singular matrix is invertible.

The eigenvalues of a square matrix are written as \(\lambda_1, \lambda_2, \dots\), while the singular values of a rectangular matrix are written as \(\sigma_1, \sigma_2, \dots\).

The inner product or dot product of two column (or row) vectors \(u\) and \(v\) belonging to \(\RR^n\) is defined as

\[u \cdot v = \langle u, v \rangle = \sum_{i=1}^n u_i v_i.\]
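
For example, for \(u = (1, 2, 3)\) and \(v = (4, 5, 6)\) in \(\RR^3\), we have \(u \cdot v = 1 \cdot 4 + 2 \cdot 5 + 3 \cdot 6 = 32\).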

The inner product or dot product of two column (or row) vectors \(u\) and \(v\) belonging to \(\CC^n\) is defined as

\[u \cdot v = \langle u, v \rangle = \sum_{i=1}^n u_i \overline{v_i}.\]
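
For example, for \(u = (1 + i, 2)\) and \(v = (i, 1)\) in \(\CC^2\),

\[u \cdot v = (1 + i)\overline{i} + 2 \cdot \overline{1} = (1 + i)(-i) + 2 = 1 - i + 2 = 3 - i.\]

Note the conjugation on the second argument; it makes the complex inner product conjugate symmetric, i.e., \(\langle u, v \rangle = \overline{\langle v, u \rangle}\).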

Block matrix

Definition

A block matrix is a matrix whose entries are themselves matrices, with the following constraints:

  • Entries in every row are matrices with the same number of rows.
  • Entries in every column are matrices with the same number of columns.

Let \(A\) be an \(m \times n\) block matrix. Then

\[\begin{split}A = \begin{bmatrix} A_{11} & A_{12} & \dots & A_{1 n}\\ A_{21} & A_{22} & \dots & A_{2 n}\\ \vdots & \vdots & \ddots & \vdots\\ A_{m 1} & A_{m 2} & \dots & A_{m n}\\ \end{bmatrix}\end{split}\]

where \(A_{i j}\) is a matrix with \(r_i\) rows and \(c_j\) columns.

A block matrix is also known as a partitioned matrix.

Example: \(2 \times 2\) block matrices

Quite frequently we will be using \(2 \times 2\) block matrices.

\[\begin{split}P = \begin{bmatrix} P_{11} & P_{12} \\ P_{21} & P_{22} \end{bmatrix}.\end{split}\]

Here is an example of a \(3 \times 3\) matrix partitioned as a \(2 \times 2\) block matrix:

\[\begin{split}P = \left[ \begin{array}{c c | c} a & b & c \\ d & e & f \\ \hline g & h & i \end{array} \right]\end{split}\]

We have

\[\begin{split}P_{11} = \begin{bmatrix} a & b \\ d & e \end{bmatrix} \; P_{12} = \begin{bmatrix} c \\ f \end{bmatrix} \; P_{21} = \begin{bmatrix} g & h \end{bmatrix} \; P_{22} = \begin{bmatrix} i \end{bmatrix}\end{split}\]
  • \(P_{11}\) and \(P_{12}\) have \(2\) rows.
  • \(P_{21}\) and \(P_{22}\) have \(1\) row.
  • \(P_{11}\) and \(P_{21}\) have \(2\) columns.
  • \(P_{12}\) and \(P_{22}\) have \(1\) column.
Lemma

Let \(A = [A_{ij}]\) be an \(m \times n\) block matrix with \(A_{ij}\) being an \(r_i \times c_j\) matrix. Then \(A\) is an \(r \times c\) matrix where

\[r = \sum_{i=1}^m r_i\]

and

\[c = \sum_{j=1}^n c_j.\]
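
For instance, the \(2 \times 2\) block matrix \(P\) in the example above is an \(r \times c\) matrix with \(r = 2 + 1 = 3\) and \(c = 2 + 1 = 3\).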
Remark
Sometimes it is convenient to think of a regular matrix as a block matrix whose entries are \(1 \times 1\) matrices themselves.
Definition

Let \(A = [A_{ij}]\) be an \(m \times n\) block matrix with \(A_{ij}\) being a \(p_i \times q_j\) matrix. Let \(B = [B_{jk}]\) be an \(n \times p\) block matrix with \(B_{jk}\) being a \(q_j \times r_k\) matrix. Then the two block matrices are compatible for multiplication, and their product is defined by \(C = AB = [C_{i k}]\) where

\[C_{i k} = \sum_{j=1}^n A_{i j} B_{j k}\]

and \(C_{i k}\) is a \(p_i \times r_k\) matrix.
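
In particular, for \(2 \times 2\) block matrices with compatible blocks:

\[\begin{split}\begin{bmatrix} A_{11} & A_{12}\\ A_{21} & A_{22} \end{bmatrix} \begin{bmatrix} B_{11} & B_{12}\\ B_{21} & B_{22} \end{bmatrix} = \begin{bmatrix} A_{11} B_{11} + A_{12} B_{21} & A_{11} B_{12} + A_{12} B_{22}\\ A_{21} B_{11} + A_{22} B_{21} & A_{21} B_{12} + A_{22} B_{22} \end{bmatrix}.\end{split}\]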

Definition
A block diagonal matrix is a block matrix whose off-diagonal entries are zero matrices.
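
For example, a block diagonal matrix with square diagonal blocks \(A_1, A_2, \dots, A_k\) has the form

\[\begin{split}A = \begin{bmatrix} A_{1} & 0 & \dots & 0\\ 0 & A_{2} & \dots & 0\\ \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & \dots & A_{k} \end{bmatrix}\end{split}\]

where each \(0\) denotes a zero matrix of appropriate size.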