Invertible matrices

Definition

A square matrix \(A\) is called invertible if there exists another square matrix \(B\) of the same size such that

\[AB = BA = I.\]

The matrix \(B\) is called the inverse of \(A\) and is denoted as \(A^{-1}\).
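As a quick numerical illustration (a minimal NumPy sketch; the \(2 \times 2\) matrix below is an arbitrary example), both products can be checked against the identity:

```python
import numpy as np

# An arbitrary invertible 2x2 matrix chosen for illustration.
A = np.array([[2.0, 1.0],
              [1.0, 1.0]])
B = np.linalg.inv(A)

# Both AB and BA equal the identity up to floating point error.
assert np.allclose(A @ B, np.eye(2))
assert np.allclose(B @ A, np.eye(2))
```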

Lemma
If \(A\) is invertible then its inverse \(A^{-1}\) is also invertible and \((A^{-1})^{-1} = A\).
Lemma
The identity matrix \(I\) is invertible.
Proof
\[I I = I \implies I^{-1} = I.\]
Lemma
If \(A\) is invertible then columns of \(A\) are linearly independent.
Proof

Assume \(A\) is invertible, then there exists a matrix \(B\) such that

\[AB = BA = I.\]

Assume that columns of \(A\) are linearly dependent. Then there exists \(u \neq 0\) such that

\[A u = 0 \implies BA u = 0 \implies I u = 0 \implies u = 0\]

a contradiction. Hence columns of \(A\) are linearly independent.

Lemma
If an \(n\times n\) matrix \(A\) is invertible then columns of \(A\) span \(\FF^n\).
Proof

Assume \(A\) is invertible, then there exists a matrix \(B\) such that

\[AB = BA = I.\]

Now let \(x \in \FF^n\) be an arbitrary vector. We need to show that there exists \(\alpha \in \FF^n\) such that

\[x = A \alpha.\]

But

\[x = I x = AB x = A ( B x).\]

Thus if we choose \(\alpha = Bx\), then

\[x = A \alpha.\]

Thus columns of \(A\) span \(\FF^n\).

Lemma
If \(A\) is invertible, then columns of \(A\) form a basis for \(\FF^n\).
Proof
In \(\FF^n\) a basis is a set of vectors which is linearly independent and spans \(\FF^n\). By the previous two lemmas, the columns of an invertible matrix \(A\) satisfy both conditions. Hence they form a basis.
Lemma
If \(A\) is invertible, then \(A^T\) is invertible.
Proof

Assume \(A\) is invertible, then there exists a matrix \(B\) such that

\[AB = BA = I.\]

Applying transpose on both sides we get

\[B^T A^T = A^T B^T = I.\]

Thus \(B^T\) is the inverse of \(A^T\) and \(A^T\) is invertible.

Lemma
If \(A\) is invertible then \(A^H\) is invertible.
Proof

Assume \(A\) is invertible, then there exists a matrix \(B\) such that

\[AB = BA = I.\]

Applying conjugate transpose on both sides we get

\[B^H A^H = A^H B^H = I.\]

Thus \(B^H\) is the inverse of \(A^H\) and \(A^H\) is invertible.

Lemma
If \(A\) and \(B\) are invertible then \(AB\) is invertible.
Proof

We note that

\[(AB) (B^{-1}A^{-1}) = A (B B^{-1})A^{-1} = A I A^{-1} = I.\]

Similarly

\[(B^{-1}A^{-1}) (AB) = B^{-1} (A^{-1} A ) B = B^{-1} I B = I.\]

Thus \(B^{-1}A^{-1}\) is the inverse of \(AB\).
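A minimal NumPy sketch of this identity (the matrices and the seed are arbitrary; random square matrices are invertible with probability one):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3))

# (AB)^{-1} coincides with B^{-1} A^{-1}.
assert np.allclose(np.linalg.inv(A @ B),
                   np.linalg.inv(B) @ np.linalg.inv(A))
```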

Lemma
The set of \(n \times n\) invertible matrices forms a group under the matrix multiplication operation.
Proof

We verify the properties of a group

  • [Closure] If \(A\) and \(B\) are invertible then \(AB\) is invertible. Hence the set is closed.
  • [Associativity] Matrix multiplication is associative.
  • [Identity element] \(I\) is invertible and \(AI = IA = A\) for all invertible matrices.
  • [Inverse element] If \(A\) is invertible then \(A^{-1}\) is also invertible.

Thus the set of invertible matrices is indeed a group under matrix multiplication.

Lemma

An \(n \times n\) matrix \(A\) is invertible if and only if it is full rank i.e.

\[\Rank(A) = n.\]
Corollary
An invertible matrix and its inverse have the same rank.
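A minimal NumPy sketch of the rank characterization (size and seed are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))   # invertible with probability one

# An invertible matrix and its inverse are both full rank.
assert np.linalg.matrix_rank(A) == 4
assert np.linalg.matrix_rank(np.linalg.inv(A)) == 4
```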

Similar matrices

Definition

An \(n \times n\) matrix \(B\) is similar to an \(n \times n\) matrix \(A\) if there exists an \(n \times n\) non-singular matrix \(C\) such that

\[B = C^{-1} A C.\]
Lemma
If \(B\) is similar to \(A\) then \(A\) is similar to \(B\). Thus similarity is a symmetric relation.
Proof
\[B = C^{-1} A C \implies A = C B C^{-1} \implies A = (C^{-1})^{-1} B C^{-1}\]

Thus there exists a matrix \(D = C^{-1}\) such that

\[A = D^{-1} B D.\]

Thus \(A\) is similar to \(B\).

Lemma
Similar matrices have the same rank.
Proof

Let \(B\) be similar to \(A\). Then there exists an invertible matrix \(C\) such that

\[B = C^{-1} A C.\]

Since \(C\) is invertible, we have \(\Rank(C) = \Rank(C^{-1}) = n\). Multiplying by a full-rank square matrix does not change rank, hence \(\Rank(AC) = \Rank(A)\) and \(\Rank(C^{-1} (AC)) = \Rank(AC) = \Rank(A)\). Thus

\[\Rank(B) = \Rank(A).\]
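A minimal NumPy sketch illustrating that similarity preserves rank (the rank-2 example and the seed are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(2)
# A rank-2 matrix A and a random invertible matrix C.
A = rng.standard_normal((4, 2)) @ rng.standard_normal((2, 4))
C = rng.standard_normal((4, 4))

B = np.linalg.inv(C) @ A @ C      # B is similar to A
assert np.linalg.matrix_rank(B) == np.linalg.matrix_rank(A) == 2
```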
Lemma
Similarity is an equivalence relation on the set of \(n \times n\) matrices.
Proof
Let \(A, B, C\) be \(n \times n\) matrices.

  • [Reflexivity] \(A\) is similar to itself via the invertible matrix \(I\), since \(A = I^{-1} A I\).
  • [Symmetry] If \(A\) is similar to \(B\), then \(B\) is similar to \(A\) by the previous lemma.
  • [Transitivity] If \(B\) is similar to \(A\) via \(P\), i.e. \(B = P^{-1} A P\), and \(C\) is similar to \(B\) via \(Q\), i.e. \(C = Q^{-1} B Q\), then \(C\) is similar to \(A\) via \(P Q\), since \(C = Q^{-1} P^{-1} A P Q = (P Q)^{-1} A (P Q)\).

Thus similarity is an equivalence relation on the set of \(n \times n\) matrices, and for any \(n \times n\) matrix \(A\), the set of \(n \times n\) matrices similar to \(A\) forms an equivalence class.

Gram matrices

Definition

Gram matrix of columns of \(A\) is given by

\[G = A^H A\]
Definition

Gram matrix of rows of \(A\) is given by

\[G = A A^H\]

This is also known as the frame operator of \(A\).

Remark
Usually, when we talk about the Gram matrix of a matrix, we mean the Gram matrix of its column vectors.
Remark
For real matrix \(A \in \RR^{m \times n}\), the Gram matrix of its column vectors is given by \(A^T A\) and the Gram matrix for its row vectors is given by \(A A^T\).

The following results apply equally well to the real case.

Lemma
The columns of a matrix are linearly dependent if and only if the Gram matrix of its column vectors \(A^H A\) is not invertible.
Proof

Let \(A\) be an \(m\times n\) matrix and \(G = A^H A\) be the Gram matrix of its columns.

If columns of \(A\) are linearly dependent, then there exists a vector \(u \neq 0\) such that

\[A u = 0.\]

Thus

\[G u = A^H A u = 0.\]

Hence the columns of \(G\) are also dependent and \(G\) is not invertible.

Conversely, assume that \(G\) is not invertible. Then the columns of \(G\) are linearly dependent and there exists a vector \(v \neq 0\) such that

\[G v = 0.\]

Now

\[v^H G v = v^H A^H A v = (A v)^H (A v) = \| A v \|_2^2.\]

Since \(G v = 0\), we have \(v^H G v = 0\) and thus

\[\| A v \|_2^2 = 0 \implies A v = 0.\]

Since \(v \neq 0\), the columns of \(A\) are linearly dependent.

Corollary
The columns of a matrix are linearly independent if and only if the Gram matrix of its column vectors \(A^H A\) is invertible.
Proof

Columns of \(A\) can be dependent only if its Gram matrix is not invertible. Thus if the Gram matrix is invertible, then the columns of \(A\) are linearly independent.

The Gram matrix is not invertible only if columns of \(A\) are linearly dependent. Thus if columns of \(A\) are linearly independent then the Gram matrix is invertible.

Corollary
Let \(A\) be a full column rank matrix. Then \(A^H A\) is invertible.
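A minimal NumPy sketch connecting column independence with invertibility of the Gram matrix (the complex test matrix and the seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)
# A tall matrix with linearly independent columns.
A = rng.standard_normal((5, 3)) + 1j * rng.standard_normal((5, 3))
G = A.conj().T @ A
assert np.linalg.matrix_rank(G) == 3   # Gram matrix is invertible

# Force a dependent column; the Gram matrix becomes singular.
A[:, 2] = A[:, 0] + A[:, 1]
G = A.conj().T @ A
assert np.linalg.matrix_rank(G) == 2   # Gram matrix is not invertible
```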
Lemma

The null space of \(A\) and the null space of its Gram matrix \(A^H A\) coincide, i.e.

\[\NullSpace(A) = \NullSpace(A^H A).\]
Proof

Let \(u \in \NullSpace(A)\). Then

\[A u = 0 \implies A^H A u = 0.\]

Thus

\[u \in \NullSpace(A^HA ) \implies \NullSpace(A) \subseteq \NullSpace(A^H A).\]

Now let \(u \in \NullSpace(A^H A)\). Then

\[A^H A u = 0 \implies u^H A^H A u = 0 \implies \| A u \|_2^2 = 0 \implies A u = 0.\]

Thus we have

\[u \in \NullSpace(A ) \implies \NullSpace(A^H A) \subseteq \NullSpace(A).\]
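A minimal NumPy sketch of the coincidence of the two null spaces, extracting a null-space basis of the Gram matrix from its SVD (shapes, seed, and tolerances are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(4)
# A 4x5 matrix of rank 3, so its null space has dimension 2.
A = rng.standard_normal((4, 3)) @ rng.standard_normal((3, 5))
G = A.conj().T @ A

# Null-space basis of G: rows of Vh with (numerically) zero singular values.
_, s, Vh = np.linalg.svd(G)
null_basis = Vh[s < 1e-10]

# The dimensions match, and every null vector of G is a null vector of A.
assert null_basis.shape[0] == 5 - np.linalg.matrix_rank(A)
assert np.allclose(A @ null_basis.T, 0, atol=1e-6)
```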
Lemma
The rows of a matrix \(A\) are linearly dependent if and only if the Gram matrix of its row vectors \(AA^H\) is not invertible.
Proof

The rows of \(A\) are linearly dependent if and only if the columns of \(A^H\) are linearly dependent. So if the rows of \(A\) are dependent, there exists a vector \(v \neq 0\) such that

\[A^H v = 0\]

Thus

\[G v = A A^H v = 0.\]

Since \(v \neq 0\), the columns of \(G\) are linearly dependent and hence \(G\) is not invertible.

Conversely, assume that \(G\) is not invertible. Then there exists a vector \(u \neq 0\) such that

\[G u = 0.\]

Now

\[u^H G u = u^H A A^H u = (A^H u)^H (A^H u) = \| A^H u \|_2^2 = 0 \implies A^H u = 0.\]

Since \(u \neq 0\), the columns of \(A^H\), and consequently the rows of \(A\), are linearly dependent.

Corollary
The rows of a matrix \(A\) are linearly independent if and only if the Gram matrix of its row vectors \(AA^H\) is invertible.
Corollary
Let \(A\) be a full row rank matrix. Then \(A A^H\) is invertible.

Pseudo-inverses

Definition

Let \(A\) be an \(m \times n\) matrix. An \(n\times m\) matrix \(A^{\dag}\) is called its Moore-Penrose pseudo-inverse if it satisfies all of the following criteria:

  • \(A A^{\dag} A = A\).
  • \(A^{\dag} A A^{\dag} = A^{\dag}\).
  • \(\left(A A^{\dag} \right)^H = A A^{\dag}\) i.e. \(A A^{\dag}\) is Hermitian.
  • \((A^{\dag} A)^H = A^{\dag} A\) i.e. \(A^{\dag} A\) is Hermitian.
Theorem
For any matrix \(A\) there exists precisely one matrix \(A^{\dag}\) which satisfies all the requirements above.

We omit the proof for this. The pseudo-inverse can actually be obtained from the singular value decomposition of \(A\), as shown in the section on the singular value decomposition.
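Numerically, the pseudo-inverse is available as np.linalg.pinv; a minimal NumPy sketch (a random complex matrix with arbitrary shape and seed) checking the four conditions:

```python
import numpy as np

rng = np.random.default_rng(5)
A = rng.standard_normal((4, 6)) + 1j * rng.standard_normal((4, 6))
Ap = np.linalg.pinv(A)

# The four Moore-Penrose conditions.
assert np.allclose(A @ Ap @ A, A)
assert np.allclose(Ap @ A @ Ap, Ap)
assert np.allclose((A @ Ap).conj().T, A @ Ap)   # A A^dag is Hermitian
assert np.allclose((Ap @ A).conj().T, Ap @ A)   # A^dag A is Hermitian
```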

Lemma

Let \(D = \Diag(d_1, d_2, \dots, d_n)\) be an \(n \times n\) diagonal matrix. Then its Moore-Penrose pseudo-inverse is \(D^{\dag} = \Diag(c_1, c_2, \dots, c_n)\) where

\[\begin{split}c_i = \left\{ \begin{array}{ll} \frac{1}{d_i} & \mbox{if $d_i \neq 0$};\\ 0 & \mbox{if $d_i = 0$}. \end{array} \right.\end{split}\]
Proof

We note that \(D^{\dag} D = D D^{\dag} = F = \Diag(f_1, f_2, \dots, f_n)\) where

\[\begin{split}f_i = \left\{ \begin{array}{ll} 1 & \mbox{if $d_i \neq 0$};\\ 0 & \mbox{if $d_i = 0$}. \end{array} \right.\end{split}\]

We now verify the four requirements in the definition above.

\[D D^{\dag} D = F D = D.\]
\[D^{\dag} D D^{\dag} = F D^{\dag} = D^{\dag}\]

\(D^{\dag} D = D D^{\dag} = F\) is a real diagonal matrix, hence Hermitian.
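A minimal NumPy sketch of this rule (the diagonal entries are arbitrary, including a zero):

```python
import numpy as np

d = np.array([2.0, 0.0, 5.0])
D = np.diag(d)

# Reciprocate the nonzero entries; keep the zero entries as zero.
c = np.divide(1.0, d, out=np.zeros_like(d), where=(d != 0))
assert np.allclose(np.diag(c), np.linalg.pinv(D))
```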

Lemma

Let \(D = \Diag(d_1, d_2, \dots, d_p)\) be an \(m \times n\) rectangular diagonal matrix where \(p = \min(m, n)\). Then its Moore-Penrose pseudo-inverse is an \(n \times m\) rectangular diagonal matrix \(D^{\dag} = \Diag(c_1, c_2, \dots, c_p)\) where

\[\begin{split}c_i = \left\{ \begin{array}{ll} \frac{1}{d_i} & \mbox{if $d_i \neq 0$};\\ 0 & \mbox{if $d_i = 0$}. \end{array} \right.\end{split}\]
Proof

\(F = D^{\dag} D = \Diag(f_1, f_2, \dots, f_n)\) is an \(n \times n\) matrix where

\[\begin{split}f_i = \left\{ \begin{array}{ll} 1 & \mbox{if $i \leq p$ and $d_i \neq 0$};\\ 0 & \mbox{otherwise}. \end{array} \right.\end{split}\]

\(G = D D^{\dag} = \Diag(g_1, g_2, \dots, g_m)\) is an \(m \times m\) matrix where

\[\begin{split}g_i = \left\{ \begin{array}{ll} 1 & \mbox{if $i \leq p$ and $d_i \neq 0$};\\ 0 & \mbox{otherwise}. \end{array} \right.\end{split}\]

We now verify the four requirements in the definition above.

\[D D^{\dag} D = D F = D.\]
\[D^{\dag} D D^{\dag} = D^{\dag} G = D^{\dag}\]

\(F = D^{\dag} D\) and \(G = D D^{\dag}\) are real diagonal matrices, hence Hermitian.

Lemma

If \(A\) has full column rank then its Moore-Penrose pseudo-inverse is given by

\[A^{\dag} = (A^H A)^{-1} A^H.\]

It is a left inverse of \(A\).

Proof

Since \(A\) has full column rank, \(A^H A\) is invertible by the corollary above.

First we verify that \(A^{\dag}\) is a left inverse of \(A\).

\[A^{\dag} A = (A^H A)^{-1} A^H A = I.\]

We now verify all the properties.

\[A A^{\dag} A = A I = A.\]
\[A^{\dag} A A^{\dag} = I A^{\dag} = A^{\dag}.\]

Hermitian properties (using the fact that \((A^H A)^{-1}\) is Hermitian, since \(A^H A\) is):

\[\left(A A^{\dag} \right)^H = \left(A (A^H A)^{-1} A^H \right)^H = \left(A (A^H A)^{-1} A^H \right) = A A^{\dag}.\]
\[(A^{\dag} A)^H = I^H = I = A^{\dag} A.\]
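A minimal NumPy sketch of this formula (a random tall matrix; shape and seed are arbitrary, and such a matrix has full column rank with probability one):

```python
import numpy as np

rng = np.random.default_rng(6)
A = rng.standard_normal((6, 3))
left = np.linalg.inv(A.conj().T @ A) @ A.conj().T

assert np.allclose(left @ A, np.eye(3))        # a left inverse of A
assert np.allclose(left, np.linalg.pinv(A))    # matches the Moore-Penrose pseudo-inverse
```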
Lemma

If \(A\) has full row rank then its Moore-Penrose pseudo-inverse is given by

\[A^{\dag} = A^H (A A^H)^{-1} .\]

It is a right inverse of \(A\).

Proof

Since \(A\) has full row rank, \(A A^H\) is invertible by the corollary above.

First we verify that \(A^{\dag}\) is a right inverse of \(A\).

\[A A^{\dag} = A A^H (A A^H)^{-1} = I.\]

We now verify all the properties.

\[A A^{\dag} A = I A = A.\]
\[A^{\dag} A A^{\dag} = A^{\dag} I = A^{\dag}.\]

Hermitian properties (using the fact that \((A A^H)^{-1}\) is Hermitian, since \(A A^H\) is):

\[\left(A A^{\dag} \right)^H = I^H = I = A A^{\dag}.\]
\[(A^{\dag} A)^H = \left (A^H (A A^H)^{-1} A \right )^H = A^H (A A^H)^{-1} A = A^{\dag} A.\]
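A minimal NumPy sketch of this formula (a random wide matrix; shape and seed are arbitrary, and such a matrix has full row rank with probability one):

```python
import numpy as np

rng = np.random.default_rng(7)
A = rng.standard_normal((3, 6))
right = A.conj().T @ np.linalg.inv(A @ A.conj().T)

assert np.allclose(A @ right, np.eye(3))       # a right inverse of A
assert np.allclose(right, np.linalg.pinv(A))   # matches the Moore-Penrose pseudo-inverse
```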