Invertible matrices

Definition

A square matrix \(A\) is called invertible if there exists another square matrix \(B\) of the same size such that

\[AB = BA = I.\]

The matrix \(B\) is called the inverse of \(A\) and is denoted as \(A^{-1}\).
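As a quick numerical illustration (a minimal NumPy sketch; the \(2 \times 2\) matrix below is an arbitrary example), both products can be checked against the identity:

```python
import numpy as np

# An arbitrary invertible 2x2 matrix chosen for illustration.
A = np.array([[2.0, 1.0],
              [1.0, 1.0]])
B = np.linalg.inv(A)

# Both AB and BA equal the identity up to floating point error.
assert np.allclose(A @ B, np.eye(2))
assert np.allclose(B @ A, np.eye(2))
```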

Lemma
If \(A\) is invertible then its inverse \(A^{-1}\) is also invertible and \((A^{-1})^{-1} = A\).
Lemma
The identity matrix \(I\) is invertible.
Proof
\[I I = I \implies I^{-1} = I.\]
Lemma
If \(A\) is invertible then columns of \(A\) are linearly independent.
Proof

Assume \(A\) is invertible, then there exists a matrix \(B\) such that

\[AB = BA = I.\]

Assume that columns of \(A\) are linearly dependent. Then there exists \(u \neq 0\) such that

\[A u = 0 \implies BA u = 0 \implies I u = 0 \implies u = 0\]

a contradiction. Hence columns of \(A\) are linearly independent.

Lemma
If an \(n\times n\) matrix \(A\) is invertible then columns of \(A\) span \(\FF^n\).
Proof

Assume \(A\) is invertible, then there exists a matrix \(B\) such that

\[AB = BA = I.\]

Now let \(x \in \FF^n\) be an arbitrary vector. We need to show that there exists \(\alpha \in \FF^n\) such that

\[x = A \alpha.\]

But

\[x = I x = AB x = A ( B x).\]

Thus if we choose \(\alpha = Bx\), then

\[x = A \alpha.\]

Thus columns of \(A\) span \(\FF^n\).

Lemma
If \(A\) is invertible, then columns of \(A\) form a basis for \(\FF^n\).
Proof
In \(\FF^n\) a basis is a set of vectors which is linearly independent and spans \(\FF^n\). By the previous two lemmas, the columns of an invertible matrix \(A\) satisfy both conditions. Hence they form a basis.
Lemma
If \(A\) is invertible, then \(A^T\) is invertible.
Proof

Assume \(A\) is invertible, then there exists a matrix \(B\) such that

\[AB = BA = I.\]

Applying transpose on both sides we get

\[B^T A^T = A^T B^T = I.\]

Thus \(B^T\) is the inverse of \(A^T\) and \(A^T\) is invertible.

Lemma
If \(A\) is invertible then \(A^H\) is invertible.
Proof

Assume \(A\) is invertible, then there exists a matrix \(B\) such that

\[AB = BA = I.\]

Applying conjugate transpose on both sides we get

\[B^H A^H = A^H B^H = I.\]

Thus \(B^H\) is the inverse of \(A^H\) and \(A^H\) is invertible.

Lemma
If \(A\) and \(B\) are invertible then \(AB\) is invertible.
Proof

We note that

\[(AB) (B^{-1}A^{-1}) = A (B B^{-1})A^{-1} = A I A^{-1} = I.\]

Similarly

\[(B^{-1}A^{-1}) (AB) = B^{-1} (A^{-1} A ) B = B^{-1} I B = I.\]

Thus \(B^{-1}A^{-1}\) is the inverse of \(AB\).
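A minimal NumPy sketch of this identity (the matrices and the seed are arbitrary; random square matrices are invertible with probability one):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3))

# (AB)^{-1} coincides with B^{-1} A^{-1}.
assert np.allclose(np.linalg.inv(A @ B),
                   np.linalg.inv(B) @ np.linalg.inv(A))
```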

Lemma
The set of \(n \times n\) invertible matrices forms a group under the matrix multiplication operation.
Proof

We verify the properties of a group

  • [Closure] If \(A\) and \(B\) are invertible then \(AB\) is invertible. Hence the set is closed.
  • [Associativity] Matrix multiplication is associative.
  • [Identity element] \(I\) is invertible and \(AI = IA = A\) for all invertible matrices.
  • [Inverse element] If \(A\) is invertible then \(A^{-1}\) is also invertible.

Thus the set of invertible matrices is indeed a group under matrix multiplication.

Lemma

An \(n \times n\) matrix \(A\) is invertible if and only if it is full rank i.e.

\[\Rank(A) = n.\]
Corollary
An invertible matrix and its inverse have the same rank.
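A minimal NumPy sketch of the rank characterization (size and seed are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))   # invertible with probability one

# An invertible matrix and its inverse are both full rank.
assert np.linalg.matrix_rank(A) == 4
assert np.linalg.matrix_rank(np.linalg.inv(A)) == 4
```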

Similar matrices

Definition

An \(n \times n\) matrix \(B\) is similar to an \(n \times n\) matrix \(A\) if there exists an \(n \times n\) non-singular matrix \(C\) such that

\[B = C^{-1} A C.\]
Lemma
If \(B\) is similar to \(A\) then \(A\) is similar to \(B\). Thus similarity is a symmetric relation.
Proof
\[B = C^{-1} A C \implies A = C B C^{-1} \implies A = (C^{-1})^{-1} B C^{-1}\]

Thus there exists a matrix \(D = C^{-1}\) such that

\[A = D^{-1} B D.\]

Thus \(A\) is similar to \(B\).

Lemma
Similar matrices have the same rank.
Proof

Let \(B\) be similar to \(A\). Then there exists an invertible matrix \(C\) such that

\[B = C^{-1} A C.\]

Since \(C\) is invertible, we have \(\Rank(C) = \Rank(C^{-1}) = n\). Multiplying by a full-rank square matrix does not change rank, hence \(\Rank(AC) = \Rank(A)\) and \(\Rank(C^{-1} (AC)) = \Rank(AC) = \Rank(A)\). Thus

\[\Rank(B) = \Rank(A).\]
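A minimal NumPy sketch illustrating that similarity preserves rank (the rank-2 example and the seed are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(2)
# A rank-2 matrix A and a random invertible matrix C.
A = rng.standard_normal((4, 2)) @ rng.standard_normal((2, 4))
C = rng.standard_normal((4, 4))

B = np.linalg.inv(C) @ A @ C      # B is similar to A
assert np.linalg.matrix_rank(B) == np.linalg.matrix_rank(A) == 2
```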
Lemma
Similarity is an equivalence relation on the set of \(n \times n\) matrices.
Proof
Let \(A, B, C\) be \(n \times n\) matrices.

  • [Reflexivity] \(A\) is similar to itself via the invertible matrix \(I\), since \(A = I^{-1} A I\).
  • [Symmetry] If \(A\) is similar to \(B\), then \(B\) is similar to \(A\) by the previous lemma.
  • [Transitivity] If \(B\) is similar to \(A\) via \(P\), i.e. \(B = P^{-1} A P\), and \(C\) is similar to \(B\) via \(Q\), i.e. \(C = Q^{-1} B Q\), then \(C\) is similar to \(A\) via \(P Q\), since \(C = Q^{-1} P^{-1} A P Q = (P Q)^{-1} A (P Q)\).

Thus similarity is an equivalence relation on the set of \(n \times n\) matrices, and for any \(n \times n\) matrix \(A\), the set of \(n \times n\) matrices similar to \(A\) forms an equivalence class.

Gram matrices

Definition

Gram matrix of columns of \(A\) is given by

\[G = A^H A\]
Definition

Gram matrix of rows of \(A\) is given by

\[G = A A^H\]

This is also known as the frame operator of \(A\).

Remark
Usually, when we talk about the Gram matrix of a matrix, we mean the Gram matrix of its column vectors.
Remark
For real matrix \(A \in \RR^{m \times n}\), the Gram matrix of its column vectors is given by \(A^T A\) and the Gram matrix for its row vectors is given by \(A A^T\).

The following results apply equally well to the real case.

Lemma
The columns of a matrix are linearly dependent if and only if the Gram matrix of its column vectors \(A^H A\) is not invertible.
Proof

Let \(A\) be an \(m\times n\) matrix and \(G = A^H A\) be the Gram matrix of its columns.

If columns of \(A\) are linearly dependent, then there exists a vector \(u \neq 0\) such that

\[A u = 0.\]

Thus

\[G u = A^H A u = 0.\]

Hence the columns of \(G\) are also dependent and \(G\) is not invertible.

Conversely, assume that \(G\) is not invertible. Then the columns of \(G\) are linearly dependent and there exists a vector \(v \neq 0\) such that

\[G v = 0.\]

Now

\[v^H G v = v^H A^H A v = (A v)^H (A v) = \| A v \|_2^2.\]

Since \(G v = 0\), we have \(v^H G v = 0\) and thus

\[\| A v \|_2^2 = 0 \implies A v = 0.\]

Since \(v \neq 0\), the columns of \(A\) are linearly dependent.

Corollary
The columns of a matrix are linearly independent if and only if the Gram matrix of its column vectors \(A^H A\) is invertible.
Proof

Columns of \(A\) can be dependent only if its Gram matrix is not invertible. Thus if the Gram matrix is invertible, then the columns of \(A\) are linearly independent.

The Gram matrix is not invertible only if columns of \(A\) are linearly dependent. Thus if columns of \(A\) are linearly independent then the Gram matrix is invertible.

Corollary
Let \(A\) be a full column rank matrix. Then \(A^H A\) is invertible.
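A minimal NumPy sketch connecting column independence with invertibility of the Gram matrix (the complex test matrix and the seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)
# A tall matrix with linearly independent columns.
A = rng.standard_normal((5, 3)) + 1j * rng.standard_normal((5, 3))
G = A.conj().T @ A
assert np.linalg.matrix_rank(G) == 3   # Gram matrix is invertible

# Force a dependent column; the Gram matrix becomes singular.
A[:, 2] = A[:, 0] + A[:, 1]
G = A.conj().T @ A
assert np.linalg.matrix_rank(G) == 2   # Gram matrix is not invertible
```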
Lemma

The null space of \(A\) and the null space of its Gram matrix \(A^H A\) coincide, i.e.

\[\NullSpace(A) = \NullSpace(A^H A).\]
Proof

Let \(u \in \NullSpace(A)\). Then

\[A u = 0 \implies A^H A u = 0.\]

Thus

\[u \in \NullSpace(A^HA ) \implies \NullSpace(A) \subseteq \NullSpace(A^H A).\]

Now let \(u \in \NullSpace(A^H A)\). Then

\[A^H A u = 0 \implies u^H A^H A u = 0 \implies \| A u \|_2^2 = 0 \implies A u = 0.\]

Thus we have

\[u \in \NullSpace(A ) \implies \NullSpace(A^H A) \subseteq \NullSpace(A).\]
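A minimal NumPy sketch of the coincidence of the two null spaces, extracting a null-space basis of the Gram matrix from its SVD (shapes, seed, and tolerances are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(4)
# A 4x5 matrix of rank 3, so its null space has dimension 2.
A = rng.standard_normal((4, 3)) @ rng.standard_normal((3, 5))
G = A.conj().T @ A

# Null-space basis of G: rows of Vh with (numerically) zero singular values.
_, s, Vh = np.linalg.svd(G)
null_basis = Vh[s < 1e-10]

# The dimensions match, and every null vector of G is a null vector of A.
assert null_basis.shape[0] == 5 - np.linalg.matrix_rank(A)
assert np.allclose(A @ null_basis.T, 0, atol=1e-6)
```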
Lemma
The rows of a matrix \(A\) are linearly dependent if and only if the Gram matrix of its row vectors \(AA^H\) is not invertible.
Proof

The rows of \(A\) are linearly dependent if and only if the columns of \(A^H\) are linearly dependent. So if the rows of \(A\) are dependent, there exists a vector \(v \neq 0\) such that

\[A^H v = 0\]

Thus

\[G v = A A^H v = 0.\]

Since \(v \neq 0\), the columns of \(G\) are linearly dependent and hence \(G\) is not invertible.

Conversely, assume that \(G\) is not invertible. Then there exists a vector \(u \neq 0\) such that

\[G u = 0.\]

Now

\[u^H G u = u^H A A^H u = (A^H u)^H (A^H u) = \| A^H u \|_2^2 = 0 \implies A^H u = 0.\]

Since \(u \neq 0\), the columns of \(A^H\), and consequently the rows of \(A\), are linearly dependent.

Corollary
The rows of a matrix \(A\) are linearly independent if and only if the Gram matrix of its row vectors \(AA^H\) is invertible.
Corollary
Let \(A\) be a full row rank matrix. Then \(A A^H\) is invertible.

Pseudo-inverses

Definition

Let \(A\) be an \(m \times n\) matrix. An \(n\times m\) matrix \(A^{\dag}\) is called its Moore-Penrose pseudo-inverse if it satisfies all of the following criteria:

  • \(A A^{\dag} A = A\).
  • \(A^{\dag} A A^{\dag} = A^{\dag}\).
  • \(\left(A A^{\dag} \right)^H = A A^{\dag}\) i.e. \(A A^{\dag}\) is Hermitian.
  • \((A^{\dag} A)^H = A^{\dag} A\) i.e. \(A^{\dag} A\) is Hermitian.
Theorem
For any matrix \(A\) there exists precisely one matrix \(A^{\dag}\) which satisfies all the requirements above.

We omit the proof for this. The pseudo-inverse can actually be obtained from the singular value decomposition of \(A\), as shown in the section on the singular value decomposition.
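Numerically, the pseudo-inverse is available as np.linalg.pinv; a minimal NumPy sketch (a random complex matrix with arbitrary shape and seed) checking the four conditions:

```python
import numpy as np

rng = np.random.default_rng(5)
A = rng.standard_normal((4, 6)) + 1j * rng.standard_normal((4, 6))
Ap = np.linalg.pinv(A)

# The four Moore-Penrose conditions.
assert np.allclose(A @ Ap @ A, A)
assert np.allclose(Ap @ A @ Ap, Ap)
assert np.allclose((A @ Ap).conj().T, A @ Ap)   # A A^dag is Hermitian
assert np.allclose((Ap @ A).conj().T, Ap @ A)   # A^dag A is Hermitian
```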

Lemma

Let \(D = \Diag(d_1, d_2, \dots, d_n)\) be an \(n \times n\) diagonal matrix. Then its Moore-Penrose pseudo-inverse is \(D^{\dag} = \Diag(c_1, c_2, \dots, c_n)\) where

\[\begin{split}c_i = \left\{ \begin{array}{ll} \frac{1}{d_i} & \mbox{if $d_i \neq 0$};\\ 0 & \mbox{if $d_i = 0$}. \end{array} \right.\end{split}\]
Proof

We note that \(D^{\dag} D = D D^{\dag} = F = \Diag(f_1, f_2, \dots, f_n)\) where

\[\begin{split}f_i = \left\{ \begin{array}{ll} 1 & \mbox{if $d_i \neq 0$};\\ 0 & \mbox{if $d_i = 0$}. \end{array} \right.\end{split}\]

We now verify the four requirements in the definition above.

\[D D^{\dag} D = F D = D.\]
\[D^{\dag} D D^{\dag} = F D^{\dag} = D^{\dag}\]

\(D^{\dag} D = D D^{\dag} = F\) is a real diagonal matrix, hence Hermitian.
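A minimal NumPy sketch of this rule (the diagonal entries are arbitrary, including a zero):

```python
import numpy as np

d = np.array([2.0, 0.0, 5.0])
D = np.diag(d)

# Reciprocate the nonzero entries; keep the zero entries as zero.
c = np.divide(1.0, d, out=np.zeros_like(d), where=(d != 0))
assert np.allclose(np.diag(c), np.linalg.pinv(D))
```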

Lemma

Let \(D = \Diag(d_1, d_2, \dots, d_p)\) be an \(m \times n\) rectangular diagonal matrix where \(p = \min(m, n)\). Then its Moore-Penrose pseudo-inverse is an \(n \times m\) rectangular diagonal matrix \(D^{\dag} = \Diag(c_1, c_2, \dots, c_p)\) where

\[\begin{split}c_i = \left\{ \begin{array}{ll} \frac{1}{d_i} & \mbox{if $d_i \neq 0$};\\ 0 & \mbox{if $d_i = 0$}. \end{array} \right.\end{split}\]
Proof

\(F = D^{\dag} D = \Diag(f_1, f_2, \dots, f_n)\) is an \(n \times n\) matrix where

\[\begin{split}f_i = \left\{ \begin{array}{ll} 1 & \mbox{if $i \leq p$ and $d_i \neq 0$};\\ 0 & \mbox{otherwise}. \end{array} \right.\end{split}\]

\(G = D D^{\dag} = \Diag(g_1, g_2, \dots, g_m)\) is an \(m \times m\) matrix where

\[\begin{split}g_i = \left\{ \begin{array}{ll} 1 & \mbox{if $i \leq p$ and $d_i \neq 0$};\\ 0 & \mbox{otherwise}. \end{array} \right.\end{split}\]

We now verify the four requirements in the definition above.

\[D D^{\dag} D = D F = D.\]
\[D^{\dag} D D^{\dag} = D^{\dag} G = D^{\dag}\]

\(F = D^{\dag} D\) and \(G = D D^{\dag}\) are real diagonal matrices, hence Hermitian.

Lemma

If \(A\) has full column rank then its Moore-Penrose pseudo-inverse is given by

\[A^{\dag} = (A^H A)^{-1} A^H.\]

It is a left inverse of \(A\).

Proof

Since \(A\) has full column rank, \(A^H A\) is invertible by the corollary above.

First we verify that \(A^{\dag}\) is a left inverse of \(A\).

\[A^{\dag} A = (A^H A)^{-1} A^H A = I.\]

We now verify all the properties.

\[A A^{\dag} A = A I = A.\]
\[A^{\dag} A A^{\dag} = I A^{\dag} = A^{\dag}.\]

Hermitian properties (using the fact that \((A^H A)^{-1}\) is Hermitian, since \(A^H A\) is):

\[\left(A A^{\dag} \right)^H = \left(A (A^H A)^{-1} A^H \right)^H = \left(A (A^H A)^{-1} A^H \right) = A A^{\dag}.\]
\[(A^{\dag} A)^H = I^H = I = A^{\dag} A.\]
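A minimal NumPy sketch of this formula (a random tall matrix; shape and seed are arbitrary, and such a matrix has full column rank with probability one):

```python
import numpy as np

rng = np.random.default_rng(6)
A = rng.standard_normal((6, 3))
left = np.linalg.inv(A.conj().T @ A) @ A.conj().T

assert np.allclose(left @ A, np.eye(3))        # a left inverse of A
assert np.allclose(left, np.linalg.pinv(A))    # matches the Moore-Penrose pseudo-inverse
```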
Lemma

If \(A\) has full row rank then its Moore-Penrose pseudo-inverse is given by

\[A^{\dag} = A^H (A A^H)^{-1} .\]

It is a right inverse of \(A\).

Proof

Since \(A\) has full row rank, \(A A^H\) is invertible by the corollary above.

First we verify that \(A^{\dag}\) is a right inverse of \(A\).

\[A A^{\dag} = A A^H (A A^H)^{-1} = I.\]

We now verify all the properties.

\[A A^{\dag} A = I A = A.\]
\[A^{\dag} A A^{\dag} = A^{\dag} I = A^{\dag}.\]

Hermitian properties (using the fact that \((A A^H)^{-1}\) is Hermitian, since \(A A^H\) is):

\[\left(A A^{\dag} \right)^H = I^H = I = A A^{\dag}.\]
\[(A^{\dag} A)^H = \left (A^H (A A^H)^{-1} A \right )^H = A^H (A A^H)^{-1} A = A^{\dag} A.\]
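A minimal NumPy sketch of this formula (a random wide matrix; shape and seed are arbitrary, and such a matrix has full row rank with probability one):

```python
import numpy as np

rng = np.random.default_rng(7)
A = rng.standard_normal((3, 6))
right = A.conj().T @ np.linalg.inv(A @ A.conj().T)

assert np.allclose(A @ right, np.eye(3))       # a right inverse of A
assert np.allclose(right, np.linalg.pinv(A))   # matches the Moore-Penrose pseudo-inverse
```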