Singular values

In the previous section we saw the diagonalization of square matrices, which results in an eigenvalue decomposition. This matrix factorization is very useful, yet it is not applicable in all situations. In particular, the eigenvalue decomposition does not exist if the square matrix is not diagonalizable, and it is not defined at all if the matrix is not square. Moreover, the decomposition is particularly convenient only for real symmetric or Hermitian matrices, for which the diagonalizing matrix is \(\FF\)-unitary (see here). Otherwise, one also has to compute the inverse of the diagonalizing matrix.

Fortunately, there is another decomposition which applies to all matrices and involves only \(\FF\)-unitary matrices.

Definition

A non-negative real number \(\sigma\) is a singular value for a matrix \(A \in \FF^{m \times n}\) if and only if there exist unit-length vectors \(u \in \FF^m\) and \(v \in \FF^n\) such that

\[A v = \sigma u\]

and

\[A^H u = \sigma v\]

hold. The vectors \(u\) and \(v\) are called left-singular and right-singular vectors for \(\sigma\) respectively.

We first present the basic result of singular value decomposition. We will not prove this result completely, although we will present proofs of some of its aspects.

Theorem

For every \(A \in \FF^{m \times n}\) with \(k = \min(m , n)\), there exist two \(\FF\)-unitary matrices \(U \in \FF^{m \times m}\) and \(V \in \FF^{n \times n}\) and a sequence of real numbers

\[\sigma_1 \geq \sigma_2 \geq \dots \geq \sigma_k \geq 0\]

such that

(1)\[U^H A V = \Sigma\]

where

\[\Sigma = \Diag(\sigma_1, \sigma_2, \dots, \sigma_k) \in \FF^{ m \times n}.\]

The non-negative real numbers \(\sigma_i\) are the singular values of \(A\) as per here.

The sequence of real numbers \(\sigma_i\) doesn’t depend on the particular choice of \(U\) and \(V\).

\(\Sigma\) is rectangular with the same size as \(A\). The singular values of \(A\) lie on the principal diagonal of \(\Sigma\). All other entries in \(\Sigma\) are zero.

It is certainly possible that some of the singular values are 0 themselves.
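The theorem can be illustrated numerically for a rectangular matrix. The following sketch (assuming NumPy) assembles the rectangular \(\Sigma\) from the returned singular values and verifies \(U^H A V = \Sigma\):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])
m, n = A.shape
U, s, Vh = np.linalg.svd(A)   # full U (m x m) and V^H (n x n) by default

# Build the m x n Sigma of the theorem: singular values on the
# principal diagonal, zeros elsewhere.
k = min(m, n)
Sigma = np.zeros((m, n))
Sigma[:k, :k] = np.diag(s)

ok = np.allclose(U.conj().T @ A @ Vh.conj().T, Sigma)
```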

Remark

Since \(U^H A V = \Sigma\), it follows that

(2)\[A = U \Sigma V^H.\]
Definition

The decomposition of a matrix \(A \in \FF^{m \times n}\) given by

(3)\[A = U \Sigma V^H\]

is known as its singular value decomposition.

Remark

When \(\FF\) is \(\RR\) then the decomposition simplifies to

(4)\[U^T A V = \Sigma\]

and

\[A = U \Sigma V^T.\]
Remark
Clearly there can be at most \(k= \min(m , n)\) distinct singular values of \(A\).
Remark

We can also write

(5)\[A V = U \Sigma.\]
Remark

Let us expand

\[\begin{split}A = U \Sigma V^H = \begin{bmatrix} u_1 & u_2 & \dots & u_m \end{bmatrix} \begin{bmatrix} \sigma_{ij} \end{bmatrix} \begin{bmatrix} v_1^H \\ v_2^H \\ \vdots \\ v_n^H \end{bmatrix} = \sum_{i=1}^m \sum_{j=1}^n \sigma_{ij} u_i v_j^H.\end{split}\]

Since \(\sigma_{ij} = 0\) whenever \(i \neq j\), the double sum reduces to \(A = \sum_{i=1}^k \sigma_i u_i v_i^H\).
Remark

Alternatively, let us expand

\[\begin{split}\Sigma = U^H AV = \begin{bmatrix} u_1^H \\ u_2^H \\ \vdots \\ u_m^H \end{bmatrix} A \begin{bmatrix} v_1 & v_2 & \dots & v_n \end{bmatrix} = \begin{bmatrix} u_i^H A v_j \end{bmatrix}\end{split}\]

This gives us

\[\sigma_{i j} = u_i^H A v_j.\]

The following lemma verifies that \(\Sigma\) indeed consists of singular values of \(A\) as per here.

Lemma
Let \(A = U \Sigma V^H\) be a singular value decomposition of \(A\). Then the main diagonal entries of \(\Sigma\) are singular values. The first \(k = \min(m, n)\) column vectors in \(U\) and \(V\) are left and right singular vectors of \(A\).
Proof

We have

\[AV = U \Sigma.\]

Let us expand the R.H.S.

\[U \Sigma = \begin{bmatrix}\sum_{j=1}^m u_{i j} \sigma_{j k} \end{bmatrix} = [u_{i k} \sigma_k] = \begin{bmatrix} \sigma_1 u_1 & \sigma_2 u_2 & \dots & \sigma_k u_k & 0 & \dots & 0 \end{bmatrix}\]

where the \(0\) columns at the end appear \(n - k\) times.

Expanding the L.H.S. we get

\[A V = \begin{bmatrix} A v_1 & A v_2 & \dots & A v_n \end{bmatrix}.\]

Thus by comparing both sides we get

\[A v_i = \sigma_i u_i \; \text{ for } \; 1 \leq i \leq k\]

and

\[A v_i = 0 \text{ for } k < i \leq n.\]

Now let us start with

\[A = U \Sigma V^H \implies A^H = V \Sigma^H U^H \implies A^H U = V \Sigma^H.\]

Let us expand the R.H.S.

\[V \Sigma^H = \begin{bmatrix}\sum_{j=1}^n v_{i j} \sigma_{j k} \end{bmatrix} = [v_{i k} \sigma_k] = \begin{bmatrix} \sigma_1 v_1 & \sigma_2 v_2 & \dots & \sigma_k v_k & 0 & \dots & 0 \end{bmatrix}\]

where the \(0\) columns at the end appear \(m - k\) times.

Expanding the L.H.S. we get

\[ A^H U = \begin{bmatrix} A^H u_1 & A^H u_2 & \dots & A^H u_m \end{bmatrix}.\]

Thus by comparing both sides we get

\[A^H u_i = \sigma_i v_i \; \text{ for } \; 1 \leq i \leq k\]

and

\[A^H u_i = 0 \text{ for } k < i \leq m.\]

We now consider the three cases.

For \(m = n\), we have \(k = m = n\), and we get

\[A v_i = \sigma_i u_i, \; A^H u_i = \sigma_i v_i \; \text{ for } \; 1 \leq i \leq m.\]

Thus \(\sigma_i\) is a singular value of \(A\) and \(u_i\) is a left singular vector while \(v_i\) is a right singular vector.

For \(m < n\), we have \(k = m\). We get for first \(m\) vectors in \(V\)

\[A v_i = \sigma_i u_i, A^H u_i = \sigma_i v_i \; \text{ for } \; 1 \leq i \leq m.\]

Finally for remaining \(n-m\) vectors in \(V\), we can write

\[A v_i = 0.\]

They belong to the null space of \(A\).

For \(m > n\), we have \(k = n\). We get for first \(n\) vectors in \(U\)

\[A v_i = \sigma_i u_i, A^H u_i = \sigma_i v_i \; \text{ for } \; 1 \leq i \leq n.\]

Finally for remaining \(m - n\) vectors in \(U\), we can write

\[A^H u_i = 0.\]
Lemma

\(\Sigma \Sigma^H\) is an \(m \times m\) matrix given by

\[\Sigma \Sigma^H = \Diag(\sigma_1^2, \sigma_2^2, \dots \sigma_k^{2}, 0, 0,\dots 0)\]

where the number of \(0\)'s following \(\sigma_k^{2}\) is \(m - k\).

Lemma

\(\Sigma^H \Sigma\) is an \(n \times n\) matrix given by

\[\Sigma^H \Sigma = \Diag(\sigma_1^2, \sigma_2^2, \dots \sigma_k^{2}, 0, 0,\dots 0)\]

where the number of \(0\)'s following \(\sigma_k^{2}\) is \(n - k\).

Lemma

Let \(A \in \FF^{m \times n}\) have a singular value decomposition given by

\[A = U \Sigma V^H.\]

Then

\[\Rank(A) = \Rank(\Sigma).\]

In other words, the rank of \(A\) is the number of non-zero singular values of \(A\). Since the singular values are ordered in descending order in \(\Sigma\), the first \(r\) singular values \(\sigma_1, \dots, \sigma_r\) are the non-zero ones.

Proof
This is a straightforward application of here and here. Further, since the only non-zero entries of \(\Sigma\) appear on its main diagonal, its rank is the number of non-zero singular values \(\sigma_i\).
Corollary

Let \(r = \Rank(A)\). Then \(\Sigma\) can be split as a block matrix

\[\begin{split}\Sigma = \left [ \begin{array}{c | c} \Sigma_r & 0\\ \hline 0 & 0 \end{array} \right ]\end{split}\]

where \(\Sigma_r\) is an \(r \times r\) diagonal matrix of the non-zero singular values \(\Diag(\sigma_1, \sigma_2, \dots, \sigma_r)\). All other sub-matrices in \(\Sigma\) are 0.
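As an illustration (assuming NumPy), a matrix constructed as a single outer product has exactly one non-zero singular value, matching its rank:

```python
import numpy as np

# A rank-1 matrix built as an outer product.
A = np.outer([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])

# Count singular values above a scale-aware tolerance.
s = np.linalg.svd(A, compute_uv=False)
tol = max(A.shape) * np.finfo(float).eps * s[0]
r = int(np.sum(s > tol))
```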

Lemma
The eigenvalues of the Hermitian matrix \(A^H A \in \FF^{n \times n}\) are \(\sigma_1^2, \sigma_2^2, \dots, \sigma_k^{2}, 0, 0, \dots, 0\) with \(n - k\) \(0\)'s after \(\sigma_k^{2}\). Moreover, the eigenvectors are the columns of \(V\).
Proof
\[A^H A = \left ( U \Sigma V^H \right)^H U \Sigma V^H = V \Sigma^H U^H U \Sigma V^H = V \Sigma^H \Sigma V^H.\]

We note that \(A^H A\) is Hermitian. Hence \(A^H A\) is diagonalized by \(V\) and the diagonalization of \(A^H A\) is \(\Sigma^H \Sigma\). Thus the eigenvalues of \(A^H A\) are \(\sigma_1^2, \sigma_2^2, \dots, \sigma_k^{2}, 0, 0, \dots, 0\) with \(n - k\) \(0\)'s after \(\sigma_k^{2}\).

Clearly

\[(A^H A) V = V (\Sigma^H \Sigma)\]

thus the columns of \(V\) are the eigenvectors of \(A^H A\).
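A quick numerical check of the identity \((A^H A) V = V (\Sigma^H \Sigma)\), assuming NumPy:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])
U, s, Vh = np.linalg.svd(A)
V = Vh.conj().T

# Here n = k = 2, so Sigma^H Sigma is the n x n diagonal matrix
# of squared singular values.
StS = np.diag(s**2)

ok = np.allclose(A.conj().T @ A @ V, V @ StS)
```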

Lemma
The eigenvalues of the Hermitian matrix \(A A^H \in \FF^{m \times m}\) are \(\sigma_1^2, \sigma_2^2, \dots, \sigma_k^{2}, 0, 0, \dots, 0\) with \(m - k\) \(0\)'s after \(\sigma_k^{2}\). Moreover, the eigenvectors are the columns of \(U\).
Proof
\[A A^H = U \Sigma V^H \left ( U \Sigma V^H \right)^H = U \Sigma V^H V \Sigma^H U^H = U \Sigma \Sigma^H U^H.\]

We note that \(A A^H\) is Hermitian. Hence \(A A^H\) is diagonalized by \(U\) and the diagonalization of \(A A^H\) is \(\Sigma \Sigma^H\). Thus the eigenvalues of \(A A^H\) are \(\sigma_1^2, \sigma_2^2, \dots, \sigma_k^{2}, 0, 0, \dots, 0\) with \(m - k\) \(0\)'s after \(\sigma_k^{2}\).

Clearly

\[(A A^H) U = U (\Sigma \Sigma^H)\]

thus the columns of \(U\) are the eigenvectors of \(A A^H\).

Lemma
The Gram matrices \(A A^H\) and \(A^H A\) share the same eigenvalues except for some extra zeros. Their eigenvalues are the squares of the singular values of \(A\), together with some extra zeros. In other words, the singular values of \(A\) are the square roots of the non-zero eigenvalues of the Gram matrices \(A A^H\) or \(A^H A\).
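This relationship between the two Gram matrices can be checked numerically (NumPy assumed):

```python
import numpy as np

A = np.array([[1.0, 0.0, 2.0],
              [0.0, 3.0, 0.0]])   # 2 x 3, so k = 2
s = np.linalg.svd(A, compute_uv=False)

# Eigenvalues of the two Gram matrices, sorted in descending order.
ev_small = np.sort(np.linalg.eigvalsh(A @ A.T))[::-1]   # 2 x 2
ev_big = np.sort(np.linalg.eigvalsh(A.T @ A))[::-1]     # 3 x 3

# A A^H has eigenvalues s^2; A^H A has the same plus one extra zero.
ok_small = np.allclose(ev_small, s**2)
ok_big = np.allclose(ev_big, np.concatenate([s**2, [0.0]]))
```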

The largest singular value

Lemma

For all \(u \in \FF^n\) the following holds

\[\| \Sigma u \|_2 \leq \sigma_1 \| u \|_2\]

Moreover for all \(u \in \FF^m\) the following holds

\[\| \Sigma^H u \|_2 \leq \sigma_1 \| u \|_2\]
Proof

Let us expand the term \(\Sigma u\).

\[\begin{split}\begin{bmatrix} \sigma_1 & 0 & \dots & 0 & \dots & 0 \\ 0 & \sigma_2 & \dots & 0 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots & & \vdots \\ 0 & 0 & \dots & \sigma_k & \dots & 0 \\ \vdots & \vdots & & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & 0 & \dots & 0 \end{bmatrix} \begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_k \\ \vdots \\ u_n \end{bmatrix} = \begin{bmatrix} \sigma_1 u_1 \\ \sigma_2 u_2 \\ \vdots \\ \sigma_k u_k \\ 0 \\ \vdots \\ 0 \end{bmatrix}\end{split}\]

Now since \(\sigma_1\) is the largest singular value, we have

\[|\sigma_i u_i| \leq |\sigma_1 u_i| \Forall 1 \leq i \leq k.\]

Thus

\[\sum_{i=1}^k |\sigma_i u_i|^2 \leq \sigma_1^2 \sum_{i=1}^n |u_i|^2\]

or

\[\sigma_1^2 \| u \|_2^2 \geq \| \Sigma u \|_2^2.\]

The result follows.

A simpler representation of \(\Sigma u\) can be given using here. Let \(r = \Rank(A)\). Thus

\[\begin{split}\Sigma = \left [ \begin{array}{c | c} \Sigma_r & 0\\ \hline 0 & 0 \end{array} \right ]\end{split}\]

We split entries in \(u\) as \(u = [(u_1, \dots, u_r )( u_{r + 1} \dots u_n)]^T\). Then

\[\begin{split}\Sigma u = \left [ \begin{array}{c} \Sigma_r \begin{bmatrix} u_1 & \dots& u_r \end{bmatrix}^T\\ 0 \begin{bmatrix} u_{r + 1} & \dots& u_n \end{bmatrix}^T \end{array} \right ] = \begin{bmatrix} \sigma_1 u_1 & \sigma_2 u_2 & \dots & \sigma_r u_r & 0 & \dots & 0 \end{bmatrix}^T\end{split}\]

Thus

\[\| \Sigma u \|_2^2 = \sum_{i=1}^r |\sigma_i u_i |^2 \leq \sigma_1^2 \sum_{i=1}^r |u_i |^2 \leq \sigma_1^2 \|u\|_2^2.\]

The second result can be proven similarly.

Lemma

Let \(\sigma_1\) be the largest singular value of an \(m \times n\) matrix \(A\). Then

\[\| A x \|_2 \leq \sigma_1 \| x \|_2 \Forall x \in \FF^n.\]

Moreover

\[\| A^H x \|_2 \leq \sigma_1 \| x \|_2 \Forall x \in \FF^m.\]
Proof
\[\| A x \|_2 = \| U \Sigma V^H x \|_2 = \| \Sigma V^H x \|_2\]

since \(U\) is unitary. Now from the previous lemma we have

\[\| \Sigma V^H x \|_2 \leq \sigma_1 \| V^H x \|_2 = \sigma_1 \| x \|_2\]

since \(V^H\) is also unitary. Thus we get the result

\[\| A x \|_2 \leq \sigma_1 \| x \|_2 \Forall x \in \FF^n.\]

Similarly

\[\| A^H x \|_2 = \| V \Sigma^H U^H x \|_2 = \| \Sigma^H U^H x \|_2\]

since \(V\) is unitary. Now from the previous lemma we have

\[\| \Sigma^H U^H x \|_2 \leq \sigma_1 \| U^H x \|_2 = \sigma_1 \| x \|_2\]

since \(U^H\) is also unitary. Thus we get the result

\[\| A^H x \|_2 \leq \sigma_1 \| x \|_2 \Forall x \in \FF^m.\]

There is a direct connection between the largest singular value and \(2\)-norm of a matrix (see here).

Corollary

The largest singular value of \(A\) is precisely its \(2\)-norm, i.e.

\[\sigma_1 = \underset{\|u \|_2 = 1}{\max} \| A u \|_2.\]
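This corollary matches NumPy's behavior: `numpy.linalg.norm(A, 2)` computes the matrix \(2\)-norm, which equals the largest singular value. A small sketch:

```python
import numpy as np

A = np.array([[0.0, 1.0],
              [2.0, 0.0],
              [0.0, 3.0]])

# Largest singular value vs. the matrix 2-norm.
sigma1 = np.linalg.svd(A, compute_uv=False)[0]
ok = np.isclose(np.linalg.norm(A, 2), sigma1)
```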

SVD and pseudo-inverse

Lemma

Let \(A = U \Sigma V^H\) and let \(r = \Rank (A)\). Let \(\sigma_1, \dots, \sigma_r\) be the \(r\) non-zero singular values of \(A\). Then the Moore-Penrose pseudo-inverse of \(\Sigma\) is an \(n \times m\) matrix \(\Sigma^{\dag}\) given by

\[\begin{split}\Sigma^{\dag} = \left [ \begin{array}{c | c} \Sigma_r^{-1} & 0\\ \hline 0 & 0 \end{array} \right ]\end{split}\]

where \(\Sigma_r = \Diag(\sigma_1, \dots, \sigma_r)\).

Essentially, \(\Sigma^{\dag}\) is obtained by transposing \(\Sigma\) and inverting all its non-zero (positive real) entries.

Proof
A straightforward application of here.
Corollary

The ranks of \(\Sigma\) and its pseudo-inverse \(\Sigma^{\dag}\) are the same, i.e.

\[\Rank (\Sigma) = \Rank(\Sigma^{\dag}).\]
Proof
The number of non-zero diagonal entries in \(\Sigma\) and \(\Sigma^{\dag}\) is the same.
Lemma

Let \(A\) be an \(m \times n\) matrix and let \(A = U \Sigma V^H\) be its singular value decomposition. Let \(\Sigma^{\dag}\) be the pseudo-inverse of \(\Sigma\) as per here. Then the Moore-Penrose pseudo-inverse of \(A\) is given by

\[A^{\dag} = V \Sigma^{\dag} U^H.\]
Proof

As usual, we verify the requirements for a Moore-Penrose pseudo-inverse as per here. We note that since \(\Sigma^{\dag}\) is the pseudo-inverse of \(\Sigma\), it already satisfies the necessary criteria.

First requirement:

\[A A^{\dag} A = U \Sigma V^H V \Sigma^{\dag} U^H U \Sigma V^H = U \Sigma \Sigma^{\dag} \Sigma V^H = U \Sigma V^H = A.\]

Second requirement:

\[A^{\dag} A A^{\dag} = V \Sigma^{\dag} U^H U \Sigma V^H V \Sigma^{\dag} U^H = V \Sigma^{\dag} \Sigma \Sigma^{\dag} U^H = V \Sigma^{\dag} U^H = A^{\dag}.\]

We now consider

\[A A^{\dag} = U \Sigma V^H V \Sigma^{\dag} U^H = U \Sigma \Sigma^{\dag} U^H.\]

Thus

\[\left ( A A^{\dag} \right )^H = \left ( U \Sigma \Sigma^{\dag} U^H \right )^H = U \left ( \Sigma \Sigma^{\dag} \right )^H U^H = U \Sigma \Sigma^{\dag} U^H = A A^{\dag}\]

since \(\Sigma \Sigma^{\dag}\) is Hermitian.

Finally we consider

\[A^{\dag} A = V \Sigma^{\dag} U^H U \Sigma V^H = V \Sigma^{\dag} \Sigma V^H.\]

Thus

\[\left ( A^{\dag} A \right )^H = \left ( V \Sigma^{\dag} \Sigma V^H\right )^H = V \left ( \Sigma^{\dag} \Sigma \right )^H V^H = V \Sigma^{\dag} \Sigma V^H = A^{\dag} A\]

since \(\Sigma^{\dag} \Sigma\) is also Hermitian.

This completes the proof.
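The construction \(A^{\dag} = V \Sigma^{\dag} U^H\) can be compared against a library implementation. The sketch below (NumPy assumed) builds \(\Sigma^{\dag}\) explicitly and checks agreement with `numpy.linalg.pinv`:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])
m, n = A.shape
U, s, Vh = np.linalg.svd(A)

# Sigma^dagger: transpose the shape of Sigma and invert the
# non-zero singular values (all of them are non-zero here).
Sigma_dag = np.zeros((n, m))
Sigma_dag[:len(s), :len(s)] = np.diag(1.0 / s)

A_dag = Vh.conj().T @ Sigma_dag @ U.conj().T   # V Sigma^dagger U^H
ok = np.allclose(A_dag, np.linalg.pinv(A))
```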

Finally we can connect the singular values of \(A\) with the singular values of its pseudo-inverse.

Corollary

The ranks of any \(m \times n\) matrix \(A\) and its pseudo-inverse \(A^{\dag}\) are the same, i.e.

\[\Rank (A) = \Rank(A^{\dag}).\]
Proof
We have \(\Rank(A) = \Rank(\Sigma)\). It is also easy to verify that \(\Rank(A^{\dag}) = \Rank(\Sigma^{\dag})\). Using here completes the proof.
Lemma

Let \(A\) be an \(m \times n\) matrix and let \(A^{\dag}\) be its \(n \times m\) pseudo-inverse as per here. Let \(k = \min(m, n)\) denote the number of singular values and let \(r = \Rank(A)\) denote the number of non-zero singular values of \(A\). Let \(\sigma_1, \dots, \sigma_r\) be the non-zero singular values of \(A\). Then the number of singular values of \(A^{\dag}\) is the same as that of \(A\) and the non-zero singular values of \(A^{\dag}\) are

\[\frac{1}{\sigma_1} , \dots, \frac{1}{\sigma_r}\]

while all other \(k - r\) singular values of \(A^{\dag}\) are zero.

Proof

\(k = \min(m, n)\) is the number of singular values of both \(A\) and \(A^{\dag}\). Since the ranks of \(A\) and \(A^{\dag}\) are the same, the number of non-zero singular values is also the same. Now look at

\[A^{\dag} = V \Sigma^{\dag} U^H\]

where

\[\begin{split}\Sigma^{\dag} = \left [ \begin{array}{c | c} \Sigma_r^{-1} & 0\\ \hline 0 & 0 \end{array} \right ].\end{split}\]

Clearly \(\Sigma_r^{-1} = \Diag(\frac{1}{\sigma_1} , \dots, \frac{1}{\sigma_r})\).

Thus expanding the R.H.S. we can get

\[A^{\dag} = \sum_{i=1}^r \frac{1}{\sigma_{i}} v_i u_i^H\]

where \(v_i\) and \(u_i\) are the first \(r\) columns of \(V\) and \(U\) respectively. If we reverse the order of the first \(r\) columns of \(U\) and \(V\) and reverse the first \(r\) diagonal entries of \(\Sigma^{\dag}\), the R.H.S. remains the same while we are able to express \(A^{\dag}\) in the standard singular value decomposition form. Thus \(\frac{1}{\sigma_1}, \dots, \frac{1}{\sigma_r}\) are indeed the non-zero singular values of \(A^{\dag}\).
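A numerical illustration of this lemma (NumPy assumed): the non-zero singular values of the pseudo-inverse are the reciprocals of those of \(A\):

```python
import numpy as np

A = np.array([[2.0, 0.0, 0.0],
              [0.0, 5.0, 0.0]])   # singular values 5 and 2, both non-zero

s = np.linalg.svd(A, compute_uv=False)
s_dag = np.linalg.svd(np.linalg.pinv(A), compute_uv=False)

# Compare the two sets of singular values after sorting;
# the pseudo-inverse's values are the reciprocals 1/5 and 1/2.
ok = np.allclose(np.sort(s_dag), np.sort(1.0 / s))
```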

Full column rank matrices

In this subsection we consider some specific results related to singular value decomposition of a full column rank matrix.

We will consider \(A\) to be an \(m \times n\) matrix in \(\FF^{m \times n}\) with \(m \geq n\) and \(\Rank(A) = n\). Let \(A = U \Sigma V^H\) be its singular value decomposition. From here we observe that there are \(n\) non-zero singular values of \(A\). We will denote these singular values by \(\sigma_1, \sigma_2, \dots, \sigma_n\). We define

\[\Sigma_n = \Diag(\sigma_1, \sigma_2, \dots, \sigma_n).\]

Clearly \(\Sigma\) is a \(2 \times 1\) block matrix given by

\[\begin{split}\Sigma = \left [ \begin{array}{c} \Sigma_n\\ \hline 0 \end{array} \right ]\end{split}\]

where the lower \(0\) is an \((m - n) \times n\) zero matrix. From here we obtain that \(\Sigma^H \Sigma\) is an \(n \times n\) matrix given by

\[\Sigma^H \Sigma = \Sigma_n^2\]

where

\[\Sigma_n^2 = \Diag(\sigma_1^2, \sigma_2^2, \dots, \sigma_n^2).\]
Lemma
Let \(A\) be a full column rank matrix with singular value decomposition \(A = U \Sigma V^H\). Then \(\Sigma^H \Sigma = \Sigma_n^2 = \Diag(\sigma_1^2, \sigma_2^2, \dots, \sigma_n^2)\) and \(\Sigma^H \Sigma\) is invertible.
Proof

Since all singular values are non-zero, \(\Sigma_n^2\) is invertible. Thus

\[\left (\Sigma^H \Sigma \right )^{-1} = \left ( \Sigma_n^2 \right )^{-1} = \Diag\left(\frac{1}{\sigma_1^2}, \frac{1}{\sigma_2^2}, \dots, \frac{1}{\sigma_n^2} \right).\]
Lemma

Let \(A\) be a full column rank matrix with singular value decomposition \(A = U \Sigma V^H\). Let \(\sigma_1\) be its largest singular value and \(\sigma_n\) be its smallest singular value. Then

\[\sigma_n^2 \|x \|_2 \leq \| \Sigma^H \Sigma x \|_2 \leq \sigma_1^2 \|x \|_2 \Forall x \in \FF^n.\]
Proof

Let \(x \in \FF^n\). We have

\[\| \Sigma^H \Sigma x \|_2^2 = \| \Sigma_n^2 x \|_2^2 = \sum_{i=1}^n |\sigma_i^2 x_i|^2.\]

Now since

\[\sigma_n \leq \sigma_i \leq \sigma_1\]

hence

\[\sigma_n^4 \sum_{i=1}^n |x_i|^2 \leq \sum_{i=1}^n |\sigma_i^2 x_i|^2 \leq \sigma_1^4 \sum_{i=1}^n |x_i|^2\]

thus

\[\sigma_n^4 \| x \|_2^2 \leq \| \Sigma^H \Sigma x \|_2^2 \leq \sigma_1^4 \| x \|_2^2.\]

Applying square roots, we get

\[\sigma_n^2 \| x \|_2 \leq \| \Sigma^H \Sigma x \|_2 \leq \sigma_1^2 \| x \|_2 \Forall x \in \FF^n.\]

We recall from here that the Gram matrix of its column vectors, \(G = A^H A\), is full rank and invertible.

Lemma

Let \(A\) be a full column rank matrix with singular value decomposition \(A = U \Sigma V^H\). Let \(\sigma_1\) be its largest singular value and \(\sigma_n\) be its smallest singular value. Then

\[\sigma_n^2 \|x \|_2 \leq \| A^H A x \|_2 \leq \sigma_1^2 \|x \|_2 \Forall x \in \FF^n.\]
Proof
\[A^H A = (U \Sigma V^H)^H (U \Sigma V^H) = V \Sigma^H \Sigma V^H.\]

Let \(x \in \FF^n\). Let

\[u = V^H x \implies \| u \|_2 = \|x \|_2.\]

Let

\[r = \Sigma^H \Sigma u.\]

Then from previous lemma we have

\[\sigma_n^2 \| u \|_2 \leq \| \Sigma^H \Sigma u \|_2 = \|r \|_2 \leq \sigma_1^2 \| u \|_2 .\]

Finally

\[A^ H A x = V \Sigma^H \Sigma V^H x = V r.\]

Thus

\[\| A^ H A x \|_2 = \|r \|_2.\]

Substituting we get

\[\sigma_n^2 \|x \|_2 \leq \| A^H A x \|_2 \leq \sigma_1^2 \|x \|_2 \Forall x \in \FF^n.\]
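The two-sided bound can be spot-checked on a concrete full column rank matrix (NumPy assumed):

```python
import numpy as np

# Spot check of sigma_n^2 ||x|| <= ||A^H A x|| <= sigma_1^2 ||x||
# for a full column rank matrix with m > n.
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [0.0, 2.0]])
s = np.linalg.svd(A, compute_uv=False)   # descending: s[0]=sigma_1, s[-1]=sigma_n

x = np.array([0.5, -1.5])
mid = np.linalg.norm(A.T @ A @ x)        # A is real, so A^H = A^T
lower = s[-1]**2 * np.linalg.norm(x)
upper = s[0]**2 * np.linalg.norm(x)
ok = lower <= mid + 1e-12 and mid <= upper + 1e-12
```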

There are bounds for the inverse of the Gram matrix as well. First let us establish the inverse of the Gram matrix.

Lemma

Let \(A\) be a full column rank matrix with singular value decomposition \(A = U \Sigma V^H\). Let the singular values of \(A\) be \(\sigma_1, \dots, \sigma_n\). Let the Gram matrix of columns of \(A\) be \(G = A^H A\). Then

\[G^{-1} = V \Psi V^H\]

where

\[\Psi = \Diag \left(\frac{1}{\sigma_1^2}, \frac{1}{\sigma_2^2}, \dots, \frac{1}{\sigma_n^2} \right).\]
Proof

We have

\[G = V \Sigma^H \Sigma V^H\]

Thus

\[G^{-1} = \left (V \Sigma^H \Sigma V^H \right )^{-1} = \left ( V^H \right )^{-1} \left ( \Sigma^H \Sigma \right )^{-1} V^{-1} = V \left ( \Sigma^H \Sigma \right )^{-1} V^H.\]

From here we have

\[\Psi = \left ( \Sigma^H \Sigma \right )^{-1} = \Diag \left (\frac{1}{\sigma_1^2}, \frac{1}{\sigma_2^2}, \dots, \frac{1}{\sigma_n^2} \right).\]

This completes the proof.

We can now state the bounds:

Lemma

Let \(A\) be a full column rank matrix with singular value decomposition \(A = U \Sigma V^H\). Let \(\sigma_1\) be its largest singular value and \(\sigma_n\) be its smallest singular value. Then

\[\frac{1}{\sigma_1^2} \|x \|_2 \leq \| \left(A^H A \right)^{-1} x \|_2 \leq \frac{1}{\sigma_n^2} \|x \|_2 \Forall x \in \FF^n.\]
Proof

From here we have

\[G^{-1} = \left ( A^H A \right)^{-1} = V \Psi V^H\]

where

\[\Psi = \Diag \left(\frac{1}{\sigma_1^2}, \frac{1}{\sigma_2^2}, \dots, \frac{1}{\sigma_n^2} \right).\]

Let \(x \in \FF^n\). Let

\[u = V^H x \implies \| u \|_2 = \|x \|_2.\]

Let

\[r = \Psi u.\]

Then

\[\| r \|_2^2 = \sum_{i=1}^n \left | \frac{1}{\sigma_i^2} u_i \right |^2.\]

Thus

\[\frac{1}{\sigma_1^2} \| u \|_2 \leq \| \Psi u \|_2 = \|r \|_2 \leq \frac{1}{\sigma_n^2} \| u \|_2 .\]

Finally

\[\left (A^ H A \right)^{-1} x = V \Psi V^H x = V r.\]

Thus

\[\| \left (A^ H A \right)^{-1} x \|_2 = \|r \|_2.\]

Substituting we get the result.
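A corresponding spot check for the inverse Gram matrix bound (NumPy assumed):

```python
import numpy as np

# Spot check of ||x||/sigma_1^2 <= ||(A^H A)^{-1} x|| <= ||x||/sigma_n^2.
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [0.0, 2.0]])
s = np.linalg.svd(A, compute_uv=False)
G_inv = np.linalg.inv(A.T @ A)           # inverse Gram matrix (A real)

x = np.array([0.3, -0.7])
val = np.linalg.norm(G_inv @ x)
nx = np.linalg.norm(x)
ok = nx / s[0]**2 <= val + 1e-12 and val <= nx / s[-1]**2 + 1e-12
```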

Low rank approximation of a matrix

Definition

An \(m \times n\) matrix \(A\) is called low rank if

\[\Rank(A) \ll \min (m, n).\]
Remark
A matrix is low rank if the number of non-zero singular values for the matrix is much smaller than its dimensions.

Following is a simple procedure for making a low rank approximation of a given matrix \(A\).

  1. Perform the singular value decomposition of \(A\) given by \(A = U \Sigma V^H\).
  2. Identify the singular values of \(A\) in \(\Sigma\).
  3. Keep the first \(r\) singular values (where \(r \ll \min(m, n)\) is the rank of the approximation) and set all other singular values to 0 to obtain \(\widehat{\Sigma}\).
  4. Compute \(\widehat{A} = U \widehat{\Sigma} V^H\).
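The four steps above can be sketched as follows (an illustration assuming NumPy; `low_rank_approx` is a hypothetical helper name):

```python
import numpy as np

def low_rank_approx(A, r):
    """Keep the r largest singular values; zero out the rest."""
    U, s, Vh = np.linalg.svd(A, full_matrices=False)
    # Equivalent to U @ Sigma_hat @ V^H with the truncated Sigma_hat.
    return U[:, :r] @ np.diag(s[:r]) @ Vh[:r, :]

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [7.0, 8.0, 9.1]])
A1 = low_rank_approx(A, 1)               # rank-1 approximation of A
rank_ok = np.linalg.matrix_rank(A1) == 1
```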