Inner product spaces

Inner product

The inner product is a generalization of the notion of the dot product.

Definition

An inner product over a \(K\)-vector space \(V\), where \(K\) is \(\RR\) or \(\CC\), is any map

\[\begin{split}\begin{aligned} \langle \cdot, \cdot \rangle : &V \times V \to K \\ & (v_1, v_2) \mapsto \langle v_1, v_2 \rangle \end{aligned}\end{split}\]

satisfying the following requirements:

  1. Positive definiteness

    (1)\[ \langle v, v \rangle \geq 0 \text{ and } \langle v, v \rangle = 0 \iff v = 0\]
  2. Conjugate symmetry

    (2)\[ \langle v_1, v_2 \rangle = \overline{\langle v_2, v_1 \rangle} \quad \forall v_1, v_2 \in V\]
  3. Linearity in the first argument

    (3)\[\begin{split} \begin{aligned} &\langle \alpha v, w \rangle = \alpha \langle v, w \rangle \quad \forall v, w \in V; \forall \alpha \in K\\ &\langle v_1 + v_2, w \rangle = \langle v_1, w \rangle + \langle v_2, w \rangle \quad \forall v_1, v_2,w \in V \end{aligned}\end{split}\]

Remarks

  • Linearity in the first argument extends to any arbitrary linear combination:
\[\left \langle \sum \alpha_i v_i, w \right \rangle = \sum \alpha_i \langle v_i, w \rangle\]
  • Similarly, we have conjugate linearity in the second argument for any arbitrary linear combination:
\[\left \langle v, \sum \alpha_i w_i \right \rangle = \sum \overline{\alpha_i} \langle v, w_i \rangle\]
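These axioms are easy to check numerically for the standard inner product on \(\CC^n\), \(\langle v, w \rangle = \sum_i v_i \overline{w_i}\). Below is a minimal NumPy sketch (illustrative only; the helper `inner` is not a standard API) verifying all three requirements on random vectors:

```python
import numpy as np

rng = np.random.default_rng(0)

def inner(v, w):
    # <v, w> = sum_i v_i * conj(w_i): linear in the first argument,
    # conjugate-linear in the second (np.vdot conjugates its FIRST
    # argument, hence the swapped order).
    return np.vdot(w, v)

v = rng.standard_normal(4) + 1j * rng.standard_normal(4)
w = rng.standard_normal(4) + 1j * rng.standard_normal(4)
alpha = 2.0 - 3.0j

# 1. Positive definiteness: <v, v> is real and non-negative
assert abs(inner(v, v).imag) < 1e-12 and inner(v, v).real >= 0
# 2. Conjugate symmetry: <v, w> = conj(<w, v>)
assert np.isclose(inner(v, w), np.conj(inner(w, v)))
# 3. Linearity in the first argument, conjugate linearity in the second
assert np.isclose(inner(alpha * v, w), alpha * inner(v, w))
assert np.isclose(inner(v, alpha * w), np.conj(alpha) * inner(v, w))
```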

Orthogonality

Definition

A set of non-zero vectors \(\{v_1, \dots, v_p\}\) is called orthogonal if

\[\langle v_i, v_j \rangle = 0 \text{ if } i \neq j \quad \forall 1 \leq i, j \leq p\]
Definition

A set of non-zero vectors \(\{v_1, \dots, v_p\}\) is called orthonormal if

(4)\[\begin{split}\begin{aligned} &\langle v_i, v_j \rangle = 0 \text{ if } i \neq j \quad \forall 1 \leq i, j \leq p\\ &\langle v_i, v_i \rangle = 1 \quad \forall 1 \leq i \leq p \end{aligned}\end{split}\]

i.e. \(\langle v_i, v_j \rangle = \delta_{ij}\), where \(\delta_{ij}\) denotes the Kronecker delta.

Remarks:

  • A set of orthogonal vectors is linearly independent. Prove!
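As a concrete illustration, the columns of the \(Q\) factor of a QR decomposition form an orthonormal set; a short NumPy check (illustrative only) confirms that their Gram matrix of pairwise inner products is the identity:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))   # 3 linearly independent vectors in R^5
Q, _ = np.linalg.qr(A)            # columns of Q are orthonormal

# Gram matrix: entry (i, j) is <q_i, q_j>, which should equal delta_{ij}
assert np.allclose(Q.T @ Q, np.eye(3))
```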
Definition
A \(K\)-vector space \(V\) equipped with an inner product \(\langle \cdot, \cdot \rangle : V \times V \to K\) is known as an inner product space or a pre-Hilbert space.

Norm

Norms are a generalization of the notion of length.
Definition

A norm over a \(K\)-vector space \(V\) is any map

\[\begin{split}\begin{aligned} \| \cdot \| : &V \to \RR \\ & v \mapsto \| v\| \end{aligned}\end{split}\]

satisfying the following requirements:

  1. Positive definiteness

    (5)\[ \| v\| \geq 0 \quad \forall v \in V \text{ and } \| v\| = 0 \iff v = 0\]
  2. Scalar multiplication

    \[\| \alpha v \| = | \alpha | \| v \| \quad \forall \alpha \in K; \forall v \in V\]
  3. Triangle inequality

    \[\| v_1 + v_2 \| \leq \| v_1 \| + \| v_2 \| \quad \forall v_1, v_2 \in V\]
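These requirements can be checked numerically for the Euclidean norm \(\| v \|_2 = \sqrt{\langle v, v \rangle}\) on \(\RR^n\); a minimal NumPy sketch (illustrative only):

```python
import numpy as np

rng = np.random.default_rng(0)
v1, v2 = rng.standard_normal(6), rng.standard_normal(6)
alpha = -2.5

# 1. Positive definiteness: ||v|| >= 0, and ||0|| = 0
assert np.linalg.norm(v1) > 0 and np.linalg.norm(np.zeros(6)) == 0
# 2. Scalar multiplication: ||alpha v|| = |alpha| ||v||
assert np.isclose(np.linalg.norm(alpha * v1), abs(alpha) * np.linalg.norm(v1))
# 3. Triangle inequality: ||v1 + v2|| <= ||v1|| + ||v2||
assert np.linalg.norm(v1 + v2) <= np.linalg.norm(v1) + np.linalg.norm(v2)
```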
Definition
A \(K\)-vector space \(V\) equipped with a norm \(\| \cdot \| : V \to \RR\) is known as a normed linear space.

Projection

Definition

A projection is a linear transformation \(P\) from a vector space \(V\) to itself such that \(P^2=P\), i.e. if \(P v = \beta\), then \(P \beta = \beta\). Thus whenever \(P\) is applied twice to any vector, it gives the same result as if it were applied once.

Thus \(P\) is an idempotent operator.

Example: Projection operators

Consider the operator \(P : \RR^3 \to \RR^3\) defined as

\[\begin{split}P = \begin{bmatrix} 1 & 0 & 0\\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}.\end{split}\]

Then the action of \(P\) on an arbitrary vector is given by

\[\begin{split}P \begin{pmatrix} x \\ y \\z \end{pmatrix} = \begin{pmatrix} x \\ y \\ 0 \end{pmatrix}\end{split}\]

A second application does not change it:

\[\begin{split}P \begin{pmatrix} x \\ y \\0 \end{pmatrix} = \begin{pmatrix} x \\ y \\ 0 \end{pmatrix}\end{split}\]

Thus \(P\) is a projection operator.

We can also verify the property directly by computing \(P^2\):

\[\begin{split}P^2 = \begin{bmatrix} 1 & 0 & 0\\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0\\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0\\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix} = P.\end{split}\]
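The same computation in NumPy (a purely illustrative check):

```python
import numpy as np

P = np.diag([1.0, 1.0, 0.0])     # the projection matrix from this example
x = np.array([3.0, -1.0, 7.0])   # an arbitrary vector

assert np.allclose(P @ x, [3.0, -1.0, 0.0])  # first application zeroes z
assert np.allclose(P @ (P @ x), P @ x)       # second application changes nothing
assert np.allclose(P @ P, P)                 # idempotency: P^2 = P
```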

Orthogonal projection

Consider a projection operator \(P : V \to V\) where \(V\) is an inner product space.

The range of \(P\) is given by

\[\Range(P) = \{v \in V | v = P x \text{ for some } x \in V \}.\]

The null space of \(P\) is given by

\[\NullSpace(P) = \{ v \in V | P v = 0\}.\]
Definition

A projection operator \(P : V \to V\) over an inner product space \(V\) is called an orthogonal projection operator if its range \(\Range(P)\) and its null space \(\NullSpace(P)\), as defined above, are orthogonal to each other, i.e.

\[\langle r, n \rangle = 0 \Forall r \in \Range(P) , \Forall n \in \NullSpace(P).\]
Lemma
A projection operator is orthogonal if and only if it is self-adjoint.
Example: Orthogonal projection on a line

Consider a unit norm vector \(u \in \RR^N\). Thus \(u^T u = 1\).

Consider

\[P_u = u u^T.\]

Now

\[P_u^2 = (u u^T) (u u^T) = u (u^T u) u^T = u u^T = P_u.\]

Thus \(P_u\) is a projection operator.

Now

\[P_u^T = (u u^T)^T = u u^T = P_u.\]

Thus \(P_u\) is self-adjoint. Hence \(P_u\) is an orthogonal projection operator.

Now

\[P_u u = (u u^T) u = u (u^T u) = u.\]

Thus \(P_u\) leaves \(u\) intact, i.e. the projection of \(u\) onto \(u\) is \(u\) itself.

Now let \(v \in u^{\perp}\), i.e. \(\langle u, v \rangle = 0\).

Then

\[P_u v = (u u^T) v = u (u^T v) = u \langle u, v \rangle = 0.\]

Thus \(P_u\) annihilates all vectors orthogonal to \(u\).

Now any vector \(x \in \RR^N\) can be broken down into two components

\[x = x_{\parallel} + x_{\perp}\]

such that \(\langle u , x_{\perp} \rangle =0\) and \(x_{\parallel}\) is collinear with \(u\).

Then

\[P_u x = u u^T x_{\parallel} + u u^T x_{\perp} = x_{\parallel}.\]

Thus \(P_u\) retains the projection of \(x\) on \(u\), given by \(x_{\parallel}\); the second term vanishes since \(u^T x_{\perp} = 0\), while \(u u^T x_{\parallel} = x_{\parallel}\) because \(x_{\parallel}\) is a scalar multiple of \(u\).
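The whole example is easy to reproduce numerically; a minimal NumPy sketch (the variable names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
u = rng.standard_normal(4)
u /= np.linalg.norm(u)        # unit norm, so u^T u = 1
P_u = np.outer(u, u)          # P_u = u u^T

assert np.allclose(P_u @ P_u, P_u)   # idempotent: projection operator
assert np.allclose(P_u, P_u.T)       # self-adjoint: orthogonal projection
assert np.allclose(P_u @ u, u)       # u is left intact

x = rng.standard_normal(4)
x_par = P_u @ x                      # component collinear with u
x_perp = x - x_par                   # remaining component
assert np.isclose(u @ x_perp, 0.0)   # <u, x_perp> = 0
assert np.allclose(P_u @ x_perp, 0.0)  # P_u annihilates x_perp
```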

Example: Projection onto the column space of a matrix

Let \(A \in \RR^{M \times N}\) with \(N \leq M\) be a matrix given by

\[A = \begin{bmatrix} a_1 & a_2 & \dots & a_N \end{bmatrix}\]

where \(a_i \in \RR^M\) are its columns which are linearly independent.

The column space of \(A\) is given by

\[C(A) = \{ A x | x \in \RR^N \} \subseteq \RR^M.\]

Since the columns of \(A\) are linearly independent, \(A^T A\) is positive definite and hence invertible.

Consider the operator

\[P_A = A (A^T A)^{-1} A^T.\]

Now

\[P_A^2 = A (A^T A)^{-1} A^T A (A^T A)^{-1} A^T = A (A^T A)^{-1} A^T = P_A.\]

Thus \(P_A\) is a projection operator.

\[P_A^T = (A (A^T A)^{-1} A^T)^T = A ((A^T A)^{-1} )^T A^T = A (A^T A)^{-1} A^T = P_A.\]

Thus \(P_A\) is self-adjoint.

Hence \(P_A\) is an orthogonal projection operator onto the column space of \(A\).
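A minimal NumPy sketch of this construction (illustrative; the randomly drawn columns are linearly independent almost surely):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 3))           # M = 6, N = 3
# P_A = A (A^T A)^{-1} A^T, computed via a linear solve rather than
# an explicit matrix inverse for numerical stability
P_A = A @ np.linalg.solve(A.T @ A, A.T)

assert np.allclose(P_A @ P_A, P_A)        # idempotent
assert np.allclose(P_A, P_A.T)            # self-adjoint

b = A @ rng.standard_normal(3)            # some vector in C(A)
assert np.allclose(P_A @ b, b)            # vectors in C(A) are left intact
```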

Parallelogram identity

Theorem
\[2 \| x \|_2^2 + 2 \| y \|_2^2 = \|x + y \|_2^2 + \| x - y \|_2^2 \quad \Forall x, y \in V.\]
Proof
\[\| x + y \|_2^2 = \langle x + y, x + y \rangle = \langle x, x \rangle + \langle y , y \rangle + \langle x , y \rangle + \langle y , x \rangle.\]
\[\| x - y \|_2^2 = \langle x - y, x - y \rangle = \langle x, x \rangle + \langle y , y \rangle - \langle x , y \rangle - \langle y , x \rangle.\]

Thus

\[\|x + y \|_2^2 + \| x - y \|_2^2 = 2 ( \langle x, x \rangle + \langle y , y\rangle) = 2 \| x \|_2^2 + 2 \| y \|_2^2.\]
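A quick numerical sanity check of the identity in \(\RR^n\) (illustrative only):

```python
import numpy as np

rng = np.random.default_rng(0)
x, y = rng.standard_normal(5), rng.standard_normal(5)
n = np.linalg.norm

assert np.isclose(2 * n(x)**2 + 2 * n(y)**2, n(x + y)**2 + n(x - y)**2)
```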

When the inner product is real, the following identity is quite useful.

Theorem
\[\langle x, y \rangle = \frac{1}{4} \left ( \|x + y \|_2^2 - \| x - y \|_2^2 \right ) \quad \Forall x, y \in V.\]
Proof
\[\| x + y \|_2^2 = \langle x + y, x + y \rangle = \langle x, x \rangle + \langle y , y \rangle + \langle x , y \rangle + \langle y , x \rangle.\]
\[\| x - y \|_2^2 = \langle x - y, x - y \rangle = \langle x, x \rangle + \langle y , y \rangle - \langle x , y \rangle - \langle y , x \rangle.\]

Thus

\[\|x + y \|_2^2 - \| x - y \|_2^2 = 2 ( \langle x , y \rangle + \langle y , x \rangle) = 4 \langle x , y \rangle\]

since for real inner products

\[\langle x , y \rangle = \langle y , x \rangle.\]
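Again, a one-line numerical check in \(\RR^n\) (illustrative only):

```python
import numpy as np

rng = np.random.default_rng(0)
x, y = rng.standard_normal(5), rng.standard_normal(5)
n = np.linalg.norm

# <x, y> = (||x + y||^2 - ||x - y||^2) / 4 for real inner products
assert np.isclose(x @ y, 0.25 * (n(x + y)**2 - n(x - y)**2))
```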

Polarization identity

When the inner product is complex, the polarization identity is quite useful.

Theorem
\[\langle x, y \rangle = \frac{1}{4} \left ( \|x + y \|_2^2 - \| x - y \|_2^2 + i \| x + i y \|_2^2 - i \| x -i y \|_2^2 \right ) \quad \Forall x, y \in V.\]
Proof
\[\| x + y \|_2^2 = \langle x + y, x + y \rangle = \langle x, x \rangle + \langle y , y \rangle + \langle x , y \rangle + \langle y , x \rangle.\]
\[\| x - y \|_2^2 = \langle x - y, x - y \rangle = \langle x, x \rangle + \langle y , y \rangle - \langle x , y \rangle - \langle y , x \rangle.\]
\[\| x + i y \|_2^2 = \langle x + i y, x + i y \rangle = \langle x, x \rangle + \langle i y , i y \rangle + \langle x , i y \rangle + \langle i y , x \rangle.\]
\[\| x - i y \|_2^2 = \langle x - i y, x - i y \rangle = \langle x, x \rangle + \langle i y , i y \rangle - \langle x , i y \rangle - \langle i y , x \rangle.\]

Thus

\[\begin{split} \|x + y \|_2^2 - \| x - y \|_2^2 + & i \| x + i y \|_2^2 - i \| x -i y \|_2^2\\ &= 2 \langle x, y \rangle + 2 \langle y , x \rangle + 2 i \langle x , i y \rangle + 2 i \langle i y , x \rangle\\ &= 2 \langle x, y \rangle + 2 \langle y , x \rangle + 2 \langle x, y \rangle - 2 \langle y , x \rangle\\ & = 4 \langle x, y \rangle.\end{split}\]
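A numerical check on \(\CC^n\), using the convention above that the inner product is linear in the first argument (illustrative only):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(5) + 1j * rng.standard_normal(5)
y = rng.standard_normal(5) + 1j * rng.standard_normal(5)
n = np.linalg.norm

lhs = np.vdot(y, x)   # <x, y> = sum_i x_i * conj(y_i)
rhs = 0.25 * (n(x + y)**2 - n(x - y)**2
              + 1j * n(x + 1j * y)**2 - 1j * n(x - 1j * y)**2)
assert np.isclose(lhs, rhs)
```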