Generalized PCA

Generalized Principal Component Analysis (GPCA) is an algebraic subspace clustering technique based on polynomial fitting and differentiation [VMS03][VH04][HMV04][VMS05][VTH08]. The basic idea is that a union of subspaces can be represented as the zero set of a set of homogeneous polynomials. Once the set of polynomials has been fitted to the given dataset, the individual component subspaces can be identified via polynomial differentiation and division. See here for a quick review of the ideas from algebraic geometry which are used in the development of the GPCA algorithm.

We will assume that \(\UUU_k\) are linear subspaces. If they are affine, we simply take their homogeneous embeddings.

Representing the union of subspaces with a set of homogeneous polynomials

Consider the \(k\)-th subspace \(\UUU_k \subset \RR^M\) with dimension \(D_k\) and its orthogonal complement \(\UUU_k^{\perp}\) with dimension \(D'_k = M - D_k\). Choose a basis for \(\UUU_k^{\perp}\) as:

\[B_k = [b_{k_1}, \dots, b_{k_{D'_k}}] \in \RR^{M \times D'_k}.\]

Recall that for each \(y \in \UUU_k\), \(b_{k_i}^T y = 0\) as vectors in \(\UUU_k^{\perp}\) are orthogonal to vectors in \(\UUU_k\). Note that each of the forms \(b_{k_i}^T y\) is a homogeneous polynomial of degree 1. The solutions of \(b_{k_i}^T y = 0\) are (linear) hyperplanes of dimension \(M-1\) and the subspace \(\UUU_k\) is the intersection of these hyperplanes. In other words:

\[\begin{split}\begin{aligned} \UUU_k &= \{ y \in \RR^M : B_k^T y = 0 \}\\ &= \left \{ y \in \RR^M : \bigwedge_{i=1}^{D'_k} (b_{k_i}^T y = 0) \right \} . \end{aligned}\end{split}\]
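As a quick numerical sanity check of the membership test \(B_k^T y = 0\), consider a hypothetical 2-dimensional subspace of \(\RR^3\) (a plane through the origin) whose orthogonal complement is spanned by a single normal vector; the vector and test points below are purely illustrative:

```python
import numpy as np

# Illustrative basis of U_k^perp for a plane in R^3 (M = 3, D'_k = 1):
# the plane is { y : y_1 + y_2 - y_3 = 0 }.
B_k = np.array([[1.0], [1.0], [-1.0]])

def in_subspace(y, B, tol=1e-10):
    # y lies in U_k iff B^T y = 0, i.e. y is orthogonal
    # to every normal vector in the basis of U_k^perp.
    return bool(np.all(np.abs(B.T @ y) < tol))

y_in = np.array([1.0, 0.0, 1.0])   # satisfies y_1 + y_2 - y_3 = 0
y_out = np.array([1.0, 0.0, 0.0])
print(in_subspace(y_in, B_k), in_subspace(y_out, B_k))
```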

Note that \(y \in Z_{\UUU}\) if and only if \((y \in \UUU_1) \vee \dots \vee (y \in \UUU_K)\). Alternatively:

\[\bigvee_{k=1}^K (y\in \UUU_k) \Leftrightarrow \bigvee_{k=1}^K \bigwedge_{j=1}^{D'_k} (b_{k_j}^T y = 0) \Leftrightarrow \bigwedge_{\sigma} \bigvee_{k=1}^K (b_{k_{\sigma(k)}}^T y = 0)\]

where \(\sigma\) denotes an arbitrary choice of one normal vector \(b_{k_{\sigma(k)}}\) from each basis \(B_k\), and we consider all such choices. If \(y \in Z_{\UUU}\), it belongs to some \(\UUU_k\), and \(b_{k_i}^T y = 0\) for each \(b_{k_i}\) in \(B_k\). Hence, for every choice \(\sigma\), \(b_{k_{\sigma(k)}}^T y = 0\) and the RHS is true. Conversely, assume the RHS is true. If \(y \notin Z_{\UUU}\), then from each \(B_k\) we could pick one normal vector \(b\) such that \(b^T y \neq 0\); this choice would make the RHS false, a contradiction. Hence \(y \in Z_{\UUU}\). The total number of choices \(\sigma\) is \(\prod_{k=1}^K D'_k\). Interestingly:

\[\bigvee_{k=1}^K (b_{k_{\sigma(k)}}^T y = 0) \Leftrightarrow \left ( \prod_{k=1}^K (b_{k_{\sigma(k)}}^T y) = 0\right ) \Leftrightarrow (p^K_{\sigma}(y) = 0)\]

where \(p^K_{\sigma}(y)\) is a homogeneous polynomial of degree \(K\) in \(M\) variables.
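For instance, with \(K = 2\) lines in \(\RR^2\), a single product of linear forms already vanishes exactly on the union. A minimal numpy sketch (the normal vectors are hypothetical):

```python
import numpy as np

# Two lines in R^2, each given by one normal vector.
b1 = np.array([1.0, -1.0])   # normal of the line y_1 = y_2
b2 = np.array([1.0,  1.0])   # normal of the line y_1 = -y_2

def p(y):
    # Degree-2 homogeneous polynomial p(y) = (b1^T y)(b2^T y);
    # it is zero iff y lies on at least one of the two lines.
    return (b1 @ y) * (b2 @ y)

print(p(np.array([2.0, 2.0])))    # on the first line
print(p(np.array([3.0, -3.0])))   # on the second line
print(p(np.array([1.0, 2.0])))    # in neither line
```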

Therefore, a union of \(K\) subspaces can be represented as the zero set of a set of homogeneous polynomials of the form:

(1)\[p^K(y) = \prod_{k=1}^K (b_k^T y ) = c_K^T v_K(y),\]

where \(b_k \in \RR^M\) is a normal vector to the \(k\)-th subspace and \(v_K(y)\) is the Veronese embedding (see here) of \(y \in \RR^M\) into \(\RR^{A_{K}(M)}\). The problem of fitting \(K\) subspaces to the given dataset is then equivalent to the problem of fitting homogeneous polynomials \(p^K(y)\) such that all the points in the dataset belong to the zero set of these polynomials. Fitting such polynomials doesn’t require iterating between data segmentation and model estimation, since each polynomial depends on all the points in the dataset. Once the polynomials have been identified, the remaining task is to split their zero set into the individual subspaces identified by \(B_k\).
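The Veronese embedding itself is easy to compute: it stacks all degree-\(K\) monomials of \(y\) in a fixed order, giving a vector of length \(A_K(M) = \binom{M+K-1}{K}\). A minimal sketch (`veronese` is an illustrative helper, not a library function; the lexicographic monomial ordering is one conventional choice):

```python
import numpy as np
from itertools import combinations_with_replacement
from math import comb, prod

def veronese(y, K):
    # All degree-K monomials of y, in lexicographic order of index tuples;
    # the result has length A_K(M) = C(M + K - 1, K).
    M = len(y)
    return np.array([prod(y[i] for i in idx)
                     for idx in combinations_with_replacement(range(M), K)])

y = np.array([1.0, 2.0, 3.0])
v = veronese(y, 2)
# Monomials y1^2, y1*y2, y1*y3, y2^2, y2*y3, y3^2:
print(v)                         # [1. 2. 3. 4. 6. 9.]
print(len(v), comb(3 + 2 - 1, 2))
```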

In the following, we assume that the number of subspaces \(K\) is known beforehand. We consider the task of estimating \(K\) later.

Fitting polynomials to data

Let \(I(Z_{\UUU})\) be the vanishing ideal of \(Z_{\UUU}\). Since the number of subspaces \(K\) is known, we only need to consider the homogeneous component \(I_K\) of \(I(Z_{\UUU})\) (3).

The vanishing ideal \(I(\UUU_k)\) of \(\UUU_k\) is generated by the set of linear forms

\[\GGG_k = \{l(y) = b^T y, b \in B_k \}.\]

If the subspace arrangement is transversal, \(I_K\) is generated by products of \(K\) linear forms that vanish on the \(K\) subspaces. Any polynomial \(p(y) \in I_K\) can be written as a summation of products of linear forms

\[p(y) = \sum l_1 (y) l_2(y) \dots l_K(y)\]

where \(l_k(y)\) is a linear form in \(I(\UUU_k)\). Using the Veronese map, each polynomial in \(I_K\) can also be written as:

\[p(y) = c_K^T v_K(y) = \sum c_{k_1, \dots, k_M} y_1^{k_1} \dots y_M^{k_M}\]

where \(k_1 + \dots + k_M = K\) and \(c_{k_1, \dots, k_M} \in \RR\) represents the coefficient of monomial \(y^{\underline{K}} = y_1^{k_1} \dots y_M^{k_M}\). Fitting the polynomial \(p(y)\) is equivalent to identifying its coefficient vector \(c_K\). Since \(p(y) = 0\) is satisfied by each data point \(y_s \in Y\), we have \(c_K^T v_K(y_s) = 0\) for all \(s = 1, \dots, S\). We define

\[\begin{split}V_K(M) = \begin{bmatrix} v_K(y_1)^T\\ \vdots\\ v_K(y_S)^T \end{bmatrix} \in \RR^{S \times A_K(M) }\end{split}\]

as the embedded data matrix. Then, we have

\[V_K(M) c_K = 0 \in \RR^S.\]

The coefficient vector \(c_K\) of every polynomial in \(I_K\) is in the null space of \(V_K(M)\). To ensure that every polynomial obtained from \(V_K(M)\) is in \(I_K\), we require that

\[\text{dim} (\NullSpace (V_K(M))) = \text{dim} (I_K) = h_I(K)\]

where \(h_I\) is the Hilbert function of \(I(Z_{\UUU})\) (2). Equivalently, the rank of \(V_K(M)\) needs to satisfy:

\[\text{rank}(V_K(M)) = A_K(M) - h_I(K).\]

This condition is typically satisfied with \(S \geq (A_K(M) - 1)\) points in general position. Assuming this, a basis for \(I_K\) can be constructed from the set of \(h_I(K)\) singular vectors of \(V_K(M)\) associated with its \(h_I(K)\) zero singular values. In the presence of moderate noise, we can still estimate the coefficients of the polynomials in the least squares sense from the singular vectors associated with the \(h_I(K)\) smallest singular values.
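The fitting step can be put together end to end on a toy arrangement of two lines in \(\RR^2\) (all data and helper names below are illustrative): build the embedded data matrix \(V_K(M)\), take its SVD, and read a coefficient vector \(c_K\) off the right singular vector associated with the (near-)zero singular value:

```python
import numpy as np
from itertools import combinations_with_replacement
from math import prod

def veronese(y, K):
    # Degree-K monomials of y in lexicographic order (illustrative helper).
    M = len(y)
    return np.array([prod(y[i] for i in idx)
                     for idx in combinations_with_replacement(range(M), K)])

# Sample points from the union of the lines y_1 = y_2 and y_1 = -y_2.
t = np.linspace(-2.0, 2.0, 5)
Y = np.vstack([np.stack([t, t], axis=1),     # points on the first line
               np.stack([t, -t], axis=1)])   # points on the second line

# Embedded data matrix V_K(M): one Veronese-embedded point per row.
V = np.vstack([veronese(y, 2) for y in Y])   # shape (S, A_K(M)) = (10, 3)

# The coefficient vector c_K lies in the null space of V, i.e. it is the
# right singular vector for the smallest singular value.
_, s, Vt = np.linalg.svd(V)
c = Vt[-1]
c = c / c[0]            # fix scale and sign for readability
print(np.round(s, 6))   # smallest singular value is (numerically) zero
print(c)                # proportional to (1, 0, -1): p(y) = y_1^2 - y_2^2
```

The recovered polynomial \(y_1^2 - y_2^2 = (y_1 - y_2)(y_1 + y_2)\) is exactly the product of the two linear forms defining the lines, as Eq. (1) predicts.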

Subspaces by polynomial differentiation

Now that we have obtained a basis for the polynomials in \(I_K\), the next step is to calculate the basis vectors \(B_k\) for each \(\UUU_k^{\perp}\).