ADMM for Computing Subspace Preserving Representations
First, we solve the basic program
\[\min \| C \|_{1,1} \quad \text{ s.t. } Y = Y C, \; \Diag(C) = 0.\]
With an affine constraint, we solve the program
\[\min \| C \|_{1,1} \quad \text{ s.t. } Y = Y C, \;
C^T \OneVec = \OneVec, \; \Diag(C) = 0.\]
Allowing a noise term \(Z\), we solve the program
\[\min \| C \|_{1,1} + \frac{\lambda}{2} \| Z \|_F^2
\quad \text{ s.t. } Y = Y C + Z, \;
\Diag(C) = 0.\]
Eliminating \(Z\) (substituting \(Z = Y - YC\)) gives the program
\[\min \| C \|_{1,1} + \frac{\lambda}{2} \| Y - YC \|_F^2
\quad \text{ s.t. } \Diag(C) = 0.\]
Introduce an auxiliary matrix \(A\) and consider the program
\[\min \| C \|_{1,1} + \frac{\lambda}{2} \| Y - YA \|_F^2
\quad \text{ s.t. } A = C - \Diag(C).\]
The two programs have the same solution: zeroing the diagonal of \(C\) leaves \(A = C - \Diag(C)\) unchanged while reducing \(\| C \|_{1,1}\), so any minimizer has \(\Diag(C) = 0\) and \(A = C\).
Add a quadratic penalty term for the equality constraint:
\[\min \| C \|_{1,1} + \frac{\lambda}{2} \| Y - YA \|_F^2
+ \frac{\rho}{2} \| A - (C - \Diag(C)) \|_F^2
\quad \text{ s.t. } A = C - \Diag(C).\]
Adding the penalty term does not change the solution, since it
vanishes for every feasible point. The objective is now strongly
convex in \(A\), so the \(A\)-subproblem has a unique solution.
Constructing the augmented Lagrangian
\[\LLL(C, A, \Delta) =\| C \|_{1,1} + \frac{\lambda}{2} \| Y - YA \|_F^2
+ \frac{\rho}{2} \| A - (C - \Diag(C)) \|_F^2
+ \Trace(\Delta^T (A - (C - \Diag(C)))).\]
Each iteration of the ADMM algorithm consists of:
- Update \(A^{k+1}\) based on \((C^k, \Delta^k)\)
- Update \(C^{k+1}\) based on \((A^{k+1}, \Delta^k)\)
- Update \(\Delta^{k+1}\) based on \((A^{k+1}, C^{k+1})\)
\(A\)-update
\(\LLL(C, A, \Delta)\) is quadratic in \(A\).
Setting its derivative with respect to \(A\) to zero gives:
\[\lambda Y^T (YA - Y) + \rho (A - (C - \Diag(C))) + \Delta = 0.\]
Rearranging the terms we get:
\[(\lambda Y^T Y + \rho I )A =
\lambda Y^T Y + \rho (C - \Diag(C)) - \Delta.\]
Note that, by construction, we will ensure that \(\Diag(C) = 0\)
at each iteration. Thus, \(A\) is updated by
solving the equation:
\[(\lambda Y^T Y + \rho I )A^{k+1} =
\lambda Y^T Y + \rho C^k - \Delta^k.\]
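As a concrete sketch (assuming NumPy; the function name `a_update` is illustrative), the \(A\)-update is a single linear solve:

```python
import numpy as np

def a_update(Y, C, Delta, lam, rho):
    """Solve (lam * Y^T Y + rho * I) A = lam * Y^T Y + rho * C - Delta,
    assuming Diag(C) = 0 has been maintained by the C-update."""
    n = Y.shape[1]
    G = lam * (Y.T @ Y)                   # lam * Y^T Y, fixed across iterations
    return np.linalg.solve(G + rho * np.eye(n), G + rho * C - Delta)
```

Since \(\lambda Y^T Y + \rho I\) is the same in every iteration, a practical implementation would factor it once (e.g. by Cholesky) and reuse the factors.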
\(C\)-update
Since \(\Diag(C) = 0\) holds at every iteration, we can drop the
\(\Diag(C)\) terms and rewrite the augmented Lagrangian as:
\[\LLL(C, A, \Delta) = \| C \|_{1,1} + \frac{\lambda}{2} \| Y - YA \|_F^2
+ \frac{\rho}{2} \| A - C \|_F^2
+ \Trace(\Delta^T (A - C)).\]
Removing the terms that do not depend on \(C\), the \(C\)-update solves
\[\min \| C \|_{1,1} + \frac{\rho}{2} \| C - A\|_F^2 - \Trace(\Delta^T C ).\]
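To see why soft thresholding solves this problem, complete the square in the quadratic and linear terms; dropping constants that do not depend on \(C\),
\[\frac{\rho}{2} \| C - A\|_F^2 - \Trace(\Delta^T C)
= \frac{\rho}{2} \left\| C - \left( A + \frac{\Delta}{\rho} \right) \right\|_F^2
+ \text{const},\]
so the subproblem is a proximal step for the \(\ell_1\) norm at \(A + \Delta / \rho\).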
A closed-form solution for this minimizer is
\[J = S_{\frac{1}{\rho}} (A^{k+1} + \Delta^k / \rho)\]
where \(S_\tau\) is the elementwise soft-thresholding (shrinkage)
operator \(S_\tau(x) = \operatorname{sign}(x) \max(|x| - \tau, 0)\).
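A minimal NumPy sketch of this operator (the name `soft_threshold` is illustrative):

```python
import numpy as np

def soft_threshold(X, tau):
    """Elementwise soft-thresholding: S_tau(x) = sign(x) * max(|x| - tau, 0)."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)
```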
We obtain \(C\) from \(J\) by:
\[C^{k+1} = J - \Diag(J).\]
Note that this construction ensures that \(\Diag(C)\) is always 0.
\(\Delta\)-update
We perform the gradient ascent update of \(\Delta\) as
\[\Delta^{k+1} = \Delta^k + \rho (A^{k+1} - C^{k+1}).\]
Allowing both a sparse corruption term \(E\) and a noise term \(Z\), we solve the program
\[\min \| C \|_{1,1} + \lambda_e \| E \|_{1,1} + \frac{\lambda_z}{2} \| Z \|_F^2
\quad \text{ s.t. } Y = Y C + E + Z, \;
\Diag(C) = 0.\]
Adding the affine constraint to the noisy program, we solve the program
\[\min \| C \|_{1,1} + \frac{\lambda}{2} \| Z \|_F^2
\quad \text{ s.t. } Y = Y C + Z, \;
C^T \OneVec = \OneVec, \; \Diag(C) = 0.\]
Eliminating \(Z\) (substituting \(Z = Y - YC\)) gives the program
\[\min \| C \|_{1,1} + \frac{\lambda}{2} \| Y - YC \|_F^2
\quad \text{ s.t. } C^T \OneVec = \OneVec, \; \Diag(C) = 0.\]
Introduce an auxiliary matrix \(A\) and consider the program
\[\min \| C \|_{1,1} + \frac{\lambda}{2} \| Y - YA \|_F^2
\quad \text{ s.t. } A^T \OneVec = \OneVec, \; A = C - \Diag(C).\]
The two programs have the same solution: zeroing the diagonal of \(C\) leaves \(A = C - \Diag(C)\) (and hence the affine constraint on \(A\)) unchanged while reducing \(\| C \|_{1,1}\), so any minimizer has \(\Diag(C) = 0\) and \(A = C\).
Add quadratic penalty terms for the two equality constraints:
\[\min \| C \|_{1,1} + \frac{\lambda}{2} \| Y - YA \|_F^2
+ \frac{\rho}{2} \| A^T \OneVec - \OneVec \|_2^2
+ \frac{\rho}{2} \| A - (C - \Diag(C)) \|_F^2
\quad \text{ s.t. } A^T \OneVec = \OneVec, \; A = C - \Diag(C).\]
Adding the penalty terms does not change the solution, since they
vanish for every feasible point. The objective is now strongly
convex in \(A\), so the \(A\)-subproblem has a unique solution.
Constructing the augmented Lagrangian by adding the multipliers
for the two equality terms
\[\begin{split}\begin{aligned}
\LLL(C, A, \delta, \Delta) &= \| C \|_{1,1} + \frac{\lambda}{2} \| Y - YA \|_F^2\\
&+ \frac{\rho}{2} \| A^T \OneVec - \OneVec \|_2^2
+ \frac{\rho}{2} \| A - (C - \Diag(C)) \|_F^2\\
&+ \delta^T (A^T \OneVec - \OneVec)
+ \Trace(\Delta^T (A - (C - \Diag(C)))).
\end{aligned}\end{split}\]
Each iteration of the ADMM algorithm consists of:
- Update \(A^{k+1}\) based on \((C^k, \delta^k, \Delta^k)\)
- Update \(C^{k+1}\) based on \((A^{k+1}, \Delta^k)\)
- Update \(\delta^{k+1}\) based on \(A^{k+1}\)
- Update \(\Delta^{k+1}\) based on \((A^{k+1}, C^{k+1})\)
\(A\)-update
\(\LLL(C, A, \delta, \Delta)\) is again quadratic in \(A\). Setting its derivative with respect to \(A\) to zero gives:
\[\lambda Y^T (YA - Y) + \rho (\OneVec \OneVec^T) A - \rho \OneVec \OneVec^T
+ \rho (A - (C - \Diag(C))) + \OneVec \delta^T + \Delta = 0.\]
Rearranging the terms we get:
\[(\lambda Y^T Y + \rho I + \rho \OneVec \OneVec^T)A =
\lambda Y^T Y + \rho \OneVec \OneVec^T + \rho (C - \Diag(C)) - \OneVec \delta^T - \Delta.\]
By construction, we will ensure that \(\Diag(C) = 0\)
at each iteration. Thus, \(A\) is updated by
solving the equation:
\[(\lambda Y^T Y + \rho I + \rho \OneVec \OneVec^T)A^{k+1} =
\lambda Y^T Y + \rho (\OneVec \OneVec^T + C^k)
- \OneVec {\delta^k}^T - \Delta^k.\]
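A sketch of this modified \(A\)-update in NumPy (names illustrative; `ones` plays the role of \(\OneVec\) and `delta` has shape \((n, 1)\)):

```python
import numpy as np

def a_update_affine(Y, C, delta, Delta, lam, rho):
    """Solve (lam Y^T Y + rho I + rho 1 1^T) A
         = lam Y^T Y + rho (1 1^T + C) - 1 delta^T - Delta,
    assuming Diag(C) = 0 has been maintained by the C-update."""
    n = Y.shape[1]
    ones = np.ones((n, 1))
    G = lam * (Y.T @ Y)
    M = G + rho * np.eye(n) + rho * (ones @ ones.T)
    rhs = G + rho * (ones @ ones.T + C) - ones @ delta.T - Delta
    return np.linalg.solve(M, rhs)
```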
\(C\)-update
Since \(\Diag(C) = 0\) holds at every iteration, we can drop the
\(\Diag(C)\) terms and rewrite the augmented Lagrangian as:
\[\begin{split}\begin{aligned}
\LLL(C, A, \delta, \Delta) &= \| C \|_{1,1} + \frac{\lambda}{2} \| Y - YA \|_F^2\\
&+ \frac{\rho}{2} \| A^T \OneVec - \OneVec \|_2^2
+ \frac{\rho}{2} \| A - C \|_F^2\\
&+ \delta^T (A^T \OneVec - \OneVec)
+ \Trace(\Delta^T (A - C)).
\end{aligned}\end{split}\]
Removing the terms that do not depend on \(C\), the \(C\)-update solves
\[\min \| C \|_{1,1} + \frac{\rho}{2} \| A - C \|_F^2
- \Trace(\Delta^T C).\]
A closed-form solution for the minimizer of this expression is
\[J = S_{\frac{1}{\rho}} (A^{k+1} + \Delta^k / \rho)\]
We obtain \(C\) from \(J\) by:
\[C^{k+1} = J - \Diag(J).\]
\(\delta\) and \(\Delta\) updates
We perform the gradient ascent update of \(\delta\) as
\[\delta^{k+1} = \delta^k + \rho ({A^{k+1}}^T \OneVec - \OneVec).\]
We perform the gradient ascent update of \(\Delta\) as
\[\Delta^{k+1} = \Delta^k + \rho (A^{k+1} - C^{k+1}).\]
In summary, the following changes were needed in the ADMM iterations:
- The \(A\)-update step was adjusted to account for the affine constraint.
- A \(\delta\)-update step was introduced for the new multiplier.
Finally, combining the affine constraint with both corruption terms \(E\) and \(Z\), we solve the program
\[\min \| C \|_{1,1} + \lambda_e \| E \|_{1,1} + \frac{\lambda_z}{2} \| Z \|_F^2
\quad \text{ s.t. } Y = Y C + E + Z, \;
C^T \OneVec = \OneVec, \; \Diag(C) = 0.\]