Algorithms

Here we explain the setup and algorithm when the data points reside on linear subspaces. The algorithm can also cluster affine subspaces with simple modifications (see Section 2.2 of Robust Subspace Clustering for further detail).

Setup and notation

We observe $N$ data points $y_1, y_2, \ldots, y_N$ in $n$ dimensions, normalized to have unit Euclidean norm.

Noisy problem

• $Y = X + Z$, where $X = [x_1, \ldots, x_N]$ holds the clean data points. The columns of $X$ belong to a union of $L$ subspaces of unknown dimensions $d_1, \ldots, d_L$.

• $Z$ is the noise; each column has bounded Euclidean norm.

Missing data problem

• $Y = P_\Omega(X)$, where $X = [x_1, \ldots, x_N]$ holds the clean data points. The columns of $X$ belong to a union of $L$ subspaces of unknown dimensions $d_1, \ldots, d_L$.

• $P_\Omega$ is an operator that zeros out each entry independently with probability $\gamma$.

General Structure

The general structure of the algorithm follows the standard spectral clustering machinery:

1. Construct an affinity matrix $W$ between the samples, producing a weighted graph $G$.

2. Construct clusters by applying spectral clustering to $G$.

3. Apply PCA to each cluster.
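Steps 2 and 3 can be sketched with off-the-shelf tools. This is a minimal illustration, assuming the affinity matrix `W` from step 1, the number of clusters `L`, and a target subspace dimension `d` are given; `scikit-learn` supplies the spectral clustering and PCA routines:

```python
import numpy as np
from sklearn.cluster import SpectralClustering
from sklearn.decomposition import PCA

def cluster_and_fit(W, L, Y, d):
    """Steps 2-3: spectral clustering on the affinity W, then PCA per cluster.

    W : (N, N) symmetric nonnegative affinity matrix (output of step 1)
    L : number of clusters
    Y : (n, N) data matrix (one data point per column)
    d : target subspace dimension for the per-cluster PCA
    """
    # Step 2: spectral clustering on the precomputed affinity graph.
    labels = SpectralClustering(n_clusters=L,
                                affinity="precomputed").fit_predict(W)
    # Step 3: PCA on each recovered cluster gives a basis estimate
    # for the corresponding subspace.
    bases = []
    for l in range(L):
        cluster = Y[:, labels == l].T          # samples x features for PCA
        pca = PCA(n_components=min(d, len(cluster) - 1)).fit(cluster)
        bases.append(pca.components_.T)        # columns span the estimate
    return labels, bases
```

With a well-separated affinity (nearly block diagonal after permutation), the spectral step recovers the blocks and each PCA returns an orthonormal basis estimate for one subspace.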

Following Sparse Subspace Clustering (SSC) of Elhamifar and Vidal, we use ideas from sparse representation theory and sparse regression to compute affinities.

Sparse regression

We first calculate a sparse coefficient sequence $\beta^{(i)}$ by regressing the $i$-th data point $y_i$ onto all the other data points $\{y_j\}_{j \ne i}$. The hope is that such a sparse representation of $y_i$ selects only vectors from the cluster to which $y_i$ belongs. This is depicted in the figure below, with each color denoting points from a different cluster.
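As a rough illustration of this step, a plain lasso with a fixed placeholder penalty `lam` can stand in for the more refined strategies below; the symmetrization $|B| + |B|^\top$ is the common SSC-style way to turn the coefficients into an affinity:

```python
import numpy as np
from sklearn.linear_model import Lasso

def sparse_coefficients(Y, lam=0.05):
    """Regress each column y_i onto all other columns with an l1 penalty.

    Returns B with B[:, i] = beta^(i) and a zero diagonal, so a point
    never represents itself.  (Illustrative: lam is a placeholder for
    the data-driven choices described below.)
    """
    n, N = Y.shape
    B = np.zeros((N, N))
    for i in range(N):
        mask = np.arange(N) != i
        lasso = Lasso(alpha=lam, fit_intercept=False, max_iter=10000)
        lasso.fit(Y[:, mask], Y[:, i])
        B[mask, i] = lasso.coef_
    return B

def affinity(B):
    """One common symmetrization of the coefficients into an affinity."""
    return np.abs(B) + np.abs(B).T
```

When the subspaces are well separated, the nonzero coefficients for each point concentrate on points from its own cluster, which is exactly what makes the affinity nearly block diagonal.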

We propose two strategies for the sparse regression step below.

• Two step procedure with data-driven regularization.

• Bias-corrected Dantzig Selector.

Building a graph based on the sparse coefficients

We set $W_{ij} = |\beta^{(i)}_j| + |\beta^{(j)}_i|$, so that $W$ is symmetric. After applying a permutation which makes sure that columns in the same cluster are contiguous, we expect the affinity matrix to look like the figure below, where each block corresponds to a different cluster.

Two Step Procedure with Data Driven Regularization

For each data point $y_i$ we compute the sparse coefficients in two steps: a first $\ell_1$-penalized regression produces a preliminary solution, and that solution is used to choose the regularization parameter of a second, final regression. In this way the amount of regularization is chosen in a data-driven fashion, separately for each data point.
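A simplified sketch of such a two-step scheme: here step 1 selects a support with a conservative lasso penalty `lam0`, and step 2 refits by least squares on that support. Both choices are illustrative stand-ins, not the paper's exact data-driven rule:

```python
import numpy as np
from sklearn.linear_model import Lasso

def two_step(Y, i, lam0=0.1):
    """Simplified two-step sparse regression for column i of Y.

    Step 1: an l1-penalized regression with a conservative penalty lam0
    selects a candidate support.  Step 2: ordinary least squares on that
    support removes the lasso shrinkage bias.  The paper's data-driven
    rule for the penalty is more refined; lam0 here is a placeholder.
    """
    N = Y.shape[1]
    mask = np.arange(N) != i
    step1 = Lasso(alpha=lam0, fit_intercept=False, max_iter=10000)
    step1.fit(Y[:, mask], Y[:, i])
    support = np.flatnonzero(step1.coef_)
    beta = np.zeros(N)
    if support.size:
        cols = Y[:, mask][:, support]
        coef, *_ = np.linalg.lstsq(cols, Y[:, i], rcond=None)
        beta[np.flatnonzero(mask)[support]] = coef
    return beta
```

The refit in step 2 is one standard way to debias the shrunken lasso coefficients on the selected support.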

Bias-corrected Dantzig Selector

Again we regress one column against the others, this time by the Dantzig Selector:

$\min_{\beta} \|\beta\|_1$ subject to $\|X^\top (x_i - X\beta)\|_\infty \le \lambda$ and $\beta_i = 0$.

In practice we only get to observe the noisy versions $y_i = x_i + z_i$ and $Y = X + Z$.

Solution: build unbiased estimators for $X^\top X$ and $X^\top x_i$. Set

$\hat{\Gamma} = Y^\top Y - \sigma^2 I \quad \text{and} \quad \hat{\gamma}^{(i)} = Y^\top y_i - \sigma^2 e_i$

(the noise inflates the Gram matrix by $\sigma^2 I$ in expectation). Finally, we arrive at the following sequence of optimization problems:

$\min_{\beta} \|\beta\|_1$ subject to $\|\hat{\gamma}^{(i)} - \hat{\Gamma}\beta\|_\infty \le \lambda$ and $\beta_i = 0$.

Choice of parameters: $\lambda$ is chosen proportional to the noise level (see Robust Subspace Clustering for the exact value).
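For illustration, the resulting optimization can be solved as a linear program: writing $\beta = u - v$ with $u, v \ge 0$ turns the $\ell_\infty$-constrained $\ell_1$ minimization into an LP. A minimal sketch, assuming column-wise noise with $\mathbb{E}\|z_i\|_2^2 = \sigma^2$ so the Gram matrix is debiased by subtracting $\sigma^2$ from its diagonal:

```python
import numpy as np
from scipy.optimize import linprog

def bias_corrected_dantzig(Y, i, sigma2, lam):
    """Bias-corrected Dantzig selector for column i of the noisy matrix Y.

    Under column-wise noise with E||z||^2 = sigma2, E[Y^T Y] exceeds
    X^T X by sigma2 * I, so we subtract sigma2 on the diagonal.  The
    l1 minimization under the l_inf constraint is solved as a linear
    program in (u, v) with beta = u - v and u, v >= 0.
    """
    N = Y.shape[1]
    G = Y.T @ Y - sigma2 * np.eye(N)      # debiased estimate of X^T X
    mask = np.arange(N) != i
    Gi = G[np.ix_(mask, mask)]            # \hat Gamma with column i removed
    gi = G[mask, i]                       # \hat gamma^{(i)}
    m = N - 1
    c = np.ones(2 * m)                    # objective: sum(u) + sum(v) = ||beta||_1
    A = np.block([[Gi, -Gi], [-Gi, Gi]])  # encodes |gi - Gi(u - v)| <= lam
    b = np.concatenate([gi + lam, lam - gi])
    res = linprog(c, A_ub=A, b_ub=b, bounds=(0, None), method="highs")
    beta = np.zeros(N)
    if res.success:
        beta[mask] = res.x[:m] - res.x[m:]
    return beta
```

Any LP solver works here; `highs` is the default exact method shipped with recent SciPy.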

Bias-corrected Dantzig Selector for missing data

Let $\omega_i$ denote the set of rows observed in the $i$-th column. $x_{\omega_i}$ denotes the corresponding observations from the $i$-th column of the clean data matrix $X$, and $X_{\omega_i}$ denotes the submatrix of $X$ with rows selected by $\omega_i$.

Again we regress one column against the others, this time by the Dantzig Selector:

$\min_{\beta} \|\beta\|_1$ subject to $\|X_{\omega_i}^\top (x_{\omega_i} - X_{\omega_i}\beta)\|_\infty \le \lambda$ and $\beta_i = 0$.

In practice we only get to observe a few entries.

Solution: again build unbiased estimators for $X_{\omega_i}^\top X_{\omega_i}$ and $X_{\omega_i}^\top x_{\omega_i}$.

We build a matrix $\tilde{Y}$ based on the observed entries of $Y$ as follows. Set $\tilde{Y}_{jk} = Y_{jk}$ if entry $(j,k)$ is observed and $\tilde{Y}_{jk} = 0$ if it is missing.

Set $\hat{\Gamma}^{(i)} = \tilde{Y}^\top \tilde{Y}$ with the $i$-th row and column set to zero.

Set $\hat{\gamma}^{(i)} = \tilde{Y}^\top \tilde{y}_i$ with the $i$-th entry set to zero.

Finally, we arrive at

$\min_{\beta} \|\beta\|_1$ subject to $\|\hat{\gamma}^{(i)} - \hat{\Gamma}^{(i)}\beta\|_\infty \le \lambda$.

Choice of parameters: $\lambda$ depends on the missing rate $\gamma$, where $\gamma =$ (number of missing entries)/(total number of entries).
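A sketch of the zero-filling construction. The rescaling of entries by the observed fraction is an added assumption (a Loh–Wainwright-style inverse-propensity correction), labeled as such in the code:

```python
import numpy as np

def missing_data_gram(Y_obs, mask, i):
    """Zero-filled Gram-matrix estimates for the missing-data problem.

    Y_obs : data matrix with unobserved entries set to 0
    mask  : boolean matrix, True where the entry was observed
    i     : index of the column being regressed

    The rescaling by the observed fraction rho is an assumption
    (inverse-propensity correction in the style of Loh and Wainwright):
    off-diagonal Gram entries involve two independent masks, diagonal
    entries only one.
    """
    rho = mask.mean()                     # observed fraction = 1 - gamma
    G = Y_obs.T @ Y_obs
    Gc = G / rho**2                       # correct off-diagonal entries
    np.fill_diagonal(Gc, np.diag(G) / rho)
    gamma_i = Gc[:, i].copy()
    Gamma_i = Gc.copy()
    Gamma_i[i, :] = 0.0                   # i-th row and column set to zero
    Gamma_i[:, i] = 0.0
    gamma_i[i] = 0.0                      # i-th entry set to zero
    return Gamma_i, gamma_i
```

These estimates plug directly into the $\ell_\infty$-constrained $\ell_1$ problem above, which can again be solved as a linear program.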