Utility Functions for Clustering Experiments
We provide a few utility functions that help in setting up clustering experiments.
Suppose you stack data vectors from different clusters column-wise in a matrix, one cluster after another, and you wish to assign a label to each column of the matrix. We provide a function that generates such labels automatically from the cluster sizes.
Let’s choose some cluster sizes:
>> cluster_sizes = [ 4 3 3 2];
Let’s generate labels for these clusters:
>> labels = spx.cluster.labels_from_cluster_sizes(cluster_sizes)
labels =
1 1 1 1 2 2 2 3 3 3 4 4
Notice that the first 4 labels are 1, the next 3 are 2, the next 3 are 3, and the final 2 are 4.
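For readers who want to see what this mapping amounts to, the following is a minimal sketch that reproduces the same label vector with the built-in repelem function (available from MATLAB R2015a). It is not the library's implementation; the variable names cluster_ids and labels_sketch are hypothetical and chosen only for illustration.

```matlab
% Sketch only: rebuild the label vector from cluster sizes with built-in calls.
% Assumption: cluster k occupies a contiguous run of cluster_sizes(k) columns.
cluster_sizes = [4 3 3 2];

% One id per cluster: 1, 2, ..., K (hypothetical helper variable)
cluster_ids = 1:numel(cluster_sizes);

% Repeat each id as many times as its cluster has members
labels_sketch = repelem(cluster_ids, cluster_sizes);

disp(labels_sketch)
```

The length of the result equals sum(cluster_sizes), that is, one label per column of the stacked data matrix.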
Let’s randomly reorder the labels. Shuffling is a typical step before running a clustering algorithm, so that any inherent order in the data is destroyed.
>> labels = labels(randperm(numel(labels)))
labels =
3 2 2 3 4 3 1 1 4 1 2 1
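When the labels accompany a data matrix, they stay aligned with the columns only if both are reordered with the same permutation. The sketch below illustrates this under stated assumptions: the matrix X is hypothetical toy data, its first row stores the true label purely so the alignment can be checked, and the names perm, X_shuffled and labels_shuffled are illustrative.

```matlab
% Sketch only: shuffle data columns and labels with one shared permutation.
cluster_sizes = [4 3 3 2];
labels = repelem(1:numel(cluster_sizes), cluster_sizes);
num_points = numel(labels);

% Hypothetical data: 3 x 12 matrix whose first row records the true label
X = [labels; randn(2, num_points)];

% Draw the permutation once and reuse it for both the data and the labels
perm = randperm(num_points);
X_shuffled = X(:, perm);
labels_shuffled = labels(perm);

% The label stored in each column still matches the shuffled label vector
assert(isequal(X_shuffled(1, :), labels_shuffled));
```

Storing perm also makes it possible to undo the shuffle later, for example when comparing the output of a clustering algorithm against the original column order.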
It is often useful to recover the size of each cluster from a label vector. We provide a function for that:
>> spx.cluster.cluster_sizes_from_labels(labels)
ans =
4 3 3 2
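The output above can be cross-checked with built-in MATLAB alone. The following is a minimal sketch using accumarray, assuming the labels are positive integers 1, 2, ..., K. It is not the library's implementation, and the name sizes_sketch is hypothetical.

```matlab
% Sketch only: count how many times each label occurs.
% Assumption: labels are positive integers in the range 1..K.
labels = [3 2 2 3 4 3 1 1 4 1 2 1];

% accumarray adds 1 into bin k for every occurrence of label k;
% the transpose turns the resulting column vector into a row vector
sizes_sketch = accumarray(labels(:), 1)';

disp(sizes_sketch)
```

Since the counts depend only on how often each label occurs, the result is the same for the shuffled and the unshuffled label vector.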