Then, we applied the mixture of Gaussians algorithm to find 20 clusters. Finally, we inspected which faces fall into which cluster. Because we included horizontally mirrored features in the training data, not all clusters contain actual data points.
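The clustering step can be sketched roughly as follows. This is only an illustration under assumptions, not the code we ran: it uses scikit-learn's GaussianMixture, random synthetic landmark positions in place of the real annotations, a hypothetical feature layout (left eye, right eye, nose, mouth, with x centered on zero), and it reads the horizontally symmetric features as mirrored copies of each face. The helper name mirror_horizontally is made up for the example.

```python
import numpy as np
from sklearn.mixture import GaussianMixture


def mirror_horizontally(landmarks):
    """Mirror landmarks left-to-right.

    landmarks: array of shape (n, 4, 2), ordered (assumed layout)
    left eye, right eye, nose, mouth, with x centered on 0.
    """
    mirrored = landmarks.copy()
    mirrored[:, :, 0] *= -1.0                   # flip the x coordinate
    mirrored[:, [0, 1]] = mirrored[:, [1, 0]]   # swap left and right eye
    return mirrored


# Synthetic stand-in for the annotated feature positions (assumption).
rng = np.random.default_rng(0)
n_faces = 300
base = np.array([[-1.0, 1.0], [1.0, 1.0], [0.0, 0.0], [0.0, -1.0]])
landmarks = base + rng.normal(scale=0.5, size=(n_faces, 4, 2))

# Training set: the actual faces plus their mirrored copies.
augmented = np.concatenate([landmarks, mirror_horizontally(landmarks)])
X = augmented.reshape(len(augmented), -1)       # one 8-D vector per face

# Mixture of Gaussians with 20 components, as in the text.
gmm = GaussianMixture(n_components=20, covariance_type="diag",
                      max_iter=200, random_state=0).fit(X)

# Assign only the actual (unmirrored) faces to clusters and count them.
labels = gmm.predict(X[:n_faces])
counts = np.bincount(labels, minlength=20)
for k, c in enumerate(counts):
    print(f"cluster {k:2d}: {c} actual faces")
```

A cluster whose count is zero here was fit only to mirrored copies, which is one way a cluster can end up without any actual data points.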
This approach has an interesting side effect: it clusters together very different head poses that have similar appearances, i.e., the eyes, nose, and mouth are in similar positions even though the head is in a very different pose. This may be desirable for training a face detector, because the detector can focus on features such as the eyes and mouth.
The goal of this clustering is to improve the algorithm for choosing face boxes based on feature positions, so that the training images are maximally similar to each other. This is expected to make the cascade easier to train.