INRIA Rennes | PANAMA team

SketchMLbox

A MATLAB toolbox for large-scale mixture learning


Purpose:

SketchMLbox is a MATLAB toolbox for fitting mixture models to large databases using sketching techniques.
The database is first compressed into a single vector called a sketch, then a mixture model (e.g. a Gaussian Mixture Model) is estimated from this sketch using greedy algorithms typical of sparse recovery.
The size of the sketch does not depend on the number of elements in the database, but rather on the complexity of the problem at hand [2,3]. Its computation can be massively parallelized and distributed over several units. It can also be maintained in an online setting at low cost.
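
The properties above can be illustrated with a few lines of plain MATLAB. This is only a minimal sketch and does not use the toolbox's functions: it assumes the sketch is an average of random Fourier features with Gaussian frequencies, and the names W, sketchOf, zMerged and zOnline are hypothetical, chosen for this example.

```matlab
% Illustrative only: this is NOT the SketchMLbox API.
% Assumption: the sketch is an average of random Fourier features,
%   z = (1/n) * sum_i exp(1i * W' * x_i),  with Gaussian frequencies W.
d = 2;  n = 1000;  m = 20;          % dimension, database size, sketch size
X = randn(d, n);                    % toy database, one item per column
W = randn(d, m);                    % hypothetical frequency matrix

% Sketching operator: maps any d-by-n database to an m-by-1 complex vector.
sketchOf = @(Y) mean(exp(1i * (W' * Y)), 2);

z = sketchOf(X);                    % length m, whatever the value of n

% Distributed computation: sketch two chunks separately, then merge them
% with an average weighted by the chunk sizes.
n1 = 400;  n2 = n - n1;
z1 = sketchOf(X(:, 1:n1));
z2 = sketchOf(X(:, n1+1:end));
zMerged = (n1 * z1 + n2 * z2) / n;

% Online update: fold one new item into the running sketch.
xNew = randn(d, 1);
zOnline = (n * z + exp(1i * (W' * xNew))) / (n + 1);
zBatch  = sketchOf([X, xNew]);      % same result, recomputed from scratch
```

Because the sketch is an average, zMerged agrees with z and zOnline agrees with zBatch up to floating-point rounding, which is what makes the distributed and online settings cheap.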

Mixtures of Diracs ("K-means") and Gaussian Mixture Models with diagonal covariance are currently available; the toolbox is structured so that new mixture models can be easily implemented.
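
To make the mixture-of-Diracs case concrete, the lines below compute the sketch of such a mixture and compare it with the sketch of a toy database. This is a hedged illustration in plain MATLAB, not the toolbox's code: it assumes a random Fourier sketch, and W, C, alpha, zModel and zData are hypothetical names and values.

```matlab
% Illustrative only: this is NOT the SketchMLbox API.
% Assumption: random Fourier sketch with a hypothetical frequency matrix W.
d = 2;  m = 20;
W = randn(d, m);
C = [0 4 -4; 0 4 4];                % hypothetical centroids, one per column
alpha = [0.5; 0.3; 0.2];            % mixture weights, summing to 1

% Sketch of the mixture sum_k alpha_k * delta_{c_k}: the weighted sum of
% the Fourier features of the centroids.
zModel = exp(1i * (W' * C)) * alpha;            % m-by-1

% Toy database drawn exactly from that mixture (points sit on the centroids).
counts = [500 300 200];
X = [repmat(C(:, 1), 1, counts(1)), ...
     repmat(C(:, 2), 1, counts(2)), ...
     repmat(C(:, 3), 1, counts(3))];
zData = mean(exp(1i * (W' * X)), 2);            % empirical sketch of the data

% Distance between the two sketches: the kind of quantity a
% sketch-matching fit would try to make small.
residual = norm(zData - zModel);
```

Here the residual is zero up to rounding because the data lie exactly on the centroids; with real data it would only be approximately minimized by the best centroids and weights.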

Details can be found in the following papers:
[1] Keriven N., Bourrier A., Gribonval R., Pérez P., "Sketching for Large-Scale Learning of Mixture Models", ICASSP 2016.
[2] Keriven N., Bourrier A., Gribonval R., Pérez P., "Sketching for Large-Scale Learning of Mixture Models", 2016. arXiv:1606.02838 (extended version).
[3] Keriven N., Tremblay N., Traonmilin Y., Gribonval R., "Compressive K-means", ICASSP 2017.
[4] Gribonval R., Blanchard G., Keriven N., Traonmilin Y., "Compressive Statistical Learning with Random Feature Moments", 2017. arXiv:1706.07180.


Download:

Download v1.0 here.


Contact:

nicolas.keriven[at]gmail[dot]com