ICTEAM > ELEN > ISPGroup > ISPS > Calendar > ABS151117


Title: "Compressive learning (e.g., clustering) from a (quantized) sketch of the dataset"

Speaker: Vincent Schellekens

Location: "Shannon" Seminar Room, Place du Levant 3, Maxwell Building, 1st floor

Date / Time (duration): Wednesday 15/11/2017, 11h00 (~45')


Abstract: Machine learning algorithms, such as the k-means clustering, typically require several passes on a dataset of learning examples (e.g., signals, images, data volumes). These must thus be all acquired, stored in memory, and read multiple times, which becomes prohibitive when the number of examples becomes very large. On the other hand, the machine learning model learned from this data (e.g. the centroids in k-means clustering) is usually simple and contains relatively few information compared to the size of the dataset.

Inspired by the field of Compressed Sensing, recent work has proposed a new technique to reduce the size of a very large dataset by summarizing it into a single object called the sketch that still contains enough information about this dataset for the target machine learning application. In practice, the proposed sketch computes the empirical mean of the complex exponential of random projections of the learning examples, in other words, the sketching operator is actually a random sampling of the characteristic function of the dataset (empirical) distribution.

This approach allows to perform machine learning tasks from a heavily compressed representation of the dataset, but all the signals in the dataset must still be acquired at full resolution to obtain their contribution to the sketch, and are then discarded: this is a waste of acquisition ressources. In the hope of designing a sensor that is capable of acquiring only the sketch of a signal and not the signal itself, this work considers the possibility to define a new sketching operator closer to a hardware implementation, that replaces the complex exponential with a 1-bit universal quantization function (i.e., a square wave), with as application a focus on the k-means clustering task. This modification of the sketch operator presents new algorithmic challenges and has an important impact on the meaning of the sketching operator.

Last updated November 15, 2017, at 04:28 PM