Natural data, as found in biological signals or images, is usually highly redundant and noisy. Classical models of the stochasticity in such processes break down in many of these cases.
Figure: The signals of two EMG electrodes attached to an arm, plotted against each other while the arm was at rest. Note how the numerous outliers argue against a Gaussian assumption.
For example, due to the presence of edges in images, image gradients follow heavy-tailed distributions. Likewise, EMG signals are clearly highly non-Gaussian.
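To make the heavy-tail claim concrete, here is a small sketch using only synthetic data and NumPy: a piecewise-constant image with a single sharp edge produces gradients that are mostly near zero with a few large edge responses, so their excess kurtosis is far above that of a Gaussian sample.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for a natural image: flat regions separated by a
# sharp edge, plus mild sensor noise (illustrative, not real data).
img = np.zeros((64, 64))
img[:, 32:] = 1.0                       # one vertical edge
img += 0.01 * rng.standard_normal(img.shape)

grad = np.diff(img, axis=1).ravel()     # horizontal gradients

def excess_kurtosis(x):
    """Sample excess kurtosis; 0 for a Gaussian, large for heavy tails."""
    x = x - x.mean()
    return (x ** 4).mean() / (x ** 2).mean() ** 2 - 3.0

print(excess_kurtosis(grad))                              # heavy-tailed
print(excess_kurtosis(rng.standard_normal(grad.size)))    # near zero
```

The gradient distribution concentrates mass at zero (flat patches) while the edge contributes rare large values, which is exactly the fat-tail behaviour described above.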
In machine learning, we investigate methods for finding useful representations of natural data.
For this, we use parametric models that employ non-linear mappings. These are combined into deep and recurrent architectures, which are then optimized with both classical and novel optimization techniques on a wide variety of objectives.
The objectives typically encourage the representations to fulfill some numerical criterion: sparsity, independence, clustering of similar items or the ability to reconstruct the input.
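As an illustration, several of these criteria can be written as simple penalty terms. The feature matrix `H` and reconstruction `X_hat` below are hypothetical stand-ins built from random projections, not a model from the text; the independence proxy via decorrelation is likewise only one common choice.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data, features, and reconstructions (samples x dimensions).
X = rng.standard_normal((100, 8))
H = np.tanh(X @ rng.standard_normal((8, 4)))   # feature matrix
X_hat = H @ rng.standard_normal((4, 8))        # reconstruction of X

# Sparsity: most feature activations should be close to zero.
sparsity = np.abs(H).mean()

# Reconstruction: the features should retain the input's information.
reconstruction = ((X - X_hat) ** 2).mean()

# Independence proxy: off-diagonal feature correlations should vanish.
C = np.corrcoef(H, rowvar=False)
independence = np.abs(C - np.diag(np.diag(C))).mean()

print(sparsity, reconstruction, independence)
```

A training objective would typically combine one or more of these terms with weighting coefficients and minimize the sum over the model parameters.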
The models we use include, but are not limited to, Independent Component Analysis, Restricted Boltzmann Machines, Autoencoders, Deep Belief Networks and Recurrent Neural Networks.
Figure: A commonly used architecture for unsupervised feature extraction. Each input vector (blue) is mapped to a feature vector (white) via a linear mapping followed by an element-wise non-linearity.
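The mapping described in the caption, a linear transform followed by an element-wise non-linearity, might be sketched as follows. The weights here are random placeholders rather than learned parameters, and the logistic sigmoid is just one possible choice of non-linearity.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(x, W, b):
    # Linear mapping W x + b followed by an element-wise sigmoid,
    # producing a feature vector with entries in (0, 1).
    return 1.0 / (1.0 + np.exp(-(W @ x + b)))

x = rng.standard_normal(6)           # input vector
W = rng.standard_normal((3, 6))      # weights (random placeholders)
b = np.zeros(3)                      # biases

features = encode(x, W, b)
print(features)
```

In practice `W` and `b` would be fitted by optimizing one of the objectives above, and several such layers can be stacked to form a deep architecture.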