

What is Sparse Decomposition and What are its Fundamental Goals?

What is sparse decomposition?

Sparse decomposition is a technique used in signal processing and machine learning in which a signal or data sample is represented as a linear combination of a few basis elements (also known as atoms) drawn from an overcomplete dictionary. The central idea is to find the sparsest representation, that is, one in which most of the coefficients are zero or very close to zero. This helps with denoising, compression, feature extraction, and anomaly detection. Sparse decomposition methods are frequently posed as optimisation problems with L1 regularisation (e.g., the Lasso). Applications include image processing, audio source separation, and compressed sensing, where only a limited number of measurements are available but the underlying signal is sparse.
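As a concrete illustration, the following minimal Python sketch decomposes a synthetic signal over a random overcomplete dictionary using L1-regularised regression (the Lasso). The dictionary, signal sizes, and regularisation weight are illustrative assumptions, not a prescription.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)

# Overcomplete dictionary: more atoms (columns) than signal dimensions.
n_features, n_atoms = 64, 256
D = rng.standard_normal((n_features, n_atoms))
D /= np.linalg.norm(D, axis=0)                 # unit-norm atoms

# Synthetic signal built from only 5 active atoms plus a little noise.
true_code = np.zeros(n_atoms)
true_code[rng.choice(n_atoms, size=5, replace=False)] = rng.standard_normal(5)
x = D @ true_code + 0.01 * rng.standard_normal(n_features)

# Sparse decomposition: min_z 0.5*||x - D z||^2 + alpha*||z||_1 (Lasso).
lasso = Lasso(alpha=0.01, max_iter=10_000)
lasso.fit(D, x)
z = lasso.coef_

print("non-zero coefficients:", int(np.sum(np.abs(z) > 1e-6)), "of", n_atoms)
print("reconstruction error:", float(np.linalg.norm(x - D @ z)))
```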

Fundamental Goals and Idea of sparse decomposition

The main objective of sparse decomposition in this setting is the unsupervised construction of hierarchical feature sets from data such as photographs. It aims to build robust, practical representations of high-dimensional, redundant input that capture the important modes of variation and stable structures.

At every level of the hierarchy, sparse decomposition promotes a parsimonious (sparse) representation, which allows simple features to assemble naturally into increasingly intricate structures. This is essential to prevent the model from learning trivial solutions, such as simply mapping the input to itself (the identity function), particularly when the number of units in a layer does not strictly decrease from one layer to the next.


Mechanism and Architecture of sparse decomposition

One popular approach to sparse decomposition is to represent an input signal (such as an image) as a linear sum of convolutions of latent feature maps with learnt filters.

To encourage sparsity, a regularisation term is placed on the latent feature maps. This usually takes the form of a p-norm ||z||_p on the vectorised feature maps in the cost function, where p is typically 1, although other values can be used. This sparsity constraint is part of what yields a unique solution to the otherwise under-determined system.
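The sketch below writes out this cost function for a toy image: the reconstruction is a sum of 2-D convolutions of feature maps with filters, plus a p-norm penalty on the feature maps (p = 1 by default). Shapes, boundary handling, and the penalty weight are illustrative assumptions.

```python
import numpy as np
from scipy.signal import convolve2d

def sparse_decomposition_cost(image, feature_maps, filters, lam=0.1, p=1):
    """0.5 * || sum_k f_k * z_k - image ||_2^2  +  lam * sum_k ||z_k||_p^p"""
    reconstruction = np.zeros_like(image, dtype=float)
    for z_k, f_k in zip(feature_maps, filters):
        reconstruction += convolve2d(z_k, f_k, mode="same")   # f_k * z_k
    data_term = 0.5 * np.sum((reconstruction - image) ** 2)
    sparsity_term = lam * sum(np.sum(np.abs(z_k) ** p) for z_k in feature_maps)
    return data_term + sparsity_term

# Toy usage: one 32x32 image, four feature maps, 7x7 filters.
rng = np.random.default_rng(0)
image = rng.standard_normal((32, 32))
filters = [rng.standard_normal((7, 7)) for _ in range(4)]
feature_maps = [np.zeros((32, 32)) for _ in range(4)]
print(sparse_decomposition_cost(image, feature_maps, filters))
```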

The learnt filters can represent mid-level visual concepts such as curves, parallel lines, edge junctions, and basic geometric elements. The sparser features found in higher layers of a hierarchy also typically reduce reconstruction error.

Hierarchical Composition: When sparse decomposition models are stacked to form a hierarchy, the feature maps from a lower layer serve as the input to the layer above. This enables the model to capture more intricate structures that appear in the image at a larger scale. In contrast to some other hierarchical models, frameworks such as Deconvolutional Networks (DNs) do not necessarily perform pooling, sub-sampling, or divisive normalisation between layers, although these operations could be included.
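A minimal, runnable sketch of this layer-wise composition follows. The crude "inference" used here (correlate the input with each filter, then soft-threshold) is only a stand-in for the optimisation procedure discussed in the later sections; the layer sizes, filter shapes, and threshold are likewise assumptions.

```python
import numpy as np
from scipy.signal import convolve2d

def soft_threshold(a, t):
    return np.sign(a) * np.maximum(np.abs(a) - t, 0.0)

def infer_feature_maps(x, filters, lam=5.0):
    # Crude one-step "inference": correlate the input with each filter,
    # then soft-threshold to zero out weak responses (sparsity).
    return [soft_threshold(convolve2d(x, f[::-1, ::-1], mode="same"), lam)
            for f in filters]

rng = np.random.default_rng(0)
image = rng.standard_normal((32, 32))
layer_filters = [
    [rng.standard_normal((7, 7)) for _ in range(4)],   # layer 1 filters
    [rng.standard_normal((5, 5)) for _ in range(6)],   # layer 2 filters
]

# Greedy layer-wise composition: each layer decomposes the maps below it,
# with no pooling, sub-sampling, or normalisation between layers.
inputs = [image]
for filters in layer_filters:
    inputs = [m for x in inputs for m in infer_feature_maps(x, filters)]
print("feature maps at the top layer:", len(inputs))
```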

Connection to Other Models

Sparse Auto-Encoders: These models use sparsity as a regularisation strategy on their latent feature maps to ensure that useful representations are learnt; they are a straightforward application of sparse decomposition principles, as sketched below.
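For illustration, here is a minimal sparse auto-encoder sketch (written in PyTorch as one possible choice): an encoder/decoder pair trained with reconstruction error plus an L1 penalty on the latent code. The layer sizes, penalty weight, and optimiser settings are assumptions for the example, not a reference implementation.

```python
import torch
import torch.nn as nn

class SparseAutoEncoder(nn.Module):
    def __init__(self, n_inputs=64, n_hidden=256):
        super().__init__()
        self.encoder = nn.Linear(n_inputs, n_hidden)   # overcomplete latent code
        self.decoder = nn.Linear(n_hidden, n_inputs)

    def forward(self, x):
        z = torch.relu(self.encoder(x))
        return self.decoder(z), z

model = SparseAutoEncoder()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.randn(128, 64)                               # a toy batch of inputs

for step in range(200):
    x_hat, z = model(x)
    # Reconstruction error plus an L1 sparsity penalty on the latent code.
    loss = ((x_hat - x) ** 2).mean() + 1e-3 * z.abs().mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```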
One particular framework that applies unsupervised sparse decomposition to hierarchical image representations is the deconvolutional network (DN).

Deconvolutional Networks (DNs): A major way DNs differ from models such as Deep Belief Networks (DBNs) and sparse auto-encoders is the absence of an encoder. Rather than relying on an encoder to approximately infer the latent representation, DNs use efficient optimisation techniques to solve the inference problem (finding the feature map activations) directly. In contrast to approximate encoder-based approaches, the goal is to compute the features exactly, in the hope of producing better features.

A crucial distinction also separates DNs from other sparse decomposition techniques, such as those of Mairal et al. (often associated with sparse coding algorithms): DNs operate over whole images rather than over independent patches, as described next.

DNs differ from Convolutional Restricted Boltzmann Machines (RBMs) in that they employ a decoder-only model rather than the symmetric encoder-decoder structure of RBMs, and RBMs also have the additional restriction that their encoder and decoder must share weights. DNs carry out sparse decomposition over the entire image at once, whereas patch-based techniques learn sparse over-complete decompositions of image patches. Given that patch-based representations can be unstable and impede the learning of higher-order filters, this distinction is thought to be essential for learning rich features.

Another encoder-decoder design that has been influenced by sparse coding theories is predictive sparse decomposition (PSD).

Learning and Optimization of sparse decomposition

During training, the cost function is minimised alternately over the feature maps (inference) and over the filters (learning).
The minimisation over the feature maps is convex but frequently poorly conditioned, which makes converging to a good solution computationally difficult, particularly in the presence of an L1 sparsity term.
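The following toy sketch shows the alternating scheme in a simplified, non-convolutional setting: with the dictionary (playing the role of the filters) fixed, the codes are updated by ISTA-style proximal gradient steps on the convex L1 problem; with the codes fixed, the dictionary is updated by a gradient step and re-normalised. Step sizes, iteration counts, and the sparsity weight are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_features, n_atoms, n_samples, lam = 64, 128, 500, 0.1
X = rng.standard_normal((n_features, n_samples))        # data as columns
D = rng.standard_normal((n_features, n_atoms))
D /= np.linalg.norm(D, axis=0)                          # unit-norm atoms
Z = np.zeros((n_atoms, n_samples))                      # sparse codes

def soft_threshold(a, t):
    return np.sign(a) * np.maximum(np.abs(a) - t, 0.0)

for outer in range(10):
    # Inference: minimise over the codes Z with D fixed (ISTA steps).
    step = 1.0 / np.linalg.norm(D, 2) ** 2
    for _ in range(50):
        Z = soft_threshold(Z - step * D.T @ (D @ Z - X), step * lam)
    # Learning: gradient step on the reconstruction error over D with Z fixed.
    D -= 0.1 * (D @ Z - X) @ Z.T / n_samples
    D /= np.maximum(np.linalg.norm(D, axis=0), 1e-8)    # keep atoms unit-norm
    cost = 0.5 * np.sum((D @ Z - X) ** 2) + lam * np.sum(np.abs(Z))
    print(f"outer {outer}: cost = {cost:.2f}")
```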

Several optimisation techniques have been investigated, but each has drawbacks: iterative reweighted least squares is very slow, stochastic gradient descent requires many iterations, and direct gradient descent tends to flat-line.

To overcome these challenges, a “continuation method” can be used, which introduces auxiliary variables and brings in the sparsity constraint progressively for better numerical stability. This technique works with a variety of sparsity norms, including pseudo-norms with p < 1.
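A schematic sketch of such a continuation scheme, again in the simplified non-convolutional setting of the previous example, follows: an auxiliary variable w is coupled to the code z with a weight beta, the w-update becomes an element-wise shrinkage, the z-update becomes a well-conditioned least-squares problem, and beta is increased progressively so the sparsity term is enforced ever more strictly. The beta schedule and the p = 1 shrinkage are assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n_features, n_atoms, lam = 64, 128, 0.1
D = rng.standard_normal((n_features, n_atoms))
D /= np.linalg.norm(D, axis=0)
x = rng.standard_normal(n_features)

# Minimise 0.5*||x - D z||^2 + lam*||w||_1 + (beta/2)*||z - w||^2,
# alternating over w and z while increasing the coupling beta.
z = np.zeros(n_atoms)
beta = 0.1
for outer in range(8):
    for _ in range(10):
        # w-update: separable, closed-form soft threshold for p = 1.
        w = np.sign(z) * np.maximum(np.abs(z) - lam / beta, 0.0)
        # z-update: smooth, well-conditioned least-squares problem.
        z = np.linalg.solve(D.T @ D + beta * np.eye(n_atoms),
                            D.T @ x + beta * w)
    beta *= 4.0                       # continuation: tighten the coupling
print("active coefficients:", int(np.sum(np.abs(z) > 1e-3)))
```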

Uses of sparse decomposition

Deconvolutional Networks and other sparse decomposition frameworks are employed for a number of tasks, such as object recognition and image denoising.
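As one example of the denoising use case, the sketch below learns a patch-based dictionary with scikit-learn and sparse-codes noisy patches with orthogonal matching pursuit; note that, unlike DNs, this operates on independent patches. The synthetic image, patch size, dictionary size, and sparsity level are illustrative choices.

```python
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning
from sklearn.feature_extraction.image import (extract_patches_2d,
                                              reconstruct_from_patches_2d)

rng = np.random.default_rng(0)
t = np.linspace(0, 3 * np.pi, 64)
clean = np.outer(np.sin(t), np.cos(t))             # smooth synthetic image
noisy = clean + 0.1 * rng.standard_normal(clean.shape)

# Learn an overcomplete dictionary from 8x8 patches of the noisy image.
patch_size = (8, 8)
patches = extract_patches_2d(noisy, patch_size)
flat = patches.reshape(len(patches), -1)
mean = flat.mean(axis=1, keepdims=True)
flat -= mean                                        # remove per-patch DC offset

dico = MiniBatchDictionaryLearning(n_components=128, alpha=1.0, batch_size=256,
                                   transform_algorithm="omp",
                                   transform_n_nonzero_coefs=2, random_state=0)
dico.fit(flat)

# Sparse-code each patch, reconstruct it, and stitch the patches back together.
code = dico.transform(flat)
denoised_patches = (code @ dico.components_ + mean).reshape(patches.shape)
denoised = reconstruct_from_patches_2d(denoised_patches, noisy.shape)
print("MSE before:", float(np.mean((noisy - clean) ** 2)),
      "after:", float(np.mean((denoised - clean) ** 2)))
```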

Using the learnt representations to initialise deep neural networks also improves their classification performance.
