What Are Generative Stochastic Networks? Advantages and Disadvantages of GSNs

What are Generative Stochastic Networks?

Generative Stochastic Networks (GSNs) are a type of generative model introduced by Yoshua Bengio and colleagues in 2014. GSNs are designed to learn a data distribution and generate new samples from it, similar in spirit to autoencoders and energy-based models.

A Generative Stochastic Network (GSN) is a framework for training generative machines to draw samples from a desired distribution. A key advantage of this approach is that such machines can be trained by back-propagation.

Mechanism and Purpose of Generative Stochastic Networks

Generative Stochastic Networks define a parameterised Markov chain: they learn the parameters of a machine that performs a single step of a generative Markov chain.

  • They can be thought of as defining this parameterised Markov chain and as extending generalised denoising auto-encoders.
  • The objective is to implicitly define a probability distribution from which samples can be drawn.
  • Generative Stochastic Networks are designed to be trainable by back-propagation, which puts them in line with popular deep learning methods (a minimal code sketch of one chain step follows this list).
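
The sketch below is a minimal illustration of one such Markov chain step, written in PyTorch. The architecture, noise level, and layer sizes are illustrative assumptions rather than the configuration from the original paper: the step corrupts the current state and then denoises it with a learned network.

```python
import torch
import torch.nn as nn

class GSNStep(nn.Module):
    """One corrupt-and-denoise step of a GSN-style Markov chain (illustrative)."""

    def __init__(self, dim, hidden, noise_std=0.5):
        super().__init__()
        self.noise_std = noise_std
        self.encoder = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU())
        self.decoder = nn.Linear(hidden, dim)

    def forward(self, x):
        # Corruption C(x_tilde | x): additive Gaussian noise, one possible choice.
        x_tilde = x + self.noise_std * torch.randn_like(x)
        # Reconstruction P(x | x_tilde): parameterised by the network, so the
        # whole step is trainable end-to-end by back-propagation.
        h = self.encoder(x_tilde)
        return torch.sigmoid(self.decoder(h))
```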

Connection to Autoencoders

Generative Stochastic Networks are closely related to noisy (denoising) auto-encoders: they train the transition operator of a Markov chain that is used to sample from the data distribution.

Denoising autoencoders are motivated by learning representations that are robust to small, irrelevant changes in the input. They can be thought of as learning the manifold on which the data lies, and their goal is to reverse a corruption process. They can also be viewed as part of a semi-parametric model from which samples can be drawn, and training a denoising autoencoder is equivalent to maximising a variational bound on a particular generative model.
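
Concretely, the training signal that GSNs inherit from denoising autoencoders looks like an ordinary reconstruction loop: corrupt a data sample, reconstruct the original, and back-propagate the reconstruction loss. The sketch below reuses the hypothetical GSNStep class from the earlier example, assumes a data_loader yielding batches of vectors in [0, 1], and omits the walkback procedure used in the GSN literature.

```python
import torch
import torch.nn.functional as F

step = GSNStep(dim=784, hidden=256)            # transition operator from the earlier sketch
opt = torch.optim.Adam(step.parameters(), lr=1e-3)

for x in data_loader:                          # assumed: batches of vectors in [0, 1]
    x_hat = step(x)                            # one corruption + reconstruction step
    loss = F.binary_cross_entropy(x_hat, x)    # reconstruction loss; no partition function involved
    opt.zero_grad()
    loss.backward()
    opt.step()
```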

Comparison to Other Models

Generative Adversarial Networks (GANs): Unlike Generative Stochastic Networks, the adversarial framework does not require a Markov chain for sampling. In addition, because GANs have no feedback loops during generation, they can better exploit piecewise linear units, which improve back-propagation but can cause problems with unbounded activation when used inside a feedback loop.

Directed Graphical Models with Latent Variables (such as sigmoid belief networks, or DBNs, whose lower layers are directed): maximum likelihood estimation in these models frequently requires intractable probabilistic computations, which GSNs seek to avoid. In contrast to many RBM-based models, GSNs are trained with back-propagation.

Undirected Graphical Models (such as RBMs and DBMs): these models frequently have intractable partition functions, and their gradients are estimated with Markov Chain Monte Carlo (MCMC) methods, which can suffer from poor mixing. GSNs, by contrast, are trained with back-propagation and do not rely on MCMC-based gradient estimates, although they still use a Markov chain to draw samples.

Other Generative Models: Besides Generative Stochastic Networks, recent examples of back-propagating into a generative machine during training include stochastic back-propagation and Auto-Encoding Variational Bayes (AEVB). AEVB, for example, uses stochastic gradient methods for efficient inference and learning in directed probabilistic models with continuous latent variables and intractable posterior distributions.

Advantages of Generative Stochastic Networks

No Need to Compute Partition Function

GSNs are simpler and quicker to train than energy-based models (like RBMs) since they do not have to deal with the intractable partition function.
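
For context, the intractability comes from the normalising constant of an energy-based model, written here in its standard form (this notation is generic, not taken from the GSN paper):

```latex
p_\theta(x) = \frac{e^{-E_\theta(x)}}{Z_\theta},
\qquad
Z_\theta = \sum_{x'} e^{-E_\theta(x')}
```

The sum over all configurations x' is what makes the normaliser intractable in high dimensions; a GSN only ever trains a transition operator by reconstruction, so this quantity never has to be computed or approximated.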

Extends Denoising Autoencoders
By extending denoising autoencoders into full generative models, GSNs can sample fresh data in addition to reconstructing it (a sampling sketch follows).
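
A minimal sampling sketch, assuming a GSNStep trained as in the earlier examples: start from an arbitrary state and iterate the learned step. The number of steps and the starting distribution are illustrative choices, and for simplicity the reconstruction mean is carried forward rather than sampling from the reconstruction distribution.

```python
import torch

def sample(step, dim=784, n_steps=100):
    x = torch.rand(1, dim)             # arbitrary starting state
    with torch.no_grad():
        for _ in range(n_steps):       # each iteration is one corrupt-and-denoise step
            x = step(x)
    return x
```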

Theoretical Guarantees
Under certain conditions, the Markov chain defined by a GSN converges to the data distribution, which gives the model provable consistency.

Modular and Flexible

GSNs support a modular design, so different architectures (such as CNNs and deep neural networks) can be plugged in as the denoising component, as sketched below.
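
As a sketch of that modularity (a hypothetical wrapper, not an API from any GSN library), the chain step only needs some differentiable denoiser, so an MLP, a CNN, or a deeper network can be swapped in without changing the rest of the model:

```python
import torch
import torch.nn as nn

class ModularGSNStep(nn.Module):
    def __init__(self, denoiser, noise_std=0.5):
        super().__init__()
        self.denoiser = denoiser      # any nn.Module mapping a corrupted input to a reconstruction
        self.noise_std = noise_std

    def forward(self, x):
        x_tilde = x + self.noise_std * torch.randn_like(x)
        return self.denoiser(x_tilde)

# Example: plug in a small MLP denoiser for 784-dimensional inputs.
mlp = nn.Sequential(nn.Linear(784, 256), nn.ReLU(),
                    nn.Linear(256, 784), nn.Sigmoid())
step = ModularGSNStep(mlp)
```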

Learning Without Explicit Likelihoods
Rather than computing log-likelihoods, GSNs learn by sampling and reconstruction, which is helpful when likelihoods are difficult to define or compute.

Disadvantages of Generative Stochastic Networks

Training is Still Challenging

Although GSNs are easier to train than RBMs or DBNs, the network architecture and the noise model still need to be designed carefully.

Slow Sampling
Because they rely on a Markov chain that may need many steps to mix properly, GSNs generate samples more slowly than VAEs or GANs.

Limited Tooling and Adoption
GSNs are less widely used than GANs and VAEs, which means fewer libraries, fewer worked examples, and less community support.

Corruption Process Sensitivity
Choosing an effective corruption (noise) process is essential to the model's success; poor choices can slow convergence or degrade performance. Two common options are sketched below.
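
To make the design choice concrete, here are two common corruption processes (function names and parameter values are illustrative); swapping one for the other can noticeably change how well the chain mixes and how well the model fits the data:

```python
import torch

def gaussian_corrupt(x, std=0.5):
    # Additive Gaussian noise.
    return x + std * torch.randn_like(x)

def salt_and_pepper_corrupt(x, p=0.3):
    # Force a random fraction p of the entries to 0 or 1.
    mask = torch.rand_like(x) < p
    noise = (torch.rand_like(x) > 0.5).float()
    return torch.where(mask, noise, x)
```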

Harder to Scale to High Dimensions
Like many early generative models, GSNs can struggle with complex or high-dimensional data, particularly compared with more recent approaches such as transformers or diffusion models.
