What are Radial Basis Function Networks?

The Radial Basis Function Network (RBFN) is an important kind of artificial neural network used in many machine learning applications. RBFNs work especially well for classification problems, time series prediction, and function approximation.

What are Radial Basis Function Networks?

A radial basis function network is an artificial neural network that uses radial basis functions (RBFs) as activation functions. The network output is a linear combination of radial basis functions of the inputs and the neuron parameters. An RBFN is a particular kind of feed-forward neural network.

Their distinctive three-layer architecture, universal approximation capability, and faster learning set them apart from other neural networks. RBFNs are well known for their capacity to approximate intricate, nonlinear mappings from inputs to outputs.

History

Broomhead and Lowe introduced the Radial Basis Function Network (RBFN) in 1988. Some RBFN software products base their algorithm on the methodology put forward by Moody and Darken in 1989.

How RBFNs Work

The idea behind Radial Basis Function (RBF) networks is that an item’s predicted target value is influenced by surrounding items with comparable predictor variable values. Although they differ in how they are implemented, they share fundamental similarities with K-Nearest Neighbour (k-NN) models.

The operation of RBF Networks is broken down as follows:

  • Input Vector: An n-dimensional input vector that must be classified or used for regression is first sent to the network.
  • RBF Neurons (Hidden Layer): Each RBF neuron in the network’s hidden layer stores a prototype vector taken from the training set. The network calculates the Euclidean distance between the input vector and each neuron’s center.
  • Activation Function: A radial basis function, usually a Gaussian, is applied to this computed Euclidean distance to give the neuron’s activation value. A key feature of this activation value is that it drops exponentially as the distance increases.
  • Output Nodes: Lastly, each output node in the network computes a score as a weighted sum of the activation values of all the RBF neurons. For classification tasks, the category with the highest score is selected.

Take a dataset with two-dimensional data points from two distinct classes as an example. Each neuron in an RBF network trained with 20 neurons would represent a prototype in the input space. 3-D mesh or contour plots can be used to visualize the category scores that the network calculates, as in the sketch below. To define these categories, neurons belonging to a given category contribute with positive weights, whereas neurons from the other category contribute with negative weights. The scores can then be evaluated on a grid of points to map the network’s decision boundary.
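As a minimal NumPy sketch of this scoring scheme (the centers, spreads, and ±1 weights below are made up for illustration; a trained network would obtain them from data), each hidden neuron responds with a Gaussian of its distance to the input, and the output node sums those responses with class-dependent signs:

```python
import numpy as np

def rbf_scores(X, centers, sigmas, weights):
    """Category scores for each row of X.

    X        : (m, 2) input points
    centers  : (k, 2) RBF neuron centers (prototypes)
    sigmas   : (k,)   spread of each Gaussian
    weights  : (k,)   positive for one class, negative for the other
    """
    # Euclidean distance from every input to every center: shape (m, k)
    dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=-1)
    # Gaussian activation falls off exponentially with distance
    activations = np.exp(-dists**2 / (2 * sigmas**2))
    # Output node: weighted sum of all RBF neuron activations
    return activations @ weights

# Toy setup: 4 prototypes, two per class, on a 2-D plane (illustrative values)
centers = np.array([[0.0, 0.0], [1.0, 0.0],     # class A prototypes
                    [0.0, 3.0], [1.0, 3.0]])    # class B prototypes
sigmas  = np.full(4, 0.8)
weights = np.array([1.0, 1.0, -1.0, -1.0])      # +1 for class A, -1 for class B

# Score a grid of points; the sign of the score gives the predicted class,
# and the zero-level contour traces the decision boundary.
xx, yy = np.meshgrid(np.linspace(-2, 3, 50), np.linspace(-2, 5, 50))
grid = np.column_stack([xx.ravel(), yy.ravel()])
scores = rbf_scores(grid, centers, sigmas, weights).reshape(xx.shape)
print(scores.shape)  # (50, 50) -- ready for a contour or 3-D mesh plot
```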


RBF Network Architecture


Three separate layers make up an RBFN, and each has a specialized function:

  1. Input Layer
  • Function: Receives the input as a vector of real numbers x ∈ ℝⁿ and passes it straight to the hidden layer.
  • Components: One neuron per predictor variable. In this layer, a node’s output is identical to its input.
  2. Hidden Layer
  • Function: The “heart” of the RBFN; it transforms the input space into a higher-dimensional space in a non-linear fashion. Its nodes are also called “pattern units”.
  • Components: Consists of N (or L) neurons, each associated with a radial basis function.
  • Procedure: A hidden neuron determines the Euclidean distance Iₖ between the input vector x and its stored center vector cₖ. This distance is then transformed by a transfer function, usually a Gaussian, to give the neuron’s output vₖ = exp(-Iₖ² / (2σₖ²)), where σₖ denotes the width or spread of the Gaussian.
  3. Output Layer
  • Function: Combines the hidden layer’s outputs to produce the final network output.
  • Components: Linear neurons. The output (φ(x), yₖ, or γ) is the weighted sum of the hidden layer activations.
  • Procedure: The output yⱼ is a weighted sum of the hidden layer outputs vₖ with weights wₖⱼ. This may be followed by another transfer function (such as a sigmoid for classification). For regression problems, the result is a linear combination of the hidden layer values.
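Putting the three layers together (this is simply a restatement that combines the per-layer formulas above, not an additional result), the end-to-end mapping from an input x to the j-th output can be written as yⱼ(x) = Σₖ wₖⱼ · vₖ = Σₖ wₖⱼ · exp(-‖x − cₖ‖² / (2σₖ²)), where the sum runs over all hidden neurons and Iₖ = ‖x − cₖ‖ is the Euclidean distance computed by the k-th hidden neuron.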


Implementing the RBFN

There are usually two primary phases to training an RBFN:

  1. Finding the Radial Basis Functions’ Centers (cᵢ or μ) and Spread Parameters (σ or ξ):
  • This stage is frequently completed using unsupervised learning.
  • The RBFs’ centers are determined by identifying cluster centers in the input space using clustering techniques such as the K-means algorithm. The adaptive K-means clustering algorithm iteratively finds a suitable collection of L center vectors that minimizes the sum of squared distances between the training vectors and their closest centers.
  • The Gaussian function’s width (σₖ) can be determined using the P-nearest neighbour heuristic: it is computed as the root-mean-square (RMS) distance between a cluster center and the P centers closest to it.
  • This first stage, commonly referred to as the data-clustering phase, is when the input-to-hidden-layer weights, which stand in for the centers, are established independently of the desired output.
  2. Finding the Weights (aᵢ or wⱼ) of the Output Layer Connections:
  • A supervised learning technique is used to learn the weights that connect the hidden layer to the output layer after the centers and spreads have been fixed.
  • The linear least squares approach is frequently used for this. For regression problems, the solution can be found in a single matrix operation.
  • Once the K-means clustering process is finished, these output weights can likewise be trained using standard backpropagation training algorithms.
  • One common approach for determining the weights is the pseudoinverse method (see the sketch below).

One important aspect of RBFN training is that only the cluster center nearest to the current training vector is altered during the data-clustering stage. The 1989 study by Moody and Darken was titled “Fast Learning in Networks of Locally-Tuned Processing Units” because of the significant speedup in network training caused by this localized tuning.
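Assuming scikit-learn is available for the clustering step and NumPy for the linear algebra, a rough sketch of this two-phase procedure might look as follows. Note that standard batch K-means stands in here for the adaptive variant described above, and the function names, number of centers, and choice of P are illustrative, not taken from Moody and Darken:

```python
import numpy as np
from sklearn.cluster import KMeans

def train_rbfn(X, y, n_centers=10, p_neighbors=2):
    """Two-phase RBFN training sketch.

    Phase 1 (unsupervised): K-means picks the centers; each spread is the
    RMS distance from a center to its P nearest fellow centers.
    Phase 2 (supervised): output weights come from linear least squares,
    computed with the pseudoinverse of the hidden-layer activations.
    """
    # --- Phase 1: centers and spreads ---
    km = KMeans(n_clusters=n_centers, n_init=10).fit(X)
    centers = km.cluster_centers_                        # (L, n)

    # Distances between centers, used for the P-nearest-neighbour spreads
    cdist = np.linalg.norm(centers[:, None] - centers[None, :], axis=-1)
    np.fill_diagonal(cdist, np.inf)
    nearest = np.sort(cdist, axis=1)[:, :p_neighbors]    # P closest centers
    sigmas = np.sqrt(np.mean(nearest**2, axis=1))        # RMS separation

    # --- Phase 2: output weights via the pseudoinverse ---
    dists = np.linalg.norm(X[:, None] - centers[None, :], axis=-1)  # (m, L)
    H = np.exp(-dists**2 / (2 * sigmas**2))               # hidden activations
    weights = np.linalg.pinv(H) @ y                        # least-squares fit

    return centers, sigmas, weights

def predict_rbfn(X, centers, sigmas, weights):
    dists = np.linalg.norm(X[:, None] - centers[None, :], axis=-1)
    return np.exp(-dists**2 / (2 * sigmas**2)) @ weights
```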

Features

Activation Function

The activation function of RBFNs is a radial basis function (RBF), most frequently the Gaussian function.

Three-Layer Architecture

The standard RBFN architecture consists of an input layer, a hidden layer, and an output layer.

Single Hidden Layer

Unlike backpropagation networks (MLPs), which can have numerous hidden layers with sigmoid or S-shaped activation functions, an RBFN has only a single hidden layer with RBF activation.

Local Representation

Because RBF networks employ local basis functions, they are distinguished by local representation. The distance between an input and a prototype (center) determines the value of an RBF function. Hidden neurons will not be strongly activated if an input is far from all prototypes. This characteristic makes it possible for RBFNs to identify novel circumstances, something that traditional multilayer networks cannot do.
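A small illustrative sketch of this local behaviour (the prototypes and spreads are toy values, not from the article): when an input lies far from every prototype, the strongest hidden activation is close to zero, which can be used to flag the input as novel.

```python
import numpy as np

def max_activation(x, centers, sigmas):
    """Strongest RBF neuron response for a single input x."""
    dists = np.linalg.norm(centers - x, axis=1)
    return np.max(np.exp(-dists**2 / (2 * sigmas**2)))

centers = np.array([[0.0, 0.0], [1.0, 1.0]])   # prototypes learned from data
sigmas = np.array([0.5, 0.5])

print(max_activation(np.array([0.1, 0.1]), centers, sigmas))  # near a prototype -> close to 1
print(max_activation(np.array([5.0, 5.0]), centers, sigmas))  # far from all prototypes -> close to 0
```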

Dimensionality

The network’s input dimensionality equals the number of predictor variables.

Center and Radius/Spread

Every RBF neuron is characterized by a center (cᵢ or μ) and a radius or spread (σ). The spread determines how far each neuron’s influence extends over the input space.

Training Paradigm

RBFNs are usually trained with a supervised training algorithm overall, but the training process frequently includes two steps: unsupervised learning for the centers and spreads, followed by supervised learning for the output weights.

Advantages

RBFNs have a number of benefits that make them appropriate for a range of uses:

  • Universal Approximation: Given enough hidden neurons, RBFNs can approximate any continuous function to any desired level of accuracy.
  • Fast Training/Faster Learning Speed: When compared to other neural networks, particularly multi-layer perceptron (MLP) networks, RBFNs typically train more quickly. Part of the reason for this is that a quadratic error surface with a single, easily accessible minimum is produced by just adjusting the linear mapping from the hidden layer to the output layer.
  • Easier Design/Simpler Structure: The uncomplicated three-layer topology of RBF Networks makes them easier to implement and understand.
  • Good Generalisation: They generalise well to unseen data.
  • Low Extrapolation Errors: Radial-basis functions are frequently more dependable and typically have relatively low extrapolation errors.
  • No Local Minima Problem (for weights): RBF networks usually do not experience local minima problems during the linear mapping from hidden to output layer in the same manner as MLPs.
  • Interpretability: Because the radial basis functions of RBFNs contain distinct centers and spreads that characterize localized responses, their structure may be easier to understand than that of other neural networks.
  • High Input Noise Tolerance: They can tolerate high levels of input noise.


Drawbacks and Challenges

Notwithstanding their benefits, RBFNs have several drawbacks:

  • Computational Resource Consumption: One of the primary drawbacks of RBFNs is their significant computational resource consumption when working with a large number of training samples.
  • Input Space Coverage Requirement: One drawback of RBF networks is that they need radial basis functions to provide adequate input space coverage. Their centers are chosen based on the distribution of the input data rather than the prediction goal, which may result in the waste of representational resources on unrelated locations.
  • Centre Selection and RBF Count: Network performance can be greatly impacted by the selection of centers and the quantity of radial basis functions. Overfitting or underfitting may result from poor selection. Assigning a center to every data point is a popular option; nevertheless, this might result in a massive linear system, necessitating the use of shrinkage techniques to prevent overfitting.
  • Fixed Spread (σ): It can be challenging to determine the radial basis functions’ ideal spread (σ). If the spread is too large, the network may not capture the data’s complexity; if it is too small, the network may be excessively sensitive to noise.
  • Scalability: As the number of inputs or centres rises, RBFNs may become computationally costly.
  • Performance in Classification vs. Prediction: Because of their low false-fault prediction rate and clearly defined fault boundaries, RBFNs are excellent at classification, but they perform poorly in prediction tasks. They are unable to properly interpolate within unknown regions or extrapolate into untrained regions because their activation functions (such as the Gaussian) fall to zero away from the cluster centers. SVMs often perform better than RBF networks in the majority of classification applications. In regression, RBF networks can be competitive when the input space has comparatively low dimensionality.

RBF Network Applications

Numerous fields have seen the beneficial application of RBFNs:

  • Function Approximation: RBFNs make it easier to approximate complex non-linear functions.
  • Pattern Recognition and Classification: RBFNs are utilized for pattern recognition and data classification, including image and speech recognition, because of their classification capabilities. They are able to recognize novel circumstances.
  • Time Series Prediction: Because RBFNs are able to identify temporal dependencies in data, they can be used for modelling dynamic relationships, financial market forecasting, and weather prediction. Predicting solar radiation is one example.
  • System Control: RBFNs are used for system control in control systems.
  • Regression: For prediction tasks, these networks are capable of modelling intricate relationships in data.

Comparison with Backpropagation Networks and Multi-Layer Perceptrons (MLPs)

Number of Hidden Layers

RBF networks normally feature only one hidden layer, whereas MLPs can have one or more hidden layers.

Activation Functions

While backpropagation networks usually employ sigmoid or S-shaped activation functions, RBF networks use radial basis functions (such as the Gaussian) in their hidden layer.

Training Speed

In general, RBF networks learn more quickly than MLPs. This is because, unlike MLPs, which employ iterative backpropagation for all layers, RBFNs allow the connections to the hidden layer (the centers) to be determined without the desired output, and training the output weights is a linear problem that can be solved more quickly.

Local Minima

Because RBF networks use a linear mapping with a quadratic error surface, they do not experience local minima during the weight adjustment process to the output layer, unlike multilayer perceptrons.

Extrapolation/Interpolation

Because of their more uniform output surfaces, MLPs (with sigmoid or hyperbolic tangent activations) are typically superior for prediction tasks requiring extrapolation or interpolation. RBFNs, with their localized responses, are less appropriate for these tasks because they tend to predict zero outside the trained regions.

Classification Performance

When a representative training data set is available, RBF networks routinely perform better in classification than backpropagation networks because they have clearly defined fault boundaries and hardly ever predict spurious faults.
