- Introduction to Artificial Neural Network
- Types of Neural Networks
This blog is tailored to help you understand the commonly used types of neural networks, how they work, and their industry applications. It begins with a brief introduction to how neural networks work. We have tried to keep it simple yet effective.
An Introduction to Artificial Neural Network
Neural networks are the foundation of deep learning, a branch of artificial intelligence. Certain application scenarios are too heavy or out of scope for traditional machine learning algorithms to handle. Neural networks step in for such scenarios and fill the gap.
Artificial neural networks are inspired by the biological neurons within the human body, which activate under certain circumstances and cause the body to perform a related action in response. Artificial neural nets consist of various layers of interconnected artificial neurons powered by activation functions, which help in switching them ON/OFF. As with traditional machine learning algorithms, there are certain values that neural nets learn in the training phase.
Briefly, each neuron multiplies its inputs by weights (which start out as random values), adds a bias value, and passes the sum to an appropriate activation function, which decides the final value given out by the neuron. Various activation functions are available, depending on the nature of the input values. Once the output is generated by the final layer, a loss function (predicted output vs. expected output) is calculated and backpropagation is performed, in which the weights are adjusted to minimize the loss. Finding optimal values for the weights is what the overall operation revolves around. The key terms are:
Weights are numeric values which are multiplied with inputs. In backpropagation, they are modified to reduce the loss. In simple words, weights are the values a neural network learns. They self-adjust depending on the difference between the predicted outputs and the expected outputs in the training data.
Activation Function is a mathematical formula that decides, from the neuron's weighted input, whether and how strongly the neuron switches ON.
- Input layer represents dimensions of the input vector.
- Hidden layer represents the intermediary nodes that divide the input space into regions with (soft) boundaries. It takes in a set of weighted input and produces output through an activation function.
- Output layer represents the output of the neural network.
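As a minimal sketch of the description above, the following shows a single neuron with a sigmoid activation and one weight-update rule derived from a squared-error loss. The function names, the choice of sigmoid, the starting weights, and the learning rate are all assumptions made for illustration, not a prescribed implementation.

```python
import math

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron_output(inputs, weights, bias):
    """Weighted sum of the inputs plus the bias, passed through the activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)

def loss(inputs, target, weights, bias):
    """Squared-error loss between the predicted and the expected output."""
    return 0.5 * (neuron_output(inputs, weights, bias) - target) ** 2

def train_step(inputs, target, weights, bias, learning_rate=0.5):
    """One gradient-descent update of the weights and the bias."""
    y = neuron_output(inputs, weights, bias)
    # Chain rule: dL/dz = (y - target) * sigmoid'(z), where sigmoid'(z) = y * (1 - y)
    delta = (y - target) * y * (1.0 - y)
    new_weights = [w - learning_rate * delta * x for w, x in zip(weights, inputs)]
    new_bias = bias - learning_rate * delta
    return new_weights, new_bias

inputs, target = [1.0, 0.5], 1.0
weights, bias = [0.1, -0.2], 0.0  # arbitrary starting values (assumption)

before = loss(inputs, target, weights, bias)
for _ in range(20):
    weights, bias = train_step(inputs, target, weights, bias)
after = loss(inputs, target, weights, bias)
print(before, after)  # the loss shrinks as the weights adjust
```

Real networks repeat this update across many neurons and many training samples, but the idea is the same: compare the prediction with the expected output and nudge each weight in the direction that reduces the loss.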
Types of Neural Networks
Many types of neural networks are available, and more are in development. They can be classified by their structure, data flow, the neurons used and their density, the layers and their depth, activation filters, etc.
We are going to discuss the following neural networks:
A. Perceptron
The Perceptron model, introduced by Frank Rosenblatt and later analyzed in depth by Minsky and Papert, is one of the simplest and oldest models of a neuron. It is the smallest unit of a neural network and performs computations to detect features in the input data. It accepts weighted inputs and applies the activation function to obtain the final output. The perceptron is also known as a TLU (threshold logic unit).
The perceptron is a supervised learning algorithm that classifies data into two categories, so it is a binary classifier. A perceptron separates the input space into two categories by a hyperplane, which can be written as w · x + b = 0, where w is the weight vector, x is the input vector and b is the bias.
Advantages of Perceptron
Perceptrons can implement Logic Gates like AND, OR, or NAND
Disadvantages of Perceptron
Perceptrons can only learn linearly separable problems, such as the Boolean AND problem. They do not work for problems that are not linearly separable, such as the Boolean XOR problem.
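To make the advantage and the limitation concrete, here is a small sketch of the classic perceptron learning rule applied to AND and to XOR. The helper names, the zero starting weights, the learning rate and the number of epochs are illustrative assumptions.

```python
def predict(weights, bias, x):
    """Threshold logic unit: output 1 if the weighted sum is non-negative, else 0."""
    total = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if total >= 0 else 0

def train_perceptron(samples, epochs=20, learning_rate=1.0):
    """Perceptron learning rule: shift the weights by the error on each sample."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, target in samples:
            error = target - predict(weights, bias, x)
            weights = [w + learning_rate * error * xi for w, xi in zip(weights, x)]
            bias += learning_rate * error
    return weights, bias

def accuracy(samples, weights, bias):
    """Fraction of samples the trained perceptron classifies correctly."""
    hits = sum(predict(weights, bias, x) == target for x, target in samples)
    return hits / len(samples)

AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

and_accuracy = accuracy(AND, *train_perceptron(AND))
xor_accuracy = accuracy(XOR, *train_perceptron(XOR))
print(and_accuracy, xor_accuracy)
```

AND is learned perfectly because a single straight line separates its classes. No such line exists for XOR, so the perceptron never classifies all four XOR cases correctly, however long it trains.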
B. Feed Forward Neural Networks
Applications of Feed Forward Neural Networks:
- Simple classification (where traditional Machine-learning based classification algorithms have limitations)
- Face recognition [Simple straight forward image processing]
- Computer vision [Where target classes are difficult to classify]
- Speech Recognition
This is the simplest form of neural network, where input data travels in one direction only, passing through artificial neural nodes and exiting through output nodes. Input and output layers are always present; hidden layers may or may not be. Based on this, these networks can be further classified as single-layered or multi-layered feed-forward neural networks.
The number of layers depends on the complexity of the function. In the simple form described here, propagation is uni-directional: there is forward propagation but no backward propagation, and the weights are static. Inputs are multiplied by weights and fed to an activation function, for which a classifying (step) activation function is used. For example, the neuron is activated if the weighted sum is above a threshold (usually 0), in which case it produces 1 as output. If the sum is below the threshold, the neuron is not activated and its output is taken as -1. These networks are fairly simple to maintain and are equipped to deal with data that contains a lot of noise.
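The forward-only data flow with a step activation can be sketched as follows. The layer layout and the fixed weights are hypothetical values chosen by hand for this example so that the two-layer network computes XOR on 0/1 inputs.

```python
def step(z, threshold=0.0):
    """Step activation from the text: 1 above the threshold, otherwise -1."""
    return 1 if z > threshold else -1

def layer_forward(inputs, weight_rows, biases):
    """Each row of weights belongs to one neuron in the layer."""
    return [step(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weight_rows, biases)]

def feed_forward(inputs, layers):
    """Pass the signal through each layer in turn; nothing flows backward."""
    signal = inputs
    for weight_rows, biases in layers:
        signal = layer_forward(signal, weight_rows, biases)
    return signal

# Hypothetical fixed weights: a hidden layer of two neurons, then one output neuron.
layers = [
    ([[1.0, 1.0], [-1.0, -1.0]], [-0.5, 1.5]),  # hidden: "any input on", "not both on"
    ([[1.0, 1.0]], [-1.5]),                      # output: fires only if both hidden fire
]

for x in ([0, 0], [0, 1], [1, 0], [1, 1]):
    print(x, feed_forward(x, layers))
```

With these weights the output is 1 when exactly one input is on and -1 otherwise, which a single-layer network cannot represent.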
Advantages of Feed Forward Neural Networks
- Less complex, easy to design & maintain
- Fast and speedy [One-way propagation]
- Cope well with noisy data
Disadvantages of Feed Forward Neural Networks:
- Cannot be used for deep learning [due to absence of dense layers and back propagation]
C. Multilayer Perceptron
Applications of Multi-Layer Perceptron
- Speech Recognition
- Machine Translation
- Complex Classification
The multilayer perceptron is an entry point towards complex neural nets, where input data travels through various layers of artificial neurons. Every node is connected to all neurons in the next layer, which makes it a fully connected neural network. Input and output layers are present along with one or more hidden layers, i.e. at least three layers in total. It has bi-directional propagation, i.e. forward propagation and backward propagation.
Inputs are multiplied with weights and fed to the activation function, and in backpropagation the weights are modified to reduce the loss. In simple words, weights are the values a neural network learns. They self-adjust depending on the difference between the predicted outputs and the expected outputs in the training data. Nonlinear activation functions are used in the hidden layers, followed by softmax as the output layer activation function.
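A rough sketch of forward and backward propagation in a tiny fully connected network is shown below. It assumes a tanh hidden layer, a softmax output, a cross-entropy loss, hand-picked starting weights and a single training sample. All of these are illustrative choices rather than details given in this article.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def forward(x, W1, b1, W2, b2):
    """Input -> tanh hidden layer -> softmax output layer."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    logits = [sum(w * h for w, h in zip(row, hidden)) + b
              for row, b in zip(W2, b2)]
    return hidden, softmax(logits)

def backprop_step(x, target, W1, b1, W2, b2, lr=0.1):
    """One backpropagation update (in place); returns the loss before the update."""
    hidden, probs = forward(x, W1, b1, W2, b2)
    loss = -math.log(probs[target])
    # For softmax with cross-entropy, the gradient at the logits is (probs - one_hot)
    d_logits = [p - (1.0 if k == target else 0.0) for k, p in enumerate(probs)]
    # Push the error back through W2 and the tanh derivative (1 - h^2)
    d_hidden = [(1.0 - h * h) * sum(d_logits[k] * W2[k][j] for k in range(len(W2)))
                for j, h in enumerate(hidden)]
    for k, row in enumerate(W2):
        for j in range(len(row)):
            row[j] -= lr * d_logits[k] * hidden[j]
        b2[k] -= lr * d_logits[k]
    for j, row in enumerate(W1):
        for i in range(len(row)):
            row[i] -= lr * d_hidden[j] * x[i]
        b1[j] -= lr * d_hidden[j]
    return loss

# Hypothetical starting weights: 2 inputs, 3 hidden neurons, 2 output classes
W1 = [[0.2, -0.4], [0.7, 0.1], [-0.3, 0.5]]
b1 = [0.0, 0.0, 0.0]
W2 = [[0.3, -0.2, 0.1], [-0.1, 0.4, 0.2]]
b2 = [0.0, 0.0]

x, target = [1.0, 0.0], 1
losses = [backprop_step(x, target, W1, b1, W2, b2) for _ in range(25)]
print(losses[0], losses[-1])
```

The recorded loss falls from the first step to the last, which is the self-adjusting behaviour of the weights described above. A practical MLP would loop over a full dataset and usually rely on a library for the gradient computation.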
Advantages of Multi-Layer Perceptron
- Used for deep learning [due to the presence of dense fully connected layers and back propagation]
Disadvantages of Multi-Layer Perceptron:
- Comparatively complex to design and maintain
- Comparatively slow (depends on number of hidden layers)
D. Convolutional Neural Network
Applications of Convolutional Neural Network
- Image processing
- Computer Vision
- Speech Recognition
- Machine translation
A convolutional neural network contains a three-dimensional arrangement of neurons instead of the standard two-dimensional array. The first layer is called a convolutional layer. Each neuron in the convolutional layer only processes the information from a small part of the visual field. Input features are taken in patch by patch, like a filter sliding over the image. The network understands the image in parts and can repeat these operations multiple times to complete the full image processing. Processing may involve converting the image from RGB or HSI scale to grey-scale. Changes in pixel value then help to detect edges, and images can be classified into different categories.
Propagation is uni-directional in the part of the CNN that consists of one or more convolutional layers followed by pooling, and bi-directional where the output of the convolutional layers goes to a fully connected neural network that classifies the images, as shown in the diagram above. Filters are used to extract certain parts of the image. In the MLP part, the inputs are multiplied with weights and fed to the activation function. The convolutional layers use ReLU, while the MLP uses a nonlinear activation function followed by softmax. Convolutional neural networks show very effective results in image and video recognition, semantic parsing and paraphrase detection.
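The filter, ReLU and pooling steps can be sketched on a tiny grey-scale image as below. The 5x5 image, the 2x2 vertical-edge kernel and the helper names are assumptions made for this example. As in most deep learning libraries, the "convolution" here slides the kernel without flipping it.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1, kernel not flipped)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]

def relu(feature_map):
    """Replace negative responses with 0."""
    return [[max(0, v) for v in row] for row in feature_map]

def max_pool(feature_map, size=2):
    """Keep the largest value in each non-overlapping size x size window."""
    return [[max(feature_map[r + i][c + j]
                 for i in range(size) for j in range(size))
             for c in range(0, len(feature_map[0]) - size + 1, size)]
            for r in range(0, len(feature_map) - size + 1, size)]

# 5x5 grey-scale image: dark (0) on the left, bright (1) on the right
image = [[0, 0, 1, 1, 1] for _ in range(5)]

# Hypothetical filter that responds where brightness increases from left to right
kernel = [[-1, 1],
          [-1, 1]]

feature_map = relu(convolve2d(image, kernel))
pooled = max_pool(feature_map)
print(feature_map)  # → [[0, 2, 0, 0], [0, 2, 0, 0], [0, 2, 0, 0], [0, 2, 0, 0]]
print(pooled)       # → [[2, 0], [2, 0]]
```

The filter responds only in the column where the dark region meets the bright one, and pooling keeps that response while shrinking the map. The same four kernel weights are reused at every position, which is why a convolutional layer has far fewer parameters than a fully connected one.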
Advantages of Convolution Neural Network:
- Used for deep learning with few parameters
- Fewer parameters to learn as compared to a fully connected layer
Disadvantages of Convolution Neural Network:
- Comparatively complex to design and maintain
- Comparatively slow [depends on the number of hidden layers]
E. Radial Basis Function Neural Networks
A Radial Basis Function Network consists of an input vector followed by a layer of RBF neurons and an output layer with one node per category. Classification is performed by measuring the input's similarity to data points from the training set. Each neuron stores a prototype, which is one of the examples from the training set.
When a new input vector [the n-dimensional vector that you are trying to classify] needs to be classified, each neuron calculates the Euclidean distance between the input and its prototype. For example, if we have two classes, class A and class B, and the new input is closer to the class A prototypes than to the class B prototypes, it is tagged or classified as class A.
Each RBF neuron compares the input vector to its prototype and outputs a value from 0 to 1 that is a measure of similarity. When the input equals the prototype, the output of that RBF neuron is 1, and as the distance between the input and the prototype grows, the response falls off exponentially towards 0. The curve of the neuron's response has the shape of a typical bell curve. The output layer consists of a set of neurons [one per category].
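A minimal sketch of this similarity measure, assuming a Gaussian response exp(-beta * distance^2), is given below. The `beta` parameter, the prototype coordinates, and the simplification of giving every RBF neuron an output weight of 1 are assumptions. A trained network would learn those output weights.

```python
import math

def rbf_activation(x, prototype, beta=1.0):
    """Bell-shaped response: 1 when x equals the prototype, falling toward 0 with distance."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, prototype))
    return math.exp(-beta * squared_distance)

def classify(x, prototypes_by_class, beta=1.0):
    """Score each class by summing the responses of its RBF neurons (all output weights = 1)."""
    scores = {label: sum(rbf_activation(x, p, beta) for p in prototypes)
              for label, prototypes in prototypes_by_class.items()}
    return max(scores, key=scores.get), scores

# Hypothetical prototypes taken from a training set with two classes
prototypes = {
    "A": [(0.0, 0.0), (1.0, 0.0)],
    "B": [(5.0, 5.0), (6.0, 5.0)],
}

label, scores = classify((0.5, 0.2), prototypes)
print(label, scores)
```

Here the input lies close to the class A prototypes, so the class A neurons respond strongly while the class B neurons respond with values near 0.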
Application: Power Restoration
a. Powercut P1 needs to be restored first
b. Powercut P3 needs to be restored next, as it impacts more houses
c. Powercut P2 should be fixed last as it impacts only one house
F. Recurrent Neural Networks
Applications of Recurrent Neural Networks
- Text processing like auto suggest, grammar checks, etc.
- Text to speech processing
- Image tagger
- Sentiment Analysis
A Recurrent Neural Network is designed to save the output of a layer and feed it back to the input to help in predicting the outcome of the layer. The first layer is typically a feed-forward neural network, followed by a recurrent neural network layer in which a memory function remembers some of the information it had in the previous time-step. Forward propagation is implemented in this case, and the network stores the information it needs for future use. If the prediction is wrong, the learning rate is used to make small changes during backpropagation, so that the network gradually moves towards making the right prediction.
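The feedback of the previous time-step can be sketched with the common update h_t = tanh(W_x x_t + W_h h_(t-1) + b). The formula, the weight values and the function names below are assumptions for illustration. This article does not specify a particular recurrence.

```python
import math

def rnn_step(x, h_prev, W_x, W_h, b):
    """New hidden state mixes the current input with the previous hidden state."""
    return [math.tanh(sum(wx * xi for wx, xi in zip(W_x[j], x))
                      + sum(wh * hi for wh, hi in zip(W_h[j], h_prev))
                      + b[j])
            for j in range(len(b))]

def run_rnn(sequence, W_x, W_h, b):
    """Process a sequence one time-step at a time, carrying the state forward."""
    h = [0.0] * len(b)
    states = []
    for x in sequence:
        h = rnn_step(x, h, W_x, W_h, b)
        states.append(h)
    return states

# Hypothetical weights: 1 input feature, 2 hidden units
W_x = [[0.5], [-0.3]]
W_h = [[0.8, 0.1], [0.2, 0.6]]
b = [0.0, 0.0]

# Both sequences end with the same input but have different histories
final_a = run_rnn([[1.0], [0.0]], W_x, W_h, b)[-1]
final_b = run_rnn([[-1.0], [0.0]], W_x, W_h, b)[-1]
print(final_a, final_b)
```

Although the last input is identical in both runs, the final hidden states differ, because each state still carries information from the earlier time-step.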
Advantages of Recurrent Neural Networks
- Can model sequential data, where each sample can be assumed to depend on the ones before it.
- Can be used with convolutional layers to extend the effective pixel neighborhood.
Disadvantages of Recurrent Neural Networks
- Gradient vanishing and exploding problems
- Training recurrent neural nets could be a difficult task
- Difficult to process long sequential data using ReLU as an activation function.
Improvement over RNN: LSTM (Long Short-Term Memory) Networks
LSTM networks are a type of RNN that uses special units in addition to standard units. LSTM units include a 'memory cell' that can maintain information in memory for long periods of time. A set of gates is used to control when information enters the memory, when it is output, and when it is forgotten. There are three types of gates: the input gate, the output gate and the forget gate. The input gate decides how much information from the last sample will be kept in memory, the output gate regulates the amount of data passed to the next layer, and the forget gate controls the rate at which stored memory is discarded. This architecture lets LSTMs learn longer-term dependencies.
This is one of the implementations of LSTM cells; many other architectures exist.
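The interplay of the three gates can be sketched for a single LSTM unit with scalar state, using the widely used formulation c_t = f * c_(t-1) + i * g and h_t = o * tanh(c_t). The parameter layout (a dictionary of hypothetical `(w_x, w_h, bias)` tuples) and the specific values are assumptions chosen to show a memory cell holding its content.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, params):
    """One time-step of a single-unit LSTM. params maps a gate name to (w_x, w_h, bias)."""
    def pre_activation(name):
        w_x, w_h, bias = params[name]
        return w_x * x + w_h * h_prev + bias

    forget = sigmoid(pre_activation("forget"))          # share of old memory to keep
    write = sigmoid(pre_activation("input"))            # share of new information to store
    expose = sigmoid(pre_activation("output"))          # share of memory passed onward
    candidate = math.tanh(pre_activation("candidate"))  # proposed new content

    c = forget * c_prev + write * candidate
    h = expose * math.tanh(c)
    return h, c

# Hypothetical parameters: forget gate held open, input gate held nearly shut
params = {
    "forget": (0.0, 0.0, 10.0),
    "input": (0.0, 0.0, -10.0),
    "output": (0.5, 0.5, 0.0),
    "candidate": (1.0, 1.0, 0.0),
}

h, c = 0.0, 1.0  # start with a value already stored in the memory cell
for x in [0.3, -0.7, 0.9] * 10 + [0.0] * 20:  # 50 time-steps of unrelated input
    h, c = lstm_step(x, h, c, params)
print(c)  # still close to the stored value of 1.0
```

With the forget gate close to 1 and the input gate close to 0, the cell keeps almost all of its content across 50 steps, which is the long-term memory behaviour a plain RNN struggles to achieve.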
G. Sequence to sequence models
A sequence to sequence model consists of two Recurrent Neural Networks: an encoder that processes the input and a decoder that produces the output. The encoder and decoder work together, using either the same parameters or different ones. In contrast to a plain RNN, this model is particularly applicable in cases where the length of the input data is not equal to the length of the output data. While these models share the benefits and limitations of the RNN, they are mainly applied in chatbots, machine translation, and question answering systems.
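The encoder and decoder roles can be sketched with two very small scalar RNNs. The weights and function names are hypothetical, and the number of output steps is simply passed in. Real systems typically let the decoder run until it emits an end-of-sequence token.

```python
import math

def encode(sequence, w_in=0.5, w_rec=0.8):
    """Encoder RNN: fold the whole input sequence into one context value."""
    h = 0.0
    for x in sequence:
        h = math.tanh(w_in * x + w_rec * h)
    return h

def decode(context, steps, w_rec=0.9, w_out=2.0):
    """Decoder RNN: start from the context and emit one output per step."""
    h, outputs = context, []
    for _ in range(steps):
        h = math.tanh(w_rec * h)
        outputs.append(w_out * h)
    return outputs

source = [1.0, 2.0, 3.0, 4.0, 5.0]  # five input steps
context = encode(source)
translated = decode(context, steps=3)  # three output steps
print(len(source), len(translated))
```

Because the decoder only sees the context produced by the encoder, the output sequence does not have to be the same length as the input sequence.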
H. Modular Neural Network
Applications of Modular Neural Network
- Stock market prediction systems
- Adaptive MNN for character recognition
- Compression of high level input data
A modular neural network has a number of different networks that function independently and perform sub-tasks. The different networks do not really interact with or signal each other during the computation process. They work independently towards achieving the output.
As a result, a large and complex computational process can be carried out significantly faster by breaking it down into independent components. The computation speed increases because the networks do not interact with, and are not even connected to, each other.
Advantages of Modular Neural Network
- Independent training
Disadvantages of Modular Neural Network
- Moving target Problems