Artificial Neural Network (ANN)

This paper focuses on a data-driven hydrological model, the Artificial Neural Network (ANN). The performance of ANN models depends on their architecture, i.e., the pattern of connections between nodes, the weights assigned to each connection, the number of layers, the activation function, and the learning rate. (Source: Flood forecasting using Artificial Neural Network, UNESCO-IHE, Pichaid Varoonchotikul, August 2003).

1.2 Purpose: Hydrologists have tried many models to predict floods accurately, but because of uncertain changes the problem of inaccurate forecasts remains. A newer method, the data-driven approach, is therefore used for flood prediction. The main purpose of this paper is to give an overview of the ANN technique in hydrological modelling, specifically the prediction of water level for flood forecasting using historical data. Since river water fluctuation is highly non-linear in nature, an MLP (Multi-Layer Perceptron) based ANN model is built to solve this non-linear problem, using a numerical computing environment or programming language such as Python, MATLAB, or R.

1.3 Scope: This report focuses on the understanding of data-driven Artificial Neural Networks and their applications. It then presents the ANN architecture model built using a historical dataset. Further, it describes the approach to forecasting floods in a river or lake by predicting the next day's water level. It covers the mathematical methodology required to run the ANN and the assessment of its performance.

1. Biological Neural Network: Artificial Neural Networks in deep learning are modelled on the structure and function of the human brain. The human brain consists of many neurons that process and transmit information among themselves.

Figure 1: Representation of a biological neural network. (Source: https://www.quora.com/How-do-I-learn-neural-networks) A typical cell consists of four major segments: dendrites, cell nucleus, axon, and synapse. The cell nucleus receives signals in the form of electrical pulses from the dendrites (tree-like nerve fibres; a neuron normally has on the order of 10^3 to 10^4 dendrites). These dendrites connect to the dendrites of other neurons. The cell nucleus processes the received signals and produces an output signal, which is transmitted via the axon to the synapses, the interfaces between the axon and the input dendrites of another neuron. (Source: Flood forecasting using Artificial Neural Network, UNESCO-IHE, Pichaid Varoonchotikul, page 11).

2. ANN Architecture: The term neural network derives from the work of Warren S. McCulloch and Walter Pitts. These networks consist of artificial neurons, called nodes, that process information and perform operations. A neural network has three layers:

Figure 2: A typical ANN node. (Source: Flood forecasting using Artificial Neural Network, UNESCO-IHE, Pichaid Varoonchotikul). Input layer: this layer takes large volumes of input data in the form of text, numbers, audio files, image pixels, etc. Hidden layers: the hidden layers perform the mathematical operations, pattern analysis, feature extraction, etc. There can be multiple hidden layers in a neural network. Output layer: the output layer is usually a single node that receives several inputs from different hidden nodes and generates the desired output. (Source: http://www.umsl.edu/~piccininig/First_Computational_Theory_of_Mind_and_Brain.pdf)
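The behaviour of a single node described above can be sketched in a few lines of Python; the inputs, weights, and bias here are illustrative values, not taken from the paper's model:

```python
import math

def neuron(inputs, weights, bias):
    """One artificial node: a weighted sum of the inputs plus a bias,
    passed through a sigmoid activation function."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))

# Two signals arriving from an input layer, with arbitrary example weights
out = neuron([0.5, 0.8], [0.4, -0.2], bias=0.1)
print(round(out, 4))
```

The sigmoid squashes the combined signal into the range (0, 1), which is what allows several such nodes to be chained into hidden and output layers.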

3.1 ANN Parameters: An artificial neural network consists of several parameters and hyperparameters that drive the output of a neural network model. Some of these parameters are weights, biases, number of epochs, the learning rate, batch size, number of batches, etc.

a) Weights: Weights are among the most important parameters in an ANN; they represent the strength of the connections between nodes. For example, if Node 1 has a greater weight than Node 2, then Node 2 has less influence than Node 1. A weight of zero means that changing the input will not change the output. (Source: Neural Network Weight – an overview, ScienceDirect Topics, 2020)

b) Epochs: The number of epochs is the number of times the algorithm passes over the training dataset. It depends on the type of dataset and the task at hand. Note that if there are fewer epochs than required, the fitted curve will be under-fitted, and if there are too many, it will be over-fitted. What we need is the number of epochs that gives the best-fit curve. (Source: Epoch vs Batch Size vs Iterations, 2019)

c) Learning Rate: The learning rate controls how much the weight values are updated; depending on its value we can manage how quickly or slowly the weights change. The learning rate should be high enough that training does not take too long to converge, but low enough that training can settle into a minimum. (Source: Everything you need to know about Neural Networks, 2017)
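How weights, epochs, and the learning rate interact can be shown with a minimal gradient-descent sketch; the toy data and single-weight model below are assumptions made purely for illustration:

```python
# Fitting a single weight w in y = w*x by gradient descent,
# showing how epochs and the learning rate drive the weight updates.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]   # toy data with true relationship y = 2x

w = 0.0                # initial weight
learning_rate = 0.05
epochs = 200           # number of passes over the dataset

for _ in range(epochs):
    # gradient of the mean squared error 0.5*(w*x - y)^2 with respect to w
    grad = sum((w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
    w -= learning_rate * grad   # update scaled by the learning rate

print(round(w, 3))  # converges toward 2.0
```

A larger learning rate would reach 2.0 in fewer epochs but risks overshooting; a smaller one would need many more epochs, which is exactly the trade-off described above.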

d) Activation Function: Activation functions decide which nodes fire. Various activation functions are used for specific purposes depending on the type of output required; some examples are the Sigmoid, Step (Threshold), ReLU (Rectifier), SoftMax, and Hyperbolic Tangent functions. A node takes input signals from many different nodes and, depending on their nature and intensity, generally performs two functions: first it combines all the signals, and second it passes the combined signal to the activation function, which generates an output from it.

Overall, we can say that the activation function defines the form of the output. The following are common types of activation function. (Source: Generalized MLP Architectures of Neural Networks, Bekir Karlik & A Vehbi Olgac).

• Linear Function

• Heaviside Function

• Sigmoid Function
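The three functions listed above can be sketched directly in Python (a minimal illustration, not tied to any particular network in this paper):

```python
import math

def linear(x):
    return x                      # identity: output proportional to input

def heaviside(x):
    return 1.0 if x >= 0 else 0.0 # step/threshold: fires or does not fire

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))  # squashes any input into (0, 1)

print(linear(0.5), heaviside(-2.0), round(sigmoid(0.0), 2))
```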

Study Area: Montreal is situated near the St. Lawrence River, which links the Atlantic Ocean with the Great Lakes, making it one of the world's most important commercial routes. (Source: Hydrological Analysis of the Historical May 2017 Flooding Event in Montreal and Surrounding Areas)

Feed-forward Back-Propagation

Method: Back-propagation is a supervised learning algorithm used for training neural networks, and it is the algorithm used to train the network in this paper. It is one of the most widely used algorithms for training feed-forward neural networks in machine learning and deep learning. Back-propagation computes the error and corrects it by changing the weights of the network: it uses gradient methods to train multilayer networks, updating the weights so as to minimize the loss. The algorithm applies the chain rule to compute the gradient of the loss function with respect to each weight, iterating backwards from the output towards the input and computing the gradient one layer at a time to avoid redundant calculations. (Source: https://en.wikipedia.org/wiki/Backpropagation)

Figure: A feed-forward back-propagation neural network. (Source: https://github.com/codewrestling/Backpropagation/blob/master/Backpropagation.pdf) To run a back-propagation neural network, we first initialize the weights; in this case we have weights w1, w2, w3 and w4 (see the figure above). Since it is very unlikely that the initially assigned weights lead directly to the predicted output, there is always an error between the predicted and actual output, i.e., Error = ½ (actual − predicted)². We then use this calculated error to train the network backwards, from the output towards the input, changing the weight associated with each neuron. In this way we obtain the exact value, or one with minimal error. To do this, an approach known as the gradient descent method is used: it updates the weights so as to reduce the error function to as small a value as possible.

Figure: A Gradient Descent graph (Source: https://github.com/codewrestling/Backpropagation/blob/master/Backpropagation.pdf)
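The forward pass, backward error propagation, and gradient-descent updates described above can be sketched on a deliberately tiny network; the architecture (one input, one sigmoid hidden node, one linear output) and all numeric values here are assumptions for illustration, not the paper's MATLAB network:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

x, target = 1.0, 0.5
w1, w2 = 0.8, -0.4          # arbitrarily initialized weights
lr = 0.5                    # learning rate

for _ in range(100):
    # forward pass: input -> hidden -> output
    h = sigmoid(w1 * x)
    y = w2 * h
    # backward pass: chain rule, one layer at a time
    d_y = y - target                  # dE/dy for E = 0.5*(y - target)^2
    d_w2 = d_y * h                    # dE/dw2
    d_h = d_y * w2                    # dE/dh
    d_w1 = d_h * h * (1 - h) * x      # sigmoid derivative is h*(1 - h)
    # gradient-descent updates, reducing the error at each step
    w2 -= lr * d_w2
    w1 -= lr * d_w1

error = 0.5 * (w2 * sigmoid(w1 * x) - target) ** 2
print(round(error, 6))  # error driven toward zero
```

After a hundred such backward sweeps, the weights have been adjusted so that the remaining error between predicted and target output is negligible, which is the behaviour the gradient descent graph depicts.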

Testing of the Network: The following steps were carried out to complete the test. STEP 1: Importing the data. The data were imported into the MATLAB ANN toolbox in three categories: input, output, and test variables. The input data were arranged as a 12×8 matrix, the output as a 12×1 matrix, and the test set as a random 8×6 matrix. Figure: Import of input, output and test data into the MATLAB ANN toolbox. STEP 2: Creating the ANN model. After importing the data into the respective variables, I chose the ANN model below, consisting of 8 input columns, a hidden layer of 10 neurons, and 1 output layer, for the best result.
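The matrix shapes described in STEP 1 can be mimicked in Python; the random placeholder values stand in for the paper's actual hydrological data, which is not reproduced here:

```python
import random

random.seed(0)  # reproducible placeholder values

# 12 rows x 8 input features, 12 rows x 1 target, and an 8 x 6 test block,
# matching the shapes imported into the MATLAB ANN toolbox.
inputs  = [[random.random() for _ in range(8)] for _ in range(12)]  # 12x8
outputs = [[random.random()] for _ in range(12)]                    # 12x1
test    = [[random.random() for _ in range(6)] for _ in range(8)]   # 8x6

print(len(inputs), len(inputs[0]))    # 12 8
print(len(outputs), len(outputs[0]))  # 12 1
print(len(test), len(test[0]))        # 8 6
```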

Figure: ANN model created in MATLAB. STEP 3: Training the network. For training, I set the epoch value to 2000 and the number of iterations to 1000. After training the model, the best fit was obtained at iteration 164, as can be seen from the regression graph below, where the predicted values are linear with respect to the input data. Figure: Regression graph showing the training results

Figure: Training of the neural network and its parameters. RESULTS AND ANALYSIS: The focus of this paper is to demonstrate the use of Artificial Neural Networks in flood prediction through the prediction of water level, applying an ANN model in MATLAB and comparing the observed and forecasted results. After comparing the results, it can be concluded that the feed-forward back-propagation method performed well. Figure: Observed and forecasted mean monthly water level for the year 2018
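A comparison of observed and forecasted water levels is usually scored with a summary error measure such as the root-mean-square error (RMSE); the monthly values below are hypothetical illustrations, not the paper's 2018 data:

```python
import math

# Hypothetical observed vs. forecasted monthly water levels (metres)
observed   = [21.3, 21.9, 22.8, 24.1, 23.6, 22.9]
forecasted = [21.1, 22.0, 22.5, 24.4, 23.8, 22.7]

# RMSE: square the month-by-month differences, average, take the root
rmse = math.sqrt(sum((o - f) ** 2 for o, f in zip(observed, forecasted))
                 / len(observed))
print(round(rmse, 3))
```

A small RMSE relative to the natural variation of the water level indicates that the forecasted series tracks the observed one closely.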

Advantages of Artificial Neural Networks:

• Accurate results: Many real-world input-output relationships are complex and non-linear in nature, and ANNs have given excellent results in solving such non-linear problems. (Source: Maad M. Mijwil, January 2018)

• Self-organization: An ANN imposes no restriction on the number or nature of the input data and has a strong ability to learn hidden relationships in the data. It is widely used in financial time-series forecasting, for example for stock prices, where the volatility of the data is very high. (Source: Maad M. Mijwil, January 2018)

• Fault tolerance: Even if some neurons stop working due to a lack of information, the ANN can still produce output because of its parallel and distributed nature; it can produce output even with one or more cells missing from the network. (Source: Maad M. Mijwil, January 2018)

• Adaptive learning: An ANN learns how to perform tasks from the data given for training or from initial experience, rather than from explicit programming, and can adjust its weights as new data become available.

• Real-time operations: An ANN can adapt to changes in its environment and can learn in real time. (Source: Maad M. Mijwil, January 2018)

• Applications: ANNs are used in various fields such as image processing (e.g., face recognition), speech processing (e.g., speech to text), healthcare/medicine (e.g., recognition of diseases such as brain tumours, computer-aided surgery), and defence (e.g., unmanned aerial vehicles, automated target recognition, etc.). (Source: Introduction to Neural Networks, Advantages and Applications, 2017, from https://towardsdatascience.com/introduction-to-neural-networks-advantages-and-applications-96851bd1a207)

Limitations and Recommendations for Improvement: After analysing various sources, some limitations have been identified and suggestions for future improvement of ANNs are made:

• Unexplained behaviour: An ANN cannot explain the procedure between feeding in the data and obtaining the results. This is one of the main disadvantages of ANNs and makes them appear unreliable.

• Network structure: There is no fixed method for finding the best architecture for a model; it is based on trial and error. This makes developing an ANN time-consuming.

• Mistakes in coding: For a simple task, a coding mistake in the network may not have a big impact, but for longer tasks any mistake in the code leads to errors and forces one to start over. For this reason, I would recommend using ANN tools such as the one in MATLAB.

• Hyperparameter tuning: Getting maximum accuracy from the network depends on the ANN's parameters and hyperparameters. Since choosing accurate parameters and hyperparameters takes much trial and error, I would recommend the following few rules:

1. Learning rate: Reaching the absolute minimum of the error surface is a delicate task, and one should select an optimum learning rate: a very high learning rate may fail to reach the absolute minimum because of overshooting, while searching for it otherwise requires training the network many times, which is time-consuming. Learning rates therefore need to be selected very carefully to make the network effective. I recommend first checking learning rates already used for similar work. (Source: Improving the Performance of a Neural Network, 2018)

2. Network architecture: Selecting the best architecture for any kind of prediction with an ANN is based on several trials, so it is suggested to use a proven architecture, since there is no standard one. (Source: Improving the Performance of a Neural Network, 2018)

3. Activation function: Activation functions introduce non-linearity, transforming the weighted linear input of a node into a non-linear output, and are one of the important choices in developing any network. Functions like sigmoid and tanh were mostly used earlier, but nowadays the rectified linear unit (ReLU) is used most because of its ability to avoid the vanishing-gradient problem (i.e., gradients diminishing during back-propagation as they reach the earliest layers). (Source: Improving the Performance of a Neural Network, 2018)
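The vanishing-gradient effect mentioned above can be sketched numerically: during back-propagation, the gradient is multiplied by each layer's local derivative, and the sigmoid's derivative is at most 0.25. The 10-layer depth and the fixed pre-activation value of 1.0 below are assumptions chosen only to make the contrast with ReLU visible:

```python
import math

def sigmoid_deriv(z):
    s = 1.0 / (1.0 + math.exp(-z))
    return s * (1 - s)            # never exceeds 0.25

def relu_deriv(z):
    return 1.0 if z > 0 else 0.0  # exactly 1 for any positive input

sig_grad, relu_grad = 1.0, 1.0
for _ in range(10):               # chain rule across 10 layers, z = 1.0 each
    sig_grad *= sigmoid_deriv(1.0)
    relu_grad *= relu_deriv(1.0)

# the sigmoid-based gradient shrinks toward zero; the ReLU one survives
print(f"{sig_grad:.2e}", relu_grad)
```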

4. Data management: A neural network typically has to work with a huge amount of data, so good data management is essential to make the network effective. It not only saves time but also produces better results. I would recommend handling the data in a spreadsheet and categorising it into different subcategories. (Source: Improving the Performance of a Neural Network, 2018, from https://towardsdatascience.com/how-to-increase-the-accuracy-of-a-neural-network-9f5d1c6f407d)