Neural Networks

In session #5 of the DataX TTT series, "Neural Networks" was the topic of discussion.

Neural Networks are computational models that mimic the complex functions of the human brain.
They are the backbone of many AI systems today, helping to solve complex tasks such as image recognition, language processing, and even autonomous driving. With layers of interconnected nodes, they learn from data and improve over time, making them powerful tools for solving real-world problems.

Machine learning inverts classical programming: instead of writing an algorithm that turns data into results, we supply the data and the desired results, and the model learns the algorithm. In short: Data + Results = Algorithms.
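As a minimal sketch of that inversion (all values here are made up for illustration), we can hand NumPy a set of inputs and the results produced by an unknown rule and let it recover the rule:

```python
import numpy as np

# Classical programming: we write the rule ourselves.
def hand_coded_rule(x):
    return 2 * x + 1

# Machine learning: we supply data + results and recover the rule.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.0, 3.0, 5.0, 7.0, 9.0])     # results produced by the unknown rule

slope, intercept = np.polyfit(x, y, deg=1)  # least-squares fit
print(f"learned rule: y = {slope:.2f}x + {intercept:.2f}")  # ~ y = 2.00x + 1.00
```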
 

Neural networks

There are three types of layers defined in a NN. The first is the input layer. The input layer is passive: it does no processing and only holds the data as a feature vector supplied to the first hidden layer. The hidden layers are the one or more layers that do the processing but are neither input nor output layers. Each neuron in the first hidden layer takes all the input attributes, multiplies them by corresponding weights, adds a bias, and transforms the result using a nonlinear function. The weights for a given hidden neuron are randomly initialized, and every neuron in the hidden layer has its own set of weights (Sarker, 2021).

A shallow NN is an architecture that contains a single hidden layer. Training and processing a shallow NN is significantly faster than training a more complex deep NN because there are fewer configurable parameters. Due to their simple structure, however, shallow models are limited in capturing complex features and patterns in datasets, offering lower overall performance on complex tasks with high-dimensional data. Deep NNs, in contrast, have several hidden layers, making them more complex structures. With that complexity comes a greater requirement for computational power, and more time is often spent training the model because there are more parameters. Deep NNs are capable of learning and extracting hierarchical features from the data (Cirrincione et al., 2020).
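To make the weighted-sum, bias, and nonlinearity step concrete, here is a minimal NumPy sketch of one forward pass through a single hidden layer; the layer sizes and the tanh activation are illustrative assumptions, not prescribed by the sources above:

```python
import numpy as np

rng = np.random.default_rng(0)

n_inputs, n_hidden = 4, 3
x = rng.normal(size=n_inputs)            # feature vector held by the passive input layer

# Weights are randomly initialized; every hidden neuron gets its own row of weights.
W = rng.normal(size=(n_hidden, n_inputs))
b = np.zeros(n_hidden)                   # one bias per hidden neuron

z = W @ x + b                            # multiply inputs by weights, add bias
h = np.tanh(z)                           # nonlinear transformation
print(h)                                 # activations passed to the next layer
```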

Artificial neural networks

An artificial neural network (ANN) is an ML algorithm that acts like the human brain, emulating how biological neurons transmit information to each other. ANNs comprise node layers: an input layer, one or more hidden layers, and an output layer. The input layer is passive and does not process data; its purpose is to provide the data to the first hidden layer. Each node connects to other nodes and has an associated weight and a tolerance value for activation. Each hidden layer takes the input, multiplies it by the weights, and adds the bias; the output is then transformed using a nonlinear function. If the output of any individual node is above the specified tolerance, or threshold, value, that node is activated and sends data to the next layer in the network. If the threshold is not met, the data is not passed on (Erl, 2016). Once the ANN model has been trained and tested, it can be deployed to detect and monitor security threats within a system. The output of a classification problem will be binary, corresponding to a yes or no value, while the output of a regression problem will be a real number. The advantage of using an ANN is that it can quickly parse large datasets, leading to faster detection and response times that improve an organization's overall security posture (Sugumaran et al., 2023).
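The threshold behavior described above can be sketched in a few lines of Python; the weights, bias, and threshold values below are hypothetical, chosen only to show a node firing:

```python
import numpy as np

def node_output(inputs, weights, bias, threshold):
    """Fire (pass data on) only if the weighted sum clears the threshold."""
    activation = np.dot(inputs, weights) + bias
    return activation if activation > threshold else None  # None: nothing forwarded

x = np.array([0.5, 0.9, 0.1])      # inputs from the previous layer
w = np.array([0.4, 0.8, -0.2])     # learned connection weights
out = node_output(x, w, bias=0.1, threshold=0.5)
print("fires" if out is not None else "stays silent", out)
```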

An artificial neural network (ANN) is a feed-forward NN, the most basic NN architecture, in which neuron connections do not form a cycle. ANNs are considered the building blocks for most other NNs. The output value of each hidden neuron is sent to each output neuron in the output layer. The output for a classification problem will be Boolean (yes/no), and the output for a regression problem will be a real number. An ANN can be considered a combination of linear and nonlinear equations trained on a dataset to produce the output, learning the underlying relation between the independent variables supplied as input and the dependent variables provided as output. During the training phase, weights are assigned to each connection between neurons; the weights are learnable parameters that are iteratively updated to find their optimal values. ANNs are typically shallow networks containing only a single hidden layer (Sarker, 2021).
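Here is a minimal sketch of those iterative weight updates, assuming a tiny shallow network, sigmoid activations, and plain gradient descent on the XOR relation; none of these choices are prescribed by the sources above:

```python
import numpy as np

rng = np.random.default_rng(1)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

# XOR: a relation between independent variables (X) and a dependent variable (y)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# One hidden layer -> a shallow network; weights are the learnable parameters.
W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)

lr = 1.0
for _ in range(5000):
    # forward pass: linear step, then nonlinear transformation
    h = sigmoid(X @ W1 + b1)
    p = sigmoid(h @ W2 + b2)

    # backward pass: iteratively update the weights toward their optimal values
    dp = (p - y) * p * (1 - p)           # gradient at the output
    dh = (dp @ W2.T) * h * (1 - h)       # gradient at the hidden layer
    W2 -= lr * h.T @ dp;  b2 -= lr * dp.sum(axis=0)
    W1 -= lr * X.T @ dh;  b1 -= lr * dh.sum(axis=0)

print(p.round(2).ravel())  # typically approaches [0, 1, 1, 0]
```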

Convolutional neural network

Convolutional neural networks (CNNs) use a variation of the multilayer perceptron and contain one or more convolutional layers that can be fully connected or pooled. The major advantage of CNNs over ANNs is that their discriminative architecture learns directly from the input without requiring manual feature extraction; this is possible because CNNs are deep networks containing multiple hidden layers. CNNs are specifically intended to deal with 2-dimensional shapes and are frequently used in applications such as visual recognition, medical image analysis, and image segmentation. The convolutional layers create feature maps that record regions of an image, which is ultimately broken into rectangles and sent out for nonlinear processing. Each layer considers the optimal parameters for a meaningful output as well as parsimony (Sarker, 2021).
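A minimal NumPy sketch of the convolution step that builds a feature map; the kernel here is hand-picked for illustration, whereas a real CNN learns its kernel values during training:

```python
import numpy as np

def conv2d(image, kernel):
    """Slide a kernel over a 2-D image and record a feature map (valid padding)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    fmap = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            region = image[i:i + kh, j:j + kw]   # rectangular region of the image
            fmap[i, j] = np.sum(region * kernel)
    return fmap

image = np.arange(25, dtype=float).reshape(5, 5)
edge_kernel = np.array([[1.0, -1.0], [1.0, -1.0]])  # hand-picked edge detector
feature_map = conv2d(image, edge_kernel)            # 4x4 map of detected features
print(feature_map)
```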

Recurrent neural networks

Recurrent neural networks (RNNs) are more complex forms of NNs. RNNs can be combined with reinforcement learning (RL), a learning approach in which an AI agent interacts with its surrounding environment by trial and error and learns an optimal behavioral strategy from the reward signals received through previous interactions. This learning process of the RL agent mimics the human or animal learning approach, and RL has emerged as an efficient technique for solving complicated sequential decision-making tasks (Shakya et al., 2023). In addition to forward propagation, RNNs use backpropagation (through time) to update the network's weights and biases. Each neuron of the model acts as a memory cell, carrying out the computation and implementation of operations. If the network's prediction is incorrect, the RNN self-learns during backpropagation and continues working toward the correct prediction (Sarker, 2021).
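The memory-cell behavior can be sketched as a plain recurrent forward pass; the sizes and random weights below are illustrative, and training via backpropagation through time is omitted:

```python
import numpy as np

rng = np.random.default_rng(2)

n_in, n_hidden = 3, 5
Wx = rng.normal(scale=0.5, size=(n_hidden, n_in))      # input-to-hidden weights
Wh = rng.normal(scale=0.5, size=(n_hidden, n_hidden))  # hidden-to-hidden ("memory")
b = np.zeros(n_hidden)

sequence = rng.normal(size=(4, n_in))  # 4 time steps of 3 features each
h = np.zeros(n_hidden)                 # hidden state carried between steps

for t, x_t in enumerate(sequence):
    # each step mixes the new input with what the cell remembers from before
    h = np.tanh(Wx @ x_t + Wh @ h + b)
    print(f"t={t}: h={h.round(2)}")
```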

Cirrincione, G., Kumar, R. R., Mohammadi, A., Kia, S. H., Barbiero, P., & Ferretti, J. (2020). Shallow versus deep neural networks in gear fault diagnosis. IEEE Transactions on Energy Conversion, 35(3), 1338–1347. https://doi.org/10.1109/TEC.2020.2978155

Sarker, I. H. (2021). Deep Learning: A Comprehensive Overview on Techniques, Taxonomy, Applications and Research Directions. SN Computer Science, 2(6), 420. https://doi.org/10.1007/s42979-021-00815-1

Shakya, A. K., Pillai, G., & Chakrabarty, S. (2023). Reinforcement learning algorithms: A brief survey. Expert Systems with Applications, 231, 120495. https://doi.org/10.1016/j.eswa.2023.120495

Sugumaran, D., Mahaboob John, Y. M., Mary C, J. S., Joshi, K., Manikandan, G., & Jakka, G. (2023). Cyber defence based on artificial intelligence and neural network model in cybersecurity. 2023 Eighth International Conference on Science Technology Engineering and Mathematics (ICONSTEM).