Chapter 4. Neural Networks

So far, we've looked at two of the most well-known methods used for predictive modeling. Linear regression is probably the most typical starting point for problems where the goal is to predict a numerical quantity. The model is based on a linear combination of input features. Logistic regression uses a nonlinear transformation of this linear feature combination in order to restrict the range of the output to the interval [0, 1]. In so doing, it predicts the probability that the output belongs to one of two classes, making it a very well-known technique for classification.
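To make this concrete, the following is a minimal sketch of the two ideas just described: a linear combination of input features, and the logistic (sigmoid) transformation that squashes it into (0, 1). The weights, intercept, and input values here are made-up numbers chosen purely for illustration.

```python
import numpy as np

def sigmoid(z):
    """Logistic function: maps any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical model parameters for an example with two input features
w = np.array([0.8, -1.5])   # one weight per input feature
b = 0.3                     # intercept term
x = np.array([2.0, 1.0])    # a single observation

linear_combination = np.dot(w, x) + b       # the core of linear regression
probability = sigmoid(linear_combination)   # logistic regression's output
```

Here `linear_combination` is what linear regression would output directly, while `probability` is the class-membership probability that logistic regression predicts.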

Both methods share the disadvantage that they do not cope well with a large number of input features. In addition, logistic regression is typically limited to binary classification problems. In this chapter, we will introduce the concept of neural networks, a nonlinear approach to solving both regression and classification problems. They are significantly more robust in higher-dimensional input feature spaces, and for classification, they possess a natural way to handle more than two output classes.

Neural networks are a biologically inspired model whose origins date back to the 1940s. Interest in neural networks has fluctuated greatly over the years, as the first models proved to be quite limited compared to the expectations of the time. Additionally, training a large neural network requires substantial computational resources. Recently, interest in neural networks has surged again: distributed on-demand computing resources are now widespread, and an important area of machine learning known as deep learning is showing great promise. For this reason, it is a great time to be learning about this type of model.

The biological neuron

Neural network models draw their analogy from the organization of neurons in the human brain, and for this reason they are also often referred to as artificial neural networks (ANNs) to distinguish them from their biological counterparts. The key parallel is that a single biological neuron acts as a simple computational unit, but when a large number of these are combined together, the result is an extremely powerful and massively distributed processing machine capable of complex learning, known more commonly as the human brain. To get an idea of how neurons are connected in the brain, the following image shows a simplified picture of a human neural cell:

The biological neuron

In a nutshell, we can think of a human neuron as a computational unit that takes in a series of parallel inputs, delivered as synaptic neurotransmitters arriving at the dendrites. The dendrites transmit signal chemicals to the soma, or body of the neuron, in response to the received synaptic neurotransmitters. This conversion of an external input signal to a local signal can be thought of as a process in which the dendrites apply a weight (which can be negative or positive depending on whether the chemicals produced are inhibitors or activators, respectively) to their inputs.

The soma of the neuron, which houses the nucleus or central processor, mixes these input signals in a process that can be thought of as summing up all the signals. Consequently, the original dendrite inputs are basically transformed into a single linear weighted sum. This sum is sent to the axon of the neuron, which is the transmitter of the neuron. The weighted sum of electrical inputs creates an electric potential in the neuron, and this potential is processed in the axon by means of an activation function, which determines whether the neuron will fire.

Typically, the activation function is modeled as a switch that requires a minimum electrical potential, known as the bias, to be reached before it is turned on. Thus, the activation function essentially determines whether the neuron will output an electrical signal or not, and if so, the signal is transported through the axon and propagated to other neurons through the axon terminals. These, in turn, connect to the dendrites of neighboring neurons and the electrical signal output becomes an input to subsequent neural processing.
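The computational abstraction described above, weighted inputs summed in the soma, then an all-or-nothing activation governed by a bias threshold, can be sketched as a simple function. This is an illustrative model only; the weights, bias, and inputs below are hypothetical values, with one negative weight playing the role of an inhibitory connection.

```python
import numpy as np

def artificial_neuron(inputs, weights, bias):
    """Model of a single neuron: the soma sums the weighted dendrite
    inputs, and the axon fires only if the resulting electrical
    potential exceeds the bias threshold."""
    potential = np.dot(weights, inputs)   # linear weighted sum of inputs
    return 1 if potential > bias else 0   # threshold (step) activation

# Hypothetical three-input neuron; the negative weight acts as an inhibitor
weights = np.array([0.5, -0.7, 0.2])
bias = 0.1

fires = artificial_neuron(np.array([1.0, 0.0, 1.0]), weights, bias)
silent = artificial_neuron(np.array([0.0, 1.0, 1.0]), weights, bias)
```

In the first call the potential (0.7) exceeds the bias, so the neuron fires and outputs 1; in the second, the inhibitory input drives the potential below the threshold and the neuron stays silent, mirroring the switch-like behavior described above.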

This description is, of course, a simplification of what happens in our neurons, but the goal here is to explain what aspects of the biological process have been used to inspire the computational model of a neural network.
