The concept of artificial neural networks is rooted in biology: the idea was to mimic some of the brain's functions. Computer scientists believed such a concept could be applied to the broader problem of parallel processing [10:1]. The key question in the 1970s was: how can the computation of tasks be distributed across a network or cluster of machines without having to program each machine individually? One simple solution consists of training each machine to execute the given tasks. The popularity of neural networks surged in the 1990s.
At its core, a neural network is a nonlinear statistical model that leverages logistic regression to create a nonlinear distributed model.
In this chapter, you will move beyond the hype and learn the following:
The brain is a very powerful information processing engine that surpasses the reasoning ability of computers in domains such as learning, inductive reasoning, prediction, vision, and speech recognition. However, the simplest computing device has the capability to process very large datasets well beyond the ability of the human brain.
In biology, a neural network is composed of groups of neurons interconnected by synapses [10:2], as shown in the following image:
Neuroscientists have been especially interested in understanding how the billions of neurons in the brain can interact to provide human beings with parallel processing capabilities. The 1960s saw a new field of study emerging, known as connectionism. Connectionism marries cognitive psychology, artificial intelligence, and neuroscience. The goal was to create a model for mental phenomena. Although there are many forms of connectionism, the neural network models have become the most popular and the most taught of all connectionism models [10:3].
Biological neurons communicate through electrical charges known as stimuli. This network of neurons can be represented as a simple schematic, as follows:
This representation categorizes groups of neurons as layers. The terminology used to describe the natural neural networks has a corresponding nomenclature for the artificial neural network:
| The biological neural network | The artificial neural network |
|---|---|
| Axon | Connection |
| Dendrite | Connection |
| Synapse | Weight |
| Potential | Weighted sum |
| Threshold | Bias weight |
| Signal, Stimulus | Activation |
| Group of neurons | Layer of neurons |
In the biological world, stimuli do not propagate in any specific direction between neurons. An artificial neural network can have the same degree of freedom. However, the artificial neural networks most commonly used by data scientists have a predefined direction: from the input layer to the output layer. These neural networks are known as feed-forward neural networks (FFNN).
The previous chapter, Chapter 9, Regression and Regularization, describes the concept of the hyperplane, which segregates a set of labeled data points into distinct classes during training. The hyperplane is defined by the linear model (or margin) wᵀ·x + w₀ = 0. The linear regression can be visualized as a simple connectivity model using neurons and synapses, as follows:
The feature x0=+1 is known as the bias input (or bias element), which corresponds to the intercept in the classic linear regression.
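The role of the bias input can be sketched in a few lines of Python (an illustrative example with made-up weights, not code from this book): prepending x0 = +1 to each observation folds the intercept w0 into the weight vector, so the whole linear model reduces to a single dot product.

```python
import numpy as np

# Hypothetical weights: w[0] is the bias weight (the intercept w0)
w = np.array([0.5, -1.2, 0.8])   # [w0, w1, w2]
x = np.array([2.0, 3.0])         # raw features x1, x2

x_biased = np.concatenate(([1.0], x))  # prepend the bias input x0 = +1
y = np.dot(w, x_biased)                # w0*1 + w1*x1 + w2*x2

# Equivalent to the classic form w0 + w . x of linear regression
```

With these sample values, y evaluates to 0.5 + (-1.2)·2.0 + 0.8·3.0 = 0.5, the same result as computing the intercept separately.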
As with support vector machines, linear regression is appropriate for observations that are linearly separable. The real world, however, is usually driven by nonlinear phenomena. Therefore, logistic regression is naturally used to compute the output of the perceptron. For a set of input variables x = {xi}0,n, with the bias input x0 = +1, and the weights w = {wi}0,n, the output y is computed as (M1):

y = 1 / (1 + e^(-wᵀ·x))
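The perceptron output described above can be sketched as follows (a minimal illustration with hypothetical weights, assuming the bias input x0 = +1 is prepended to the feature vector):

```python
import numpy as np

def sigmoid(v):
    """Logistic function: maps any real value into the interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-v))

def perceptron_output(w, x):
    """Output of a single logistic unit.
    w includes the bias weight w0; x excludes the bias input x0 = +1."""
    x_biased = np.concatenate(([1.0], x))   # fold the bias into the dot product
    return sigmoid(np.dot(w, x_biased))     # y = 1 / (1 + exp(-w.x))

# Hypothetical weights and inputs for illustration only
y = perceptron_output(np.array([0.1, 0.7, -0.4]), np.array([1.5, 2.0]))
# y lies strictly between 0 and 1, as expected of a logistic unit
```

The sigmoid squashes the weighted sum into (0, 1), which is what makes the unit a nonlinear building block rather than a plain linear regressor.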
Such an approach can be modeled as an FFNN, known as the multilayer perceptron [10:4].
An FFNN can be regarded as a stack of layers of logistic regression with the output layer as a linear regression. The value of the variables in each hidden layer is computed as the sigmoid of the dot product of the connection weights and the output of the previous layer. Although it's interesting, the theory behind artificial neural networks is beyond the scope of this book [10:5].
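The stacking described above can be sketched as a forward pass (a simplified example assuming a single hidden layer and omitting bias terms for brevity; the weight shapes are illustrative assumptions, not an API from this book):

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def ffnn_forward(x, w_hidden, w_out):
    """One forward pass through a minimal FFNN.
    x: input vector of shape (n_in,)
    w_hidden: hidden-layer connection weights, shape (n_hidden, n_in)
    w_out: output-layer weights, shape (n_hidden,)
    """
    # Hidden layer: sigmoid of the dot product of connection weights
    # and the previous layer's output (here, the input itself)
    hidden = sigmoid(w_hidden @ x)
    # Output layer: a linear combination, as described in the text
    return w_out @ hidden

# Illustrative random weights
rng = np.random.default_rng(0)
x = np.array([0.5, -1.0])
w_hidden = rng.normal(size=(3, 2))
w_out = rng.normal(size=3)
y = ffnn_forward(x, w_hidden, w_out)
```

Each hidden layer is thus a bank of logistic-regression units, and deeper networks simply repeat the `sigmoid(weights @ previous_output)` step once per hidden layer.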