To keep discussion of artificial neurons as straightforward as possible, in this book we used a shorthand notation to identify them within a network. In this appendix, we lay out a more widely used formal notation, which may be of interest if you’d like to:

Possess a more precise manner for describing neurons

Follow closely the backpropagation technique covered in Appendix B

Taking a look back at Figure 7.1, the neural network has a total of four layers. The first is the input layer, which can be thought of as a collection of starting blocks for each data point to enter the network. In the case of the MNIST models, for example, there are 784 such starting blocks, representing each of the pixels in a 28×28–pixel handwritten MNIST digit. No computation happens within an input layer; it simply holds the input values so that the network knows how many values it must be ready to compute on in the next layer.^{1}
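To make the 784 "starting blocks" concrete, here is a minimal sketch (using NumPy, with random values standing in for real MNIST pixel intensities):

```python
import numpy as np

# A 28x28 grayscale digit; random values stand in for real MNIST pixels
digit = np.random.rand(28, 28)

# The input layer simply holds the 784 pixel values as a flat vector x;
# no computation happens here
x = digit.flatten()
assert x.shape == (784,)
```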

1. For this reason, we usually don’t need a means to address a particular input neuron; they have no weights or biases.

The next two layers in the network in Figure 7.1 are hidden layers, in which the bulk of the computation within a neural network occurs. As we’ll soon discuss, the input values *x* are mathematically transformed and combined by each neuron in the hidden layer, outputting some activation value *a*. We label a given activation $a_i^l$, where the superscript $l$ denotes the layer and the subscript $i$ denotes the neuron’s position within that layer; thus $a_1^1$ is the first neuron in the first hidden layer.

Because Figure 7.1 is a dense network, the neuron $a_1^1$ receives inputs from all of the neurons in the preceding layer, namely the network inputs $x_1$ and $x_2$. Each neuron has its own bias, *b*, and we’ll label that bias in exactly the same manner as the activation: the bias of neuron $a_1^1$ is $b_1^1$.

The green arrows in Figure 7.1 represent the mathematical transformation that takes place during forward propagation, and each green arrow has its own individual weight associated with it. In order to refer to these weights directly, we employ the following notation: $w_{1,2}^1$ is the weight in the *first* hidden layer (superscript) that connects neuron $a_1^1$ to its input $x_2$ in the input layer (subscript). This double-barreled subscript is necessary because the network is fully connected: Every neuron in a layer is connected to every neuron in the layer before it, and that connection carries its own weight. Let’s generalize this weight notation:

The superscript is the hidden-layer number of the input-receiving neuron.

The first subscript is the number of the neuron receiving the input within its hidden layer.

The second subscript is the number of the neuron providing input from the preceding layer.

As a further example, the weight for neuron $a_2^2$ will be denoted $w_{2,i}^2$, where *i* is a neuron in the preceding layer.
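The notation above can be tied together by computing a single neuron’s activation in code. The sketch below assumes a sigmoid activation and illustrative weight and input values (none of these numbers come from Figure 7.1):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Two network inputs, x_1 and x_2 (illustrative values)
x = np.array([0.5, -1.2])

# Weights into neuron a^1_1: w[j-1] corresponds to w^1_{1,j},
# the weight connecting a^1_1 to input x_j
w = np.array([0.8, -0.3])   # w^1_{1,1}, w^1_{1,2}
b = 0.1                     # bias b^1_1

# Forward propagation for this one neuron:
# a^1_1 = sigmoid(w^1_{1,1} * x_1 + w^1_{1,2} * x_2 + b^1_1)
a_1_1 = sigmoid(np.dot(w, x) + b)
```

The dot product collapses the per-connection products $w_{1,j}^1 x_j$ into one sum, which is why the double-barreled subscript maps naturally onto array indices.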

At the far right of the network, we finally have the output layer. As with the hidden layers, output-layer neurons have weights and a bias, and these are labeled in the same way.
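Because the output layer is labeled the same way, an entire layer’s weights can be collected into a matrix $W^l$ whose entry $(i, j)$ holds $w_{i,j}^l$, and forward propagation becomes one matrix–vector product per layer. A minimal sketch, assuming sigmoid activations and arbitrary layer sizes (2 inputs, 3 hidden neurons, 1 output neuron, with random parameters):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def layer_forward(W, b, a_prev):
    """Forward-propagate one dense layer.

    W[i-1, j-1] holds w^l_{i,j}: the weight connecting neuron i of this
    layer to neuron j of the preceding layer. b[i-1] holds bias b^l_i.
    """
    return sigmoid(W @ a_prev + b)

rng = np.random.default_rng(42)
x = np.array([0.5, -1.2])                              # network inputs
W1, b1 = rng.normal(size=(3, 2)), rng.normal(size=3)   # hidden layer
W2, b2 = rng.normal(size=(1, 3)), rng.normal(size=1)   # output layer

a1 = layer_forward(W1, b1, x)    # hidden activations a^1_1 .. a^1_3
y  = layer_forward(W2, b2, a1)   # output-layer activation
```

Stacking the per-neuron formulas into matrices this way is also what makes the backpropagation derivation in Appendix B compact.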
