Layers

Remember that we are working with tensors, so we need to relate this data back to those formats. A single image can be a 28 x 28 matrix, or it can be a vector of 784 values. Our labels are currently integers from 0 to 9. However, these are really categorical values, not points on a continuous numerical scale from 0 to 9, so it is best to turn them into vectors as well. Rather than asking our model to produce a single integer outright, we should think of the output as a vector of 10 values, with a 1 in the position corresponding to the digit the model thinks it is.
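To make the encoding concrete, here is a minimal sketch of turning an integer label into such a vector; the oneHot function name is illustrative and not part of the chapter's code:

// oneHot turns an integer label (0-9) into a 10-element vector with a 1.0
// in the position of the label and 0.0 everywhere else.
func oneHot(label int) []float64 {
    v := make([]float64, 10)
    v[label] = 1.0
    return v
}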

This gives us the parameters we are working with: our network must take in 784 values and produce 10 values as its output. For this example, we construct our layers as per the following diagram:

This structure would typically be described as a network with two hidden layers of 300 and 100 units, respectively. It can be implemented in Gorgonia with the following code:

type nn struct {
    g          *gorgonia.ExprGraph
    w0, w1, w2 *gorgonia.Node // weight matrices for each layer

    out     *gorgonia.Node  // final output node of the network
    predVal gorgonia.Value  // prediction read out after each run
}

func newNN(g *gorgonia.ExprGraph) *nn {
    // Create nodes for the weight matrices of each layer
    w0 := gorgonia.NewMatrix(g, dt, gorgonia.WithShape(784, 300), gorgonia.WithName("w0"), gorgonia.WithInit(gorgonia.GlorotN(1.0)))
    w1 := gorgonia.NewMatrix(g, dt, gorgonia.WithShape(300, 100), gorgonia.WithName("w1"), gorgonia.WithInit(gorgonia.GlorotN(1.0)))
    w2 := gorgonia.NewMatrix(g, dt, gorgonia.WithShape(100, 10), gorgonia.WithName("w2"), gorgonia.WithInit(gorgonia.GlorotN(1.0)))

    return &nn{
        g:  g,
        w0: w0,
        w1: w1,
        w2: w2,
    }
}

We are also using the ReLU activation function that you learned about in Chapter 2, What Is a Neural Network and How Do I Train One?; as it turns out, ReLU is well suited to this task. A forward pass of our network therefore looks like the following:

func (m *nn) fwd(x *gorgonia.Node) (err error) {
    var l0, l1, l2 *gorgonia.Node
    var l0dot, l1dot *gorgonia.Node

    // Set first layer to be a copy of the input
    l0 = x

    // Dot product of l0 and w0, use as input for ReLU
    if l0dot, err = gorgonia.Mul(l0, m.w0); err != nil {
        return errors.Wrap(err, "Unable to multiply l0 and w0")
    }

    // Build the first hidden layer out of the result
    l1 = gorgonia.Must(gorgonia.Rectify(l0dot))

    // Second hidden layer: dot product of l1 and w1, then ReLU again
    if l1dot, err = gorgonia.Mul(l1, m.w1); err != nil {
        return errors.Wrap(err, "Unable to multiply l1 and w1")
    }
    l2 = gorgonia.Must(gorgonia.Rectify(l1dot))

    // Output layer: dot product of l2 and w2
    var out *gorgonia.Node
    if out, err = gorgonia.Mul(l2, m.w2); err != nil {
        return errors.Wrap(err, "Unable to multiply l2 and w2")
    }

    // Squash the outputs with SoftMax and record the prediction
    if m.out, err = gorgonia.SoftMax(out); err != nil {
        return errors.Wrap(err, "Unable to apply SoftMax to the output")
    }
    gorgonia.Read(m.out, &m.predVal)
    return
}
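As a minimal sketch of how these pieces fit together (the batch size of 100, the use of tensor.Float64 for dt, and the log.Fatal error handling are illustrative assumptions rather than the chapter's training code), the model and its input node might be wired up like this:

g := gorgonia.NewGraph()
m := newNN(g)

// Input node: one row of 784 pixel values per image in the batch.
x := gorgonia.NewMatrix(g, dt, gorgonia.WithShape(100, 784), gorgonia.WithName("x"))

// Build the forward pass onto the graph.
if err := m.fwd(x); err != nil {
    log.Fatal(err)
}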

You can see that our network's final output is passed to Gorgonia's SoftMax function. This rescales the outputs so that each value lies between 0 and 1 and all of them sum to 1. This is useful because ReLU activations are unbounded above and can produce very large values, and we want an easy way to keep our outputs comparable to our labels, which look something like the following:

[ 0.1 0.1 0.1 1.0 0.1 0.1 0.1 0.1 0.1 0.1 ]

A trained model with a SoftMax output will produce values like the following:

[ 0 0 0 0.999681 0 0.000319 0 0 0 0 ]

By taking the index of the element with the maximum value (counting from zero, since the first position represents the digit 0), we can see that the predicted digit is 3.
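To make these two steps concrete, here is a small, plain-Go sketch (outside the Gorgonia graph, and with illustrative function names) of what SoftMax computes and how the predicted digit is read off the resulting vector:

// softmax rescales a slice of scores so that each value lies between 0 and 1
// and all the values sum to 1. It uses the standard library math package.
func softmax(scores []float64) []float64 {
    out := make([]float64, len(scores))
    var sum float64
    for i, s := range scores {
        out[i] = math.Exp(s)
        sum += out[i]
    }
    for i := range out {
        out[i] /= sum
    }
    return out
}

// argmax returns the index of the largest value, which is the predicted digit.
func argmax(probs []float64) int {
    best := 0
    for i, p := range probs {
        if p > probs[best] {
            best = i
        }
    }
    return best
}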
