Layers

Remember that we are working with tensors, so we need to relate this data back to those formats. A single image can be a 28 x 28 matrix, or it can be a vector of 784 values. Our labels are currently integers from 0 to 9. However, these are really categorical values, not points on a continuous numerical scale from 0 to 9, so it is best to turn them into vectors as well. Rather than asking our model to produce a single integer outright, we should think of the output as a vector of 10 values, with a 1 in the position corresponding to the digit the model thinks it is.
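To make the encoding concrete, here is a minimal sketch of turning an integer label into such a vector; the oneHot function name is illustrative and not part of the chapter's code:

// oneHot turns an integer label (0-9) into a 10-element vector with a 1.0
// in the position of the label and 0.0 everywhere else.
func oneHot(label int) []float64 {
    v := make([]float64, 10)
    v[label] = 1.0
    return v
}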

This gives us the parameters we are working with: our network must take in 784 values and produce 10 values as its output. For this example, we construct our layers as per the following diagram:

This structure would typically be described as a network with two hidden layers of 300 and 100 units, respectively. It can be implemented in Gorgonia with the following code:

type nn struct {
    g          *gorgonia.ExprGraph
    w0, w1, w2 *gorgonia.Node // weight matrices for each layer

    out     *gorgonia.Node  // final output node of the network
    predVal gorgonia.Value  // prediction read out after each run
}

func newNN(g *gorgonia.ExprGraph) *nn {
    // Create nodes for the weight matrices of each layer
    w0 := gorgonia.NewMatrix(g, dt, gorgonia.WithShape(784, 300), gorgonia.WithName("w0"), gorgonia.WithInit(gorgonia.GlorotN(1.0)))
    w1 := gorgonia.NewMatrix(g, dt, gorgonia.WithShape(300, 100), gorgonia.WithName("w1"), gorgonia.WithInit(gorgonia.GlorotN(1.0)))
    w2 := gorgonia.NewMatrix(g, dt, gorgonia.WithShape(100, 10), gorgonia.WithName("w2"), gorgonia.WithInit(gorgonia.GlorotN(1.0)))

    return &nn{
        g:  g,
        w0: w0,
        w1: w1,
        w2: w2,
    }
}

We are also using the ReLU activation function that you learned about in Chapter 2, What Is a Neural Network and How Do I Train One?; as it turns out, ReLU is well suited to this task. A forward pass of our network therefore looks like the following:

func (m *nn) fwd(x *gorgonia.Node) (err error) {
    var l0, l1, l2 *gorgonia.Node
    var l0dot, l1dot *gorgonia.Node

    // Set first layer to be a copy of the input
    l0 = x

    // Dot product of l0 and w0, use as input for ReLU
    if l0dot, err = gorgonia.Mul(l0, m.w0); err != nil {
        return errors.Wrap(err, "Unable to multiply l0 and w0")
    }

    // Build the first hidden layer out of the result
    l1 = gorgonia.Must(gorgonia.Rectify(l0dot))

    // Second hidden layer: dot product of l1 and w1, then ReLU again
    if l1dot, err = gorgonia.Mul(l1, m.w1); err != nil {
        return errors.Wrap(err, "Unable to multiply l1 and w1")
    }
    l2 = gorgonia.Must(gorgonia.Rectify(l1dot))

    // Output layer: dot product of l2 and w2
    var out *gorgonia.Node
    if out, err = gorgonia.Mul(l2, m.w2); err != nil {
        return errors.Wrap(err, "Unable to multiply l2 and w2")
    }

    // Squash the outputs with SoftMax and record the prediction
    if m.out, err = gorgonia.SoftMax(out); err != nil {
        return errors.Wrap(err, "Unable to apply SoftMax to the output")
    }
    gorgonia.Read(m.out, &m.predVal)
    return
}
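As a minimal sketch of how these pieces fit together (the batch size of 100, the use of tensor.Float64 for dt, and the log.Fatal error handling are illustrative assumptions rather than the chapter's training code), the model and its input node might be wired up like this:

g := gorgonia.NewGraph()
m := newNN(g)

// Input node: one row of 784 pixel values per image in the batch.
x := gorgonia.NewMatrix(g, dt, gorgonia.WithShape(100, 784), gorgonia.WithName("x"))

// Build the forward pass onto the graph.
if err := m.fwd(x); err != nil {
    log.Fatal(err)
}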

You can see that our network's final output is passed to Gorgonia's SoftMax function. This rescales the outputs so that each value lies between 0 and 1 and all of them sum to 1. This is useful because ReLU activations are unbounded above and can produce very large values, and we want an easy way to keep our outputs comparable to our labels, which look something like the following:

[ 0.1 0.1 0.1 1.0 0.1 0.1 0.1 0.1 0.1 0.1 ]

A trained model with a SoftMax output will produce values like the following:

[ 0 0 0 0.999681 0 0.000319 0 0 0 0 ]

By taking the index of the element with the maximum value (counting from zero, since the first position represents the digit 0), we can see that the predicted digit is 3.
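To make these two steps concrete, here is a small, plain-Go sketch (outside the Gorgonia graph, and with illustrative function names) of what SoftMax computes and how the predicted digit is read off the resulting vector:

// softmax rescales a slice of scores so that each value lies between 0 and 1
// and all the values sum to 1. It uses the standard library math package.
func softmax(scores []float64) []float64 {
    out := make([]float64, len(scores))
    var sum float64
    for i, s := range scores {
        out[i] = math.Exp(s)
        sum += out[i]
    }
    for i := range out {
        out[i] /= sum
    }
    return out
}

// argmax returns the index of the largest value, which is the predicted digit.
func argmax(probs []float64) int {
    best := 0
    for i, p := range probs {
        if p > probs[best] {
            best = i
        }
    }
    return best
}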
