Building a single-layer regression model

Let's start by building a single-layer regression model based on the softmax activation function. As we have a single layer, the input to the neural network will be all the image pixels, that is, 28 x 28 = 784 neurons. The number of output neurons is 10, one for each digit. The network layers are fully connected, as shown in the following diagram:
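With 784 inputs fully connected to 10 outputs, this layer holds 784 x 10 weights plus 10 biases. As a quick sanity check on those sizes (the variable names mirror the numRows, numColumns, and outputNum used in the configuration later in this section):

```java
public class LayerSizeCheck {
    public static void main(String[] args) {
        int numRows = 28, numColumns = 28; // MNIST image dimensions
        int outputNum = 10;                // one output neuron per digit
        int inputs = numRows * numColumns;
        int params = inputs * outputNum + outputNum; // weights + biases
        System.out.println(inputs);  // 784
        System.out.println(params);  // 7850
    }
}
```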

A neural network is defined through a NeuralNetConfiguration.Builder() object as follows:

MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder() 

We will define the parameters for gradient search in order to perform iterations with the conjugate gradient optimization algorithm. The momentum parameter determines how fast the optimization algorithm converges to a local optimum. The higher the momentum, the faster the training; but higher speed can lower the model's accuracy:

.seed(seed)
.gradientNormalization(GradientNormalization.ClipElementWiseAbsoluteValue)
.gradientNormalizationThreshold(1.0)
.iterations(iterations)
.momentum(0.5)
.momentumAfter(Collections.singletonMap(3, 0.9))
.optimizationAlgo(OptimizationAlgorithm.CONJUGATE_GRADIENT)
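The effect of the momentum parameter can be illustrated with the classic momentum update rule, where the velocity accumulates a fraction of past gradients. This is a simplified, self-contained sketch of the idea, not DL4J's internal implementation:

```java
public class MomentumDemo {
    public static void main(String[] args) {
        // Minimize f(w) = w^2 with gradient descent plus momentum.
        double w = 5.0, v = 0.0;
        double lr = 0.1, momentum = 0.5;
        for (int i = 0; i < 50; i++) {
            double grad = 2 * w;          // derivative of w^2
            v = momentum * v - lr * grad; // velocity carries past gradients forward
            w += v;
        }
        System.out.println(Math.abs(w) < 1e-3); // converged near the optimum at 0
    }
}
```

A larger momentum value lets the velocity build up faster, which speeds up convergence on smooth error surfaces but can overshoot narrow optima.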

Next, we will specify that the network has one layer and define the error function, NEGATIVELOGLIKELIHOOD; the internal perceptron activation function, softmax; and the number of input and output neurons, which correspond to the total number of
image pixels and the number of target variables, as shown in the following code block:

.list(1)
.layer(0, new OutputLayer.Builder(LossFunction.NEGATIVELOGLIKELIHOOD)
    .activation("softmax")
    .nIn(numRows * numColumns).nOut(outputNum).build())
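The softmax activation used in this output layer turns the 10 raw layer outputs into a probability distribution over the digits. A minimal standalone sketch of the function (not DL4J's implementation) looks like this:

```java
public class SoftmaxDemo {
    // Numerically stable softmax: subtract the max logit before exponentiating.
    static double[] softmax(double[] z) {
        double max = Double.NEGATIVE_INFINITY;
        for (double v : z) max = Math.max(max, v);
        double sum = 0.0;
        double[] out = new double[z.length];
        for (int i = 0; i < z.length; i++) {
            out[i] = Math.exp(z[i] - max);
            sum += out[i];
        }
        for (int i = 0; i < out.length; i++) out[i] /= sum;
        return out;
    }

    public static void main(String[] args) {
        double[] p = softmax(new double[]{2.0, 1.0, 0.1});
        double total = 0.0;
        for (double v : p) total += v;
        System.out.println(Math.abs(total - 1.0) < 1e-9); // probabilities sum to 1
        System.out.println(p[0] > p[1] && p[1] > p[2]);   // larger input, larger probability
    }
}
```

Because the outputs sum to 1, the predicted digit is simply the index of the largest probability.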

Finally, we will set the network to pretrain, disable backpropagation, and actually build the untrained network structure:

   .pretrain(true).backprop(false) 
   .build(); 

Once the network structure is defined, we can use it to initialize a new MultiLayerNetwork, as follows:

MultiLayerNetwork model = new MultiLayerNetwork(conf); 
model.init(); 

Next, we will attach a listener that reports the training score every listenerFreq iterations by calling the setListeners method, as follows:

model.setListeners(Collections.singletonList(
    (IterationListener) new ScoreIterationListener(listenerFreq)));

We will also call the fit(DataSetIterator) method to trigger end-to-end network training:

model.fit(iter);  

To evaluate the model, we will initialize a new Evaluation object that will store batch results:

Evaluation eval = new Evaluation(outputNum); 

We can then iterate over the test dataset in batches to keep memory consumption reasonable, and store the results in the eval object:

DataSetIterator testIter = new MnistDataSetIterator(100, 10000); 
while (testIter.hasNext()) { 
    DataSet testMnist = testIter.next(); 
    INDArray predict2 = model.output(testMnist.getFeatureMatrix()); 
    eval.eval(testMnist.getLabels(), predict2); 
} 

Finally, we can get the results by calling the stats() method:

log.info(eval.stats()); 

A basic one-layer model achieves the following accuracy:

    Accuracy:  0.8945 
    Precision: 0.8985
    Recall:    0.8922
    F1 Score:  0.8953
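As a sanity check, the reported F1 score is the harmonic mean of precision and recall; recomputing it from the figures above reproduces the reported value:

```java
public class F1Check {
    public static void main(String[] args) {
        double precision = 0.8985, recall = 0.8922; // values reported above
        double f1 = 2 * precision * recall / (precision + recall);
        // Round to four decimal places to match the report format.
        System.out.println(Math.round(f1 * 10000) / 10000.0); // 0.8953
    }
}
```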
  

Getting 89.45% accuracy, that is, a 10.55% error rate, on the MNIST dataset is quite bad. We'll improve this by going from a simple one-layer network to the moderately sophisticated deep belief network using Restricted Boltzmann machines and a multilayer convolutional network.
