Convolutional networks

Convolutional neural networks (CNNs) are feed-forward networks modeled after the visual cortex of animals. Neurons in the visual cortex respond to overlapping regions of the visual field, and the neurons in a CNN are likewise arranged so that each one responds to an overlapping section of the input, known as its receptive field. Because of this design, CNNs need little preprocessing or prior knowledge of the data, and this reduced need for human intervention makes them especially useful.

CNNs are used frequently in image and video recognition applications, where they handle classification, clustering, and object recognition. They can also support text analysis through Optical Character Recognition (OCR). Their wide applicability in practical situations is one reason CNNs have been a driving force in machine learning.

We are going to demonstrate a CNN using DL4J. The process closely mirrors the one we used in the Building an autoencoder in DL4J section. We will again use the MNIST dataset. This dataset contains image data, so it is well suited to a convolutional network.

Building the model

First, we need to create a new DataSetIterator to process the data. The parameters for the MnistDataSetIterator constructor are the batch size, 1000 in this case, and the total number of samples to process. We then get the next dataset, shuffle it to randomize the order, and split it into a training set and a test set. As we discussed earlier in the chapter, we typically use 65% of the data to train the model and the remaining 35% for testing:

DataSetIterator iter = new MnistDataSetIterator(1000,  
MnistDataFetcher.NUM_EXAMPLES); 
DataSet dataset = iter.next(); 
dataset.shuffle(); 
SplitTestAndTrain testAndTrain = dataset.splitTestAndTrain(0.65); 
DataSet trainingData = testAndTrain.getTrain(); 
DataSet testData = testAndTrain.getTest(); 
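The shuffle-and-split step can be illustrated without DL4J. The following is a minimal sketch in plain Java, not DL4J's implementation: the class name SplitSketch, the method shuffleAndSplit, and the fixed seed are hypothetical names chosen for illustration, and the rounding rule is an assumption.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical helper for illustration only; not part of DL4J.
final class SplitSketch {

    /** Returns two lists: the training part first, then the test part. */
    static <T> List<List<T>> shuffleAndSplit(List<T> examples,
                                             double trainFraction,
                                             long seed) {
        // Shuffle a copy so the caller's list is left untouched
        List<T> shuffled = new ArrayList<>(examples);
        Collections.shuffle(shuffled, new Random(seed));

        // Assumption: round to the nearest whole example
        int trainCount = (int) Math.round(shuffled.size() * trainFraction);

        List<List<T>> parts = new ArrayList<>();
        parts.add(new ArrayList<>(shuffled.subList(0, trainCount)));
        parts.add(new ArrayList<>(shuffled.subList(trainCount, shuffled.size())));
        return parts;
    }
}
```

With a batch of 1000 examples and a fraction of 0.65, this yields 650 training examples and 350 test examples.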

We then normalize both sets of data:

DataNormalization normalizer = new NormalizerStandardize(); 
normalizer.fit(trainingData); 
normalizer.transform(trainingData); 
normalizer.transform(testData); 
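The key point in the normalization code is that the statistics are fitted on the training data only and then applied to both sets. The following plain-Java sketch shows that idea for a single feature. It is a simplification under stated assumptions: StandardizeSketch is a hypothetical class, it handles one feature rather than one statistic per column, and it uses the population standard deviation, which may differ from what the library computes.

```java
// Hypothetical single-feature standardizer for illustration; not DL4J code.
final class StandardizeSketch {
    private double mean;
    private double std = 1.0;

    /** Learns the mean and standard deviation from the training values. */
    void fit(double[] train) {
        double sum = 0.0;
        for (double v : train) {
            sum += v;
        }
        mean = sum / train.length;

        double squared = 0.0;
        for (double v : train) {
            squared += (v - mean) * (v - mean);
        }
        double computed = Math.sqrt(squared / train.length);
        // Guard against a constant feature, which would divide by zero
        std = computed == 0.0 ? 1.0 : computed;
    }

    /** Rescales the values in place using the statistics learned by fit. */
    void transform(double[] values) {
        for (int i = 0; i < values.length; i++) {
            values[i] = (values[i] - mean) / std;
        }
    }
}
```

Calling fit once on the training values and transform on both arrays mirrors the fit, transform, transform sequence shown above.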

Next, we can build our network. As shown earlier, we will again use a MultiLayerConfiguration instance with a series of NeuralNetConfiguration.Builder methods. We will discuss the individual methods after the following code sequence. Notice that the last layer again uses the softmax activation function, this time to classify each image as one of ten digits:

MultiLayerConfiguration.Builder builder = new    
          NeuralNetConfiguration.Builder() 
     .seed(123) 
     .iterations(1) 
     .regularization(true).l2(0.0005) 
     .weightInit(WeightInit.XAVIER)  
     .optimizationAlgo(OptimizationAlgorithm 
           .STOCHASTIC_GRADIENT_DESCENT) 
     .updater(Updater.NESTEROVS).momentum(0.9) 
     .list() 
     .layer(0, new ConvolutionLayer.Builder(5, 5) 
           .nIn(1) // MNIST images are grayscale: one input channel
           .stride(1, 1) 
           .nOut(20) 
           .activation("identity") 
           .build()) 
     .layer(1, new SubsamplingLayer.Builder(
                SubsamplingLayer.PoolingType.MAX) 
           .kernelSize(2, 2) 
           .stride(2, 2) 
           .build()) 
     .layer(2, new ConvolutionLayer.Builder(5, 5) 
           .stride(1, 1) 
           .nOut(50) 
           .activation("identity") 
           .build()) 
     .layer(3, new SubsamplingLayer.Builder(
                SubsamplingLayer.PoolingType.MAX) 
           .kernelSize(2, 2) 
           .stride(2, 2) 
           .build()) 
     .layer(4, new DenseLayer.Builder().activation("relu") 
           .nOut(500).build()) 
     .layer(5, new OutputLayer.Builder(
                LossFunctions.LossFunction.NEGATIVELOGLIKELIHOOD) 
           .nOut(10) 
           .activation("softmax") 
           .build()) 
     .backprop(true).pretrain(false); 

The first layer, layer 0, which is repeated next for your convenience, uses the ConvolutionLayer.Builder class with a 5 x 5 kernel. The input to a convolution layer has three dimensions: the image height, the image width, and the number of channels. A standard RGB image has three channels, whereas the grayscale MNIST images have one. The nIn method takes the number of channels. The nOut method specifies the number of outputs, in this case 20 feature maps:

.layer(0, new ConvolutionLayer.Builder(5, 5) 
        .nIn(1) 
        .stride(1, 1) 
        .nOut(20) 
        .activation("identity") 
        .build()) 
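What a convolution layer computes for one channel and one filter can be sketched in plain Java. This is an illustration under stated assumptions, not DL4J's implementation: ConvolutionSketch and its methods are hypothetical names, no padding is applied, and the bias term and activation are omitted.

```java
// Hypothetical single-channel, single-filter convolution; not DL4J code.
final class ConvolutionSketch {

    /** Output length along one dimension, assuming no padding. */
    static int outputSize(int inputSize, int kernelSize, int stride) {
        return (inputSize - kernelSize) / stride + 1;
    }

    /** Slides the kernel over the image and sums the element-wise products. */
    static double[][] convolve(double[][] image, double[][] kernel, int stride) {
        int outRows = outputSize(image.length, kernel.length, stride);
        int outCols = outputSize(image[0].length, kernel[0].length, stride);
        double[][] out = new double[outRows][outCols];

        for (int r = 0; r < outRows; r++) {
            for (int c = 0; c < outCols; c++) {
                double sum = 0.0;
                // Each output value sees one receptive field of the input
                for (int kr = 0; kr < kernel.length; kr++) {
                    for (int kc = 0; kc < kernel[0].length; kc++) {
                        sum += image[r * stride + kr][c * stride + kc]
                                * kernel[kr][kc];
                    }
                }
                out[r][c] = sum;
            }
        }
        return out;
    }
}
```

With a stride of 1, neighboring receptive fields overlap, and a 5 x 5 kernel reduces a 28 x 28 image to 24 x 24 values per filter.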

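Layers 1 and 3 are both subsampling layers. These layers follow convolution layers and do no real convolution themselves. For each 2 x 2 input region they return a single value, the maximum value in that region. Because the stride is also 2, the regions do not overlap, so the height and width of each feature map are halved: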

.layer(1, new SubsamplingLayer.Builder( 
            SubsamplingLayer.PoolingType.MAX) 
        .kernelSize(2, 2) 
        .stride(2, 2) 
        .build()) 
                        ... 
.layer(3, new SubsamplingLayer.Builder( 
            SubsamplingLayer.PoolingType.MAX) 
        .kernelSize(2, 2) 
        .stride(2, 2) 
        .build()) 
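Max pooling is simple enough to sketch directly. The following plain-Java version is only an illustration: MaxPoolSketch is a hypothetical class that handles a single feature map with a square kernel and no padding.

```java
// Hypothetical max-pooling helper for one feature map; not DL4J code.
final class MaxPoolSketch {

    /** Keeps the largest value from each kernel-sized region of the input. */
    static double[][] maxPool(double[][] input, int kernel, int stride) {
        int outRows = (input.length - kernel) / stride + 1;
        int outCols = (input[0].length - kernel) / stride + 1;
        double[][] out = new double[outRows][outCols];

        for (int r = 0; r < outRows; r++) {
            for (int c = 0; c < outCols; c++) {
                double max = Double.NEGATIVE_INFINITY;
                for (int kr = 0; kr < kernel; kr++) {
                    for (int kc = 0; kc < kernel; kc++) {
                        max = Math.max(max, input[r * stride + kr][c * stride + kc]);
                    }
                }
                out[r][c] = max;
            }
        }
        return out;
    }
}
```

Calling maxPool with a kernel of 2 and a stride of 2 corresponds to the kernelSize(2, 2) and stride(2, 2) settings used in layers 1 and 3.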

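Layer 2 is also a convolution layer, like layer 0. Notice that we do not specify the number of channels in this layer. Its input is the output of the preceding layers, and the ConvolutionLayerSetup instance that we create shortly is intended to work out these intermediate sizes for us:

.layer(2, new ConvolutionLayer.Builder(5, 5) 
        .stride(1, 1) 
        .nOut(50) 
        .activation("identity") 
        .build()) 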

Layer 4 uses the DenseLayer.Builder class, as in our earlier example. As mentioned previously, the DenseLayer class is a feed-forward and fully connected layer:

.layer(4, new DenseLayer.Builder().activation("relu") 
        .nOut(500).build()) 

Layer 5 is an OutputLayer instance and uses the softmax activation function. Its 10 outputs correspond to the 10 digit classes:

.layer(5, new OutputLayer.Builder( 
            LossFunctions.LossFunction.NEGATIVELOGLIKELIHOOD) 
        .nOut(10) 
        .activation("softmax") 
        .build()) 
        .backprop(true).pretrain(false); 
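The softmax activation turns the 10 raw outputs of the final layer into probabilities that sum to 1, and the predicted digit is the index with the highest probability. A minimal plain-Java sketch follows. SoftmaxSketch and its methods are hypothetical names, and subtracting the maximum score is a common numerical-stability technique assumed here rather than taken from the original text.

```java
// Hypothetical softmax helper for illustration; not DL4J code.
final class SoftmaxSketch {

    /** Converts raw scores into probabilities that sum to 1. */
    static double[] softmax(double[] scores) {
        // Subtracting the largest score avoids overflow in Math.exp
        double max = Double.NEGATIVE_INFINITY;
        for (double s : scores) {
            max = Math.max(max, s);
        }
        double[] out = new double[scores.length];
        double sum = 0.0;
        for (int i = 0; i < scores.length; i++) {
            out[i] = Math.exp(scores[i] - max);
            sum += out[i];
        }
        for (int i = 0; i < out.length; i++) {
            out[i] /= sum;
        }
        return out;
    }

    /** Index of the largest value, that is, the predicted class. */
    static int argMax(double[] values) {
        int best = 0;
        for (int i = 1; i < values.length; i++) {
            if (values[i] > values[best]) {
                best = i;
            }
        }
        return best;
    }
}
```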

Finally, we create a new instance of the ConvolutionLayerSetup class. We pass the builder object and the dimensions of our image (28 x 28). We also pass the number of channels, in this case, 1:

new ConvolutionLayerSetup(builder, 28, 28, 1); 
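To see why the image dimensions matter, it helps to trace how the feature-map size changes from layer to layer. The following plain-Java sketch does that arithmetic for the layers configured above. It assumes no padding, and ShapeTraceSketch is a hypothetical class written for illustration; it is not a description of how ConvolutionLayerSetup works internally.

```java
// Hypothetical shape calculator for the network above; not DL4J code.
final class ShapeTraceSketch {

    /** Output length along one dimension, assuming no padding. */
    static int outputSize(int inputSize, int kernelSize, int stride) {
        return (inputSize - kernelSize) / stride + 1;
    }

    /**
     * Returns the feature-map side length after layers 0 to 3, followed by
     * the number of values flattened into the dense layer.
     */
    static int[] trace(int imageSize) {
        int afterConv0 = outputSize(imageSize, 5, 1);  // layer 0: 5 x 5, stride 1
        int afterPool1 = outputSize(afterConv0, 2, 2); // layer 1: 2 x 2, stride 2
        int afterConv2 = outputSize(afterPool1, 5, 1); // layer 2: 5 x 5, stride 1
        int afterPool3 = outputSize(afterConv2, 2, 2); // layer 3: 2 x 2, stride 2
        int flattened = afterPool3 * afterPool3 * 50;  // layer 2 has nOut(50)
        return new int[] {afterConv0, afterPool1, afterConv2, afterPool3, flattened};
    }
}
```

For a 28 x 28 image the side length goes 24, 12, 8, 4, so under these assumptions the dense layer receives 4 x 4 x 50 = 800 values.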

We can now configure and fit our model. We once again use the MultiLayerConfiguration and MultiLayerNetwork classes to build our network. We set up listeners and then iterate through our data. For each DataSet, we execute the fit method:

MultiLayerConfiguration conf = builder.build(); 
MultiLayerNetwork model = new MultiLayerNetwork(conf); 
model.init(); 
model.setListeners(Collections.singletonList((IterationListener) 
  new ScoreIterationListener(1))); // report the score after every iteration
 
while (iter.hasNext()) { 
    DataSet next = iter.next(); 
    // Scale each batch with the statistics fitted on the training data 
    normalizer.transform(next); 
    model.fit(next); 
} 

We are now ready to evaluate our model.

Evaluating the model

To evaluate our model, we use the Evaluation class. We get the output from our model and send it, along with the labels for our dataset, to the eval method. We then execute the stats method to get the statistical information on our network:

Evaluation evaluation = new Evaluation(10); // one class per digit, matching nOut(10)
INDArray output = model.output(testData.getFeatureMatrix()); 
evaluation.eval(testData.getLabels(), output); 
out.println(evaluation.stats()); 

The following is sample output from the execution of this code. Only the results of the stats method are shown. The first part reports how the examples were classified, and the second part displays various statistics:

Examples labeled as 0 classified by model as 0: 19 times
Examples labeled as 1 classified by model as 1: 41 times
Examples labeled as 2 classified by model as 1: 4 times
Examples labeled as 2 classified by model as 2: 30 times
Examples labeled as 2 classified by model as 3: 1 times
Examples labeled as 3 classified by model as 2: 1 times
Examples labeled as 3 classified by model as 3: 28 times
==========================Scores===================================
Accuracy: 0.3371
Precision: 0.8481
Recall: 0.8475
F1 Score: 0.8478
===================================================================

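As in our previous model, the evaluation reports precision, recall, and F1 scores of roughly 0.85. The reported accuracy of 0.3371 is considerably lower, so these results should be interpreted with some caution.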
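The statistics in the second part of the output can all be derived from a confusion matrix, which counts how often each actual label was assigned each predicted label. The following plain-Java sketch shows the standard definitions. ClassificationMetricsSketch is a hypothetical class, the matrix layout (rows are actual labels, columns are predicted labels) is an assumption, and the way the library averages per-class values into a single figure is not reproduced here.

```java
// Hypothetical metrics helper for illustration; not the DL4J Evaluation class.
final class ClassificationMetricsSketch {

    /** Fraction of all examples whose predicted label equals the actual label. */
    static double accuracy(int[][] confusion) {
        int correct = 0;
        int total = 0;
        for (int actual = 0; actual < confusion.length; actual++) {
            for (int predicted = 0; predicted < confusion[actual].length; predicted++) {
                total += confusion[actual][predicted];
                if (actual == predicted) {
                    correct += confusion[actual][predicted];
                }
            }
        }
        return (double) correct / total;
    }

    /** Of the examples predicted as cls, the fraction that really are cls. */
    static double precision(int[][] confusion, int cls) {
        int predictedAsCls = 0;
        for (int actual = 0; actual < confusion.length; actual++) {
            predictedAsCls += confusion[actual][cls];
        }
        return predictedAsCls == 0 ? 0.0 : (double) confusion[cls][cls] / predictedAsCls;
    }

    /** Of the examples that really are cls, the fraction predicted as cls. */
    static double recall(int[][] confusion, int cls) {
        int actuallyCls = 0;
        for (int predicted = 0; predicted < confusion[cls].length; predicted++) {
            actuallyCls += confusion[cls][predicted];
        }
        return actuallyCls == 0 ? 0.0 : (double) confusion[cls][cls] / actuallyCls;
    }

    /** Harmonic mean of precision and recall. */
    static double f1(double precision, double recall) {
        double sum = precision + recall;
        return sum == 0.0 ? 0.0 : 2.0 * precision * recall / sum;
    }
}
```

Accuracy is computed over all examples, while precision and recall are computed per class, which is one reason the figures can differ from one another.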
