Building a multilayer convolutional network

In this final example, we'll discuss how to build a convolutional network, as shown in the following diagram. The network consists of seven layers. First, we'll repeat two pairs of convolutional and max-pooling (subsampling) layers. The last subsampling layer is then connected to a densely connected feedforward neural network, consisting of 120, 84, and 10 neurons in the last three layers, respectively. Such a network effectively forms the complete image recognition pipeline, where the first four layers perform feature extraction and the last three layers form the learning model:

Network configuration is initialized as we did earlier:

MultiLayerConfiguration.Builder conf = new NeuralNetConfiguration.Builder()

We will specify the gradient descent algorithm and its parameters, as follows:

.seed(seed)
.iterations(iterations)
.activation("sigmoid")
.weightInit(WeightInit.DISTRIBUTION)
.dist(new NormalDistribution(0.0, 0.01))
.learningRate(1e-3)
.learningRateScoreBasedDecayRate(1e-1)
.optimizationAlgo(OptimizationAlgorithm.STOCHASTIC_GRADIENT_DESCENT)

We will also specify the seven network layers, as follows:

.list(7) 

The input to the first convolutional layer is the complete image, while the output is six feature maps. The convolutional layer will apply a 5 x 5 filter, sliding it across the image one pixel at a time (that is, with a 1 x 1 stride):

.layer(0, new ConvolutionLayer.Builder(
    new int[]{5, 5}, new int[]{1, 1})
    .name("cnn1")
    .nIn(1) // one input channel (grayscale); cnnInputSize below sets the input shape
    .nOut(6)
    .build())
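To see what this produces, assume the 28 x 28 MNIST images used in the previous examples: a 5 x 5 filter moved with a 1 x 1 stride yields feature maps of (28 - 5)/1 + 1 = 24 pixels per side, so this layer outputs six 24 x 24 feature maps.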

The second layer is a subsampling layer that takes the maximum over each 2 x 2 region, moving in 2 x 2 steps so that the regions do not overlap:

.layer(1, new SubsamplingLayer.Builder(
    SubsamplingLayer.PoolingType.MAX,
    new int[]{2, 2}, new int[]{2, 2})
    .name("maxpool1")
    .build())
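Max pooling over non-overlapping 2 x 2 regions halves each dimension, so the six 24 x 24 feature maps from the previous layer are reduced to six 12 x 12 maps.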

The next two layers repeat the same convolution-subsampling pattern, this time producing 16 feature maps:

.layer(2, new ConvolutionLayer.Builder(
    new int[]{5, 5}, new int[]{1, 1})
    .name("cnn2")
    .nOut(16)
    .biasInit(1)
    .build())
.layer(3, new SubsamplingLayer.Builder(
    SubsamplingLayer.PoolingType.MAX,
    new int[]{2, 2}, new int[]{2, 2})
    .name("maxpool2")
    .build())
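Following the same arithmetic as before (still assuming 28 x 28 input images), the second convolution shrinks the 12 x 12 maps to (12 - 5)/1 + 1 = 8 pixels per side, and the second max-pooling layer halves this to 4 x 4. The feature extraction stage therefore ends with 16 x 4 x 4 = 256 values, which are flattened and fed into the dense layers that follow.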

Now, we will wire the output of the last subsampling layer into a densely connected feedforward network, first through a layer of 120 neurons and then through another layer of 84 neurons, as follows:

.layer(4, new DenseLayer.Builder() 
    .name("ffn1") 
    .nOut(120) 
    .build()) 
.layer(5, new DenseLayer.Builder() 
    .name("ffn2") 
    .nOut(84) 
    .build()) 

The final layer connects 84 neurons with 10 output neurons:

.layer(6, new OutputLayer.Builder(
    LossFunctions.LossFunction.NEGATIVELOGLIKELIHOOD)
    .name("output")
    .nOut(outputNum)
    // softmax output; the original LeNet used radial basis function units here
    .activation("softmax")
    .build())
.backprop(true)
.pretrain(false)
.cnnInputSize(numRows, numColumns, 1);

To train this structure, we can reuse the code that we developed in the previous two examples. Again, the training might take some time. The network accuracy should be around 98%.
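As a minimal sketch of that reuse, assuming the mnistTrain and mnistTest iterators and the outputNum variable defined in the previous examples, and the DL4J version used in this chapter, training and evaluation might look roughly as follows:

// a sketch; variable names assume the earlier examples
MultiLayerNetwork model = new MultiLayerNetwork(conf.build());
model.init();
// print the score after each iteration
model.setListeners(new ScoreIterationListener(1));
// train on the MNIST training set
model.fit(mnistTrain);

// evaluate the trained model on the test set
Evaluation eval = new Evaluation(outputNum);
while (mnistTest.hasNext()) {
    DataSet ds = mnistTest.next();
    INDArray output = model.output(ds.getFeatureMatrix());
    eval.eval(ds.getLabels(), output);
}
System.out.println(eval.stats());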

Since model training relies heavily on linear algebra, it can be sped up by an order of magnitude by using a graphics processing unit (GPU). As the GPU backend was undergoing a rewrite at the time of writing, please check the latest documentation at http://deeplearning4j.org/documentation.

As we saw in these examples, increasingly complex neural networks allow us to extract relevant features automatically, completely avoiding traditional image processing. The price we pay is increased processing time and the large number of training examples required to make this approach effective.
