How it works...

CapsNets are very different from state-of-the-art deep learning networks. Instead of adding more layers and making the network deeper, CapsNets use a shallow network in which capsule layers are nested inside other layers. Each capsule specializes in detecting a specific entity in an image, and a dynamic routing mechanism sends the detected entity to parent layers. With CNNs, you have to consider thousands of images from many different perspectives in order to recognize an object from different angles. Hinton believes the redundancies in the layers will allow capsule networks to identify objects from multiple angles and in different scenarios with less data than is typically used by CNNs. Let's examine the network as shown by TensorBoard:

An example of CapsNet as defined in the code and shown by TensorBoard
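The dynamic routing mechanism mentioned above can be sketched in a few lines of NumPy. This is a simplified illustration, not the book's implementation: routing logits are softmaxed into coupling coefficients, prediction vectors are combined into parent capsules, squashed, and the logits are updated by agreement. The shapes and the toy dimensions (6 child capsules, 10 parents of dimension 16) are assumptions for the example:

```python
import numpy as np

def squash(s, eps=1e-8):
    # Squashing non-linearity: short vectors shrink toward zero,
    # long vectors approach (but never reach) unit length.
    sq_norm = np.sum(s ** 2, axis=-1, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)

def dynamic_routing(u_hat, iterations=3):
    # u_hat: prediction vectors of shape
    # (num_input_capsules, num_output_capsules, out_dim)
    n_in, n_out, _ = u_hat.shape
    b = np.zeros((n_in, n_out))                 # routing logits
    for _ in range(iterations):
        # coupling coefficients: softmax of b over the output capsules
        c = np.exp(b) / np.exp(b).sum(axis=1, keepdims=True)
        s = (c[..., None] * u_hat).sum(axis=0)  # weighted sum per parent
        v = squash(s)                           # parent capsule outputs
        b = b + (u_hat * v[None]).sum(axis=-1)  # agreement updates logits
    return v

# Toy example: 6 child capsules routing to 10 parent capsules
rng = np.random.default_rng(0)
v = dynamic_routing(rng.normal(size=(6, 10, 16)))
print(v.shape)  # (10, 16); each parent vector has length < 1 after squashing
```

Note that, unlike max pooling, no information is discarded: every child capsule contributes to every parent, weighted by how much its prediction agrees with the parent's output.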

The results are impressive, as shown in the following diagram taken from the seminal paper. CapsNet achieves a low test error (0.25 percent) with a three-layer network, a result previously only achieved by deeper networks. The baseline is a standard CNN with three convolutional layers of 256, 256, and 128 channels, each with 5 x 5 kernels and a stride of 1. The last convolutional layer is followed by two fully connected layers of size 328 and 192. The last fully connected layer is connected with dropout to a 10-class softmax layer with cross-entropy loss:
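As a sanity check on the baseline's size, here is a quick parameter count. This is a sketch under the assumption of SAME padding on 28 x 28 MNIST inputs, which reproduces the roughly 35.4 million parameters the paper reports for this baseline:

```python
def conv_params(in_ch, out_ch, k=5):
    # k*k*in_ch weights per filter, plus one bias per filter
    return out_ch * (k * k * in_ch + 1)

def dense_params(in_units, out_units):
    # fully connected layer: weights plus biases
    return out_units * (in_units + 1)

# Three 5x5 convolutions with stride 1 (SAME padding assumed)
total = conv_params(1, 256) + conv_params(256, 256) + conv_params(256, 128)

flat = 28 * 28 * 128                # SAME padding keeps the 28x28 grid
total += dense_params(flat, 328)    # first fully connected layer
total += dense_params(328, 192)     # second fully connected layer
total += dense_params(192, 10)      # 10-class softmax layer

print(total)  # 35445522, i.e. about 35.4M parameters
```

Almost all of the parameters sit in the first fully connected layer, which is exactly the kind of redundancy CapsNet avoids: the paper's CapsNet reaches its lower error with far fewer parameters.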

Let's examine the reduction of margin loss, reconstruction loss, and total loss:
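The losses being tracked can be sketched as follows. This is a minimal NumPy illustration of the margin loss from the paper (with m+ = 0.9, m- = 0.1, and lambda = 0.5) and of how the total loss combines it with a down-weighted reconstruction term; the 0.0005 scaling factor follows the paper:

```python
import numpy as np

def margin_loss(v_norm, targets, m_plus=0.9, m_minus=0.1, lam=0.5):
    # v_norm: (batch, 10) lengths of the digit-capsule output vectors
    # targets: (batch, 10) one-hot labels
    present = targets * np.maximum(0.0, m_plus - v_norm) ** 2
    absent = lam * (1.0 - targets) * np.maximum(0.0, v_norm - m_minus) ** 2
    return (present + absent).sum(axis=1).mean()

def total_loss(margin, reconstruction, recon_weight=0.0005):
    # The reconstruction (regularization) term is scaled down so it
    # does not dominate the margin loss during training.
    return margin + recon_weight * reconstruction

# A perfect prediction incurs zero margin loss
targets = np.eye(10)[[3]]            # one-hot label for digit 3
v_norm = targets.copy()              # correct capsule length 1, others 0
print(margin_loss(v_norm, targets))  # 0.0
```

Because the loss penalizes the correct capsule only when its length falls below 0.9, and wrong capsules only when they exceed 0.1, all three curves should decrease together as the capsule lengths separate.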

Let's also examine the increase in accuracy: it reaches 92 percent after 500 iterations and 98.46 percent after 3,500 iterations:


Iteration    Accuracy
500          0.922776442308
1000         0.959735576923
1500         0.971955128205
2000         0.978365384615
2500         0.981770833333
3000         0.983473557692
3500         0.984675480769

Example of the increase in accuracy for CapsNet