Capsule Networks (CapsNets) are a relatively new neural network architecture for image recognition and computer vision applications. But before understanding CapsNets, an understanding of Convolutional Neural Networks (CNNs) is essential.
CNNs became popular in 2012 when Alex Krizhevsky won the ImageNet competition for image classification. Since then, CNNs have gone on to achieve human-level performance on many computer vision tasks. They do this by learning features at different levels of abstraction: lower layers learn simple features such as edges and textures, while higher layers learn more complex, class-specific shapes and patterns. But this requires a large amount of training data. Furthermore, a slight change in the orientation of the image can lead to complete misclassification. CapsNet is an architecture aimed at solving these shortcomings of CNNs.
Capsules were first introduced in 2011, but in November 2017 Sara Sabour, Nicholas Frosst, and Geoffrey Hinton published a paper based on the CapsNet architecture in which they reached state-of-the-art accuracy on MNIST (a dataset of handwritten digit images) and achieved considerably better results than CNNs.
The main component of a CapsNet is called a “capsule”: a small group of neurons whose output is a vector rather than a single scalar. These capsules are particularly good at handling visual stimuli and encoding properties like pose (position, size, orientation), deformation, velocity, albedo, hue, texture, etc. The heart of the CapsNet architecture is the “dynamic routing algorithm”: a lower-layer capsule sends its output to those higher-layer capsules whose current output agrees with its prediction. This routing-by-agreement helps CapsNets deal with changes in the input’s viewpoint; in other words, they are equivariant. Because of this equivariant behaviour, they need far less data for training.
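To make the idea concrete, here is a minimal numpy sketch of routing-by-agreement as described in the 2017 paper. The capsule counts and dimensions (8 lower capsules, 3 higher capsules of dimension 4) are made up for illustration; a real implementation would compute the prediction vectors `u_hat` from learned transformation matrices rather than random data.

```python
import numpy as np

def squash(s, axis=-1, eps=1e-8):
    """Shrink a vector so its length lies in [0, 1) while keeping its direction."""
    sq_norm = np.sum(s ** 2, axis=axis, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)

def dynamic_routing(u_hat, num_iterations=3):
    """Routing-by-agreement between two capsule layers.

    u_hat: prediction vectors from each lower capsule for each higher capsule,
           shape (num_lower, num_higher, dim_higher).
    Returns the higher-layer capsule outputs, shape (num_higher, dim_higher).
    """
    num_lower, num_higher, _ = u_hat.shape
    b = np.zeros((num_lower, num_higher))  # routing logits, start uniform
    for _ in range(num_iterations):
        # softmax over higher capsules: each lower capsule splits its output
        c = np.exp(b) / np.exp(b).sum(axis=1, keepdims=True)
        # weighted sum of predictions for each higher capsule
        s = np.einsum('ij,ijk->jk', c, u_hat)
        v = squash(s)  # higher-capsule output vectors
        # increase the logit where a prediction agrees with the output
        b += np.einsum('ijk,jk->ij', u_hat, v)
    return v

# Toy example with hypothetical sizes: 8 lower capsules, 3 higher capsules of dim 4.
rng = np.random.default_rng(0)
u_hat = rng.standard_normal((8, 3, 4))
v = dynamic_routing(u_hat)
print(v.shape)  # (3, 4)
```

The squash nonlinearity is what lets a capsule’s vector length act as the probability that the entity it represents is present, while the vector’s direction encodes the entity’s properties.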
Instead of merely detecting the co-existence of features, capsule networks try to model the relationships between sub-parts along a hierarchy. The Capsule Network is an interesting and already working model that will likely be developed further over time and contribute to expanding the application domains of deep learning.