Understanding Hinton's Capsule Networks. Part IV: CapsNet Architecture

The encoder part of the network takes a 28 by 28 MNIST digit image as input and learns to encode it into a 16-dimensional vector of instantiation parameters (as explained in the previous posts of this series); this is where the capsules do their job. The output of the network during prediction is a 10-dimensional vector of the lengths of the DigitCaps outputs. The encoder has 3 layers: two of them are convolutional and the last one is fully connected. The first convolutional layer's job is to detect basic features in the 2D image. In CapsNet, this layer has 256 kernels of size 9x9x1 with stride 1, followed by a ReLU activation. If you don't know what this means, here are some awesome resources that will allow you to quickly pick up key ideas behind convolutions.
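
To make the shapes concrete, here is a minimal pure-Python sketch of the two ideas above: a 9x9, stride-1 convolution followed by ReLU on a 28 by 28 image, and the 10-dimensional prediction vector built from the lengths of ten 16-dimensional capsule outputs. The function names (`conv_output_size`, `conv2d_relu`, `capsule_lengths`) are made up for illustration, the weights and capsule vectors are random stand-ins rather than trained values, and only 4 of the 256 kernels are created to keep the naive loops fast. Treat it as a sketch under those assumptions, not as the CapsNet implementation.

```python
import math
import random


def conv_output_size(input_size, kernel_size, stride=1):
    """Spatial size of a convolution output with no padding."""
    return (input_size - kernel_size) // stride + 1


def conv2d_relu(image, kernels, stride=1):
    """Naive no-padding convolution (cross-correlation, as in most deep
    learning libraries) of a 2D image with a list of 2D kernels, then ReLU.

    The single input channel is left implicit, so a 9x9x1 kernel is a 9x9 list.
    Returns feature maps indexed as [kernel][row][col].
    """
    k = len(kernels[0])
    out_h = conv_output_size(len(image), k, stride)
    out_w = conv_output_size(len(image[0]), k, stride)
    maps = []
    for kernel in kernels:
        fmap = [[0.0] * out_w for _ in range(out_h)]
        for r in range(out_h):
            for c in range(out_w):
                total = 0.0
                for i in range(k):
                    for j in range(k):
                        total += image[r * stride + i][c * stride + j] * kernel[i][j]
                fmap[r][c] = max(0.0, total)  # ReLU
        maps.append(fmap)
    return maps


def capsule_lengths(digit_caps):
    """Euclidean length of each capsule output vector."""
    return [math.sqrt(sum(x * x for x in v)) for v in digit_caps]


rng = random.Random(0)

# Random stand-in for a 28x28 grayscale MNIST digit.
image = [[rng.random() for _ in range(28)] for _ in range(28)]

# CapsNet uses 256 kernels here; 4 are enough to show the shapes.
kernels = [[[rng.uniform(-0.1, 0.1) for _ in range(9)] for _ in range(9)]
           for _ in range(4)]

feature_maps = conv2d_relu(image, kernels, stride=1)
print(len(feature_maps), len(feature_maps[0]), len(feature_maps[0][0]))  # 4 20 20

# Random stand-in for the DigitCaps output: 10 capsules, 16 dimensions each.
digit_caps = [[rng.uniform(-0.3, 0.3) for _ in range(16)] for _ in range(10)]
lengths = capsule_lengths(digit_caps)  # the 10-dimensional prediction vector
predicted_digit = max(range(10), key=lambda d: lengths[d])
```

With all 256 kernels the first layer would produce 256 feature maps of size 20 by 20, since (28 - 9) / 1 + 1 = 20. In practice you would use a tensor library for this rather than nested loops; the loops are only there to keep every step visible.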
