A Experiment Details

Neural Information Processing Systems 

Given the differences between the training procedures of the model presented in Section 6.2 and those of the models in Section 6.3, we describe the latter here. All models in Section 6.3 were trained with stochastic gradient descent on mini-batches. All models presented in this paper use the same 3-layer MLP to parameterize the encoders and decoders. The latent space is divided into 18 capsules, each of 18 dimensions (324 dimensions in total). The decoder layers then have output sizes (450, 675, 4096). All topographic models (TVAE and BubbleVAE) in Section 6.3 employ a global topographic organization. These values were chosen to be sufficiently large to achieve a notably lower equivariance error than the VAE baseline, and thus to demonstrate the impact of topographic organization without temporal coherence. The results of all models are shown in Section B below.
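The decoder architecture described above (a 324-dimensional capsule latent mapped through layers of output sizes 450, 675, 4096) can be sketched as follows. This is a minimal illustration, assuming a ReLU nonlinearity on the hidden layers and a plain NumPy parameterization; the actual activation functions and initialization used by the authors are not specified in this appendix.

```python
import numpy as np

# Sketch of the 3-layer MLP decoder from the text: the latent space has
# 18 capsules of 18 dimensions each (18 * 18 = 324 total), and the decoder
# layers have output sizes (450, 675, 4096). The ReLU nonlinearity and
# random initialization below are illustrative assumptions.

N_CAPSULES, CAPSULE_DIM = 18, 18
LATENT_DIM = N_CAPSULES * CAPSULE_DIM      # 324
LAYER_SIZES = (450, 675, 4096)             # decoder output sizes from the text

rng = np.random.default_rng(0)

def init_mlp(in_dim, layer_sizes):
    """Random weight/bias pairs for each linear layer."""
    params, prev = [], in_dim
    for out in layer_sizes:
        params.append((rng.standard_normal((prev, out)) * 0.01, np.zeros(out)))
        prev = out
    return params

def decode(z, params):
    """Forward pass: ReLU on hidden layers, linear final layer."""
    h = z
    for i, (W, b) in enumerate(params):
        h = h @ W + b
        if i < len(params) - 1:
            h = np.maximum(h, 0.0)
    return h

params = init_mlp(LATENT_DIM, LAYER_SIZES)
z = rng.standard_normal((8, LATENT_DIM))   # a batch of 8 capsule latents
x_hat = decode(z, params)
print(x_hat.shape)                          # (8, 4096)
```

The final output size of 4096 would correspond, for example, to a flattened 64x64 single-channel image, though the target data shape is not stated in this section.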
