Reviews: Global Gated Mixture of Second-order Pooling for Improving Deep Convolutional Neural Networks

Neural Information Processing Systems 

The idea behind a Sparse Gated Mixture (GM) of Experts model has already been proposed in [1]. The main novelty of this paper lies in how the sparse mixture model is applied, namely modifying the second-order pooling layer within a deep CNN model to have a bank of candidate experts. GM works as follows: given an input sample, the sparsity-constrained gating module adaptively selects the Top-K experts from N candidates according to assigned weights and outputs the weighted sum of the K selected experts' outputs (see the first sketch below).

Another contribution of this paper is a parameterized architecture for pooling: for the choice of expert, the authors use a modified, learnable version of matrix square-root normalized second-order pooling (SR-SOP) [2] (see the second sketch below). The experiments first show that SR-SOP is advantageous over regular SOP; both are prior work, but the comparison usefully justifies the choice of SR-SOP in the first place.
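To make the gating mechanism concrete, here is a minimal PyTorch sketch of sparse Top-K gating over a bank of experts. The class name `SparseGatedMixture`, the linear gate, and the linear stand-in experts are illustrative assumptions, not the authors' implementation; in the paper the experts are pooling modules.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseGatedMixture(nn.Module):
    """Illustrative sparse gated mixture: score N experts, keep the
    Top-K, and return the weighted sum of the selected outputs."""

    def __init__(self, in_dim, out_dim, num_experts=16, top_k=2):
        super().__init__()
        self.top_k = top_k
        # Hypothetical experts: linear maps stand in for the paper's
        # candidate pooling modules.
        self.experts = nn.ModuleList(
            [nn.Linear(in_dim, out_dim) for _ in range(num_experts)]
        )
        self.gate = nn.Linear(in_dim, num_experts)  # gating module

    def forward(self, x):                             # x: (batch, in_dim)
        scores = self.gate(x)                         # (batch, N)
        top_vals, top_idx = scores.topk(self.top_k, dim=-1)
        weights = F.softmax(top_vals, dim=-1)         # renormalize over the K kept
        # For clarity this sketch evaluates every expert; the point of
        # the Top-K sparsity is that a real implementation runs only
        # the K selected experts per sample.
        all_out = torch.stack([e(x) for e in self.experts], dim=1)  # (batch, N, out)
        idx = top_idx.unsqueeze(-1).expand(-1, -1, all_out.size(-1))
        picked = all_out.gather(1, idx)               # (batch, K, out)
        return (weights.unsqueeze(-1) * picked).sum(dim=1)
```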
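Likewise, a rough sketch of the SR-SOP expert itself, assuming the common eigendecomposition-based realization of the matrix square root; the paper's learnable, parameterized variant is not captured here.

```python
import torch

def sr_sop(features, eps=1e-5):
    """Matrix square-root normalized second-order pooling (sketch).

    features: (batch, channels, height, width) conv feature map.
    Returns a (batch, channels, channels) normalized covariance.
    """
    b, c, h, w = features.shape
    x = features.reshape(b, c, h * w)            # spatial positions as samples
    x = x - x.mean(dim=2, keepdim=True)          # center the features
    cov = x @ x.transpose(1, 2) / (h * w - 1)    # second-order (covariance) pooling
    # Matrix square root via eigendecomposition: Q diag(sqrt(l)) Q^T,
    # with eigenvalues clamped for numerical stability.
    evals, evecs = torch.linalg.eigh(cov)
    root = evals.clamp_min(eps).sqrt()
    return evecs @ torch.diag_embed(root) @ evecs.transpose(1, 2)

# Example: pool a batch of 8 feature maps with 256 channels.
pooled = sr_sop(torch.randn(8, 256, 14, 14))     # -> (8, 256, 256)
```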