Reviews: Incorporating Side Information by Adaptive Convolution

Neural Information Processing Systems 

Summary of the Paper: This work proposes to use adaptive convolutions (also called 'cross convolutions') to incorporate side information (e.g., camera angle) into CNN architectures for vision tasks (e.g., crowd counting). The filter weights in each adaptive convolution layer are predicted by a separate neural network (one network for each set of filter weights), which is a multi-layer perceptron. This network, referred to as a 'Filter Manifold Network', takes the auxiliary side information as input and predicts the filter weights. Experiments on three vision tasks (crowd counting, digit recognition, and image deconvolution) indicate the potential of the proposed technique for incorporating auxiliary information. In addition, the paper contributes a new crowd counting dataset captured at different camera heights and angles.
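To make the mechanism in the summary concrete, the following is a minimal sketch of an adaptive convolution, not the paper's implementation. It assumes a single-channel image, one 3x3 filter, a two-layer MLP with tanh activation, and random untrained weights. The names `make_fmn`, `predict_filter`, and `adaptive_conv2d`, as well as the two-element side-information vectors (standing in for normalized camera angle and height), are hypothetical and chosen only for illustration.

```python
import math
import random


def make_fmn(side_dim, hidden_dim, kernel_size, seed=0):
    """Randomly initialise a small MLP standing in for a filter manifold network.

    Assumption: two layers with untrained Gaussian weights; the paper's
    actual layer sizes and training procedure are not reproduced here.
    """
    rng = random.Random(seed)
    n_out = kernel_size * kernel_size
    w1 = [[rng.gauss(0, 1) for _ in range(side_dim)] for _ in range(hidden_dim)]
    b1 = [0.0] * hidden_dim
    w2 = [[rng.gauss(0, 1) for _ in range(hidden_dim)] for _ in range(n_out)]
    b2 = [0.0] * n_out
    return w1, b1, w2, b2


def predict_filter(fmn, side_info, kernel_size):
    """Map a side-information vector to a kernel_size x kernel_size filter."""
    w1, b1, w2, b2 = fmn
    hidden = [
        math.tanh(sum(w * s for w, s in zip(row, side_info)) + b)
        for row, b in zip(w1, b1)
    ]
    flat = [sum(w * h for w, h in zip(row, hidden)) + b for row, b in zip(w2, b2)]
    # Reshape the flat output vector into a square filter.
    return [flat[i * kernel_size:(i + 1) * kernel_size] for i in range(kernel_size)]


def adaptive_conv2d(image, kernel):
    """'Valid' cross-correlation of a single-channel image with the given filter."""
    k = len(kernel)
    height, width = len(image), len(image[0])
    return [
        [
            sum(kernel[a][b] * image[i + a][j + b] for a in range(k) for b in range(k))
            for j in range(width - k + 1)
        ]
        for i in range(height - k + 1)
    ]


# Usage: the same image is filtered differently depending on the side information.
fmn = make_fmn(side_dim=2, hidden_dim=8, kernel_size=3)
image = [[float((r + c) % 5) for c in range(6)] for r in range(6)]
filter_a = predict_filter(fmn, [0.1, 0.2], 3)  # hypothetical low angle, low height
filter_b = predict_filter(fmn, [0.9, 0.8], 3)  # hypothetical high angle, high height
out_a = adaptive_conv2d(image, filter_a)
out_b = adaptive_conv2d(image, filter_b)
```

The point of the sketch is that the convolution itself has no fixed weights: the only learnable parameters live in the MLP, so the filter varies smoothly with the side information rather than being shared across all camera settings.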