 group convolution layer


GIFT: Learning Transformation-Invariant Dense Visual Descriptors via Group CNNs

Neural Information Processing Systems

To achieve viewpoint invariance, traditional methods [36, 37] use patch detectors [33, 39] to extract transformation-covariant local patches, which are then normalized for transformation invariance. Invariant descriptors can then be extracted from the detected local patches. However, a typical image may have very few pixels for which viewpoint-covariant patches can be reliably detected [22].



Reviews: Group Equivariant Capsule Networks

Neural Information Processing Systems

The authors present a modification of capsule networks that guarantees equivariance to the SO(2) group of transformations. Since restricting the pose matrices of a capsule network to operate inside the group degrades the performance of the network, they also suggest a method for combining group convolutional layers with capsule layers. Although the theoretical aspect of this work is strong, the experimental evaluations are quite limited, without a proper comparison to baselines and other works. Pros: The paper is well written and conveys the idea clearly. Capsule networks were initially proposed with the promise of better generalization in terms of affine transformations and viewpoint invariance.
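
As a rough illustration of what equivariance means for a group convolutional layer, the sketch below checks the property numerically. It is not the reviewed paper's method: it assumes the discrete rotation group C4 (90-degree rotations) as a stand-in for SO(2), and the helper names `correlate2d_valid` and `c4_lifting_correlation` are hypothetical. Rotating the input should rotate each output map and cyclically shift the maps along the group axis.

```python
import numpy as np


def correlate2d_valid(x, k):
    """Plain 'valid' cross-correlation of a 2D image x with a 2D filter k."""
    kh, kw = k.shape
    h, w = x.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out


def c4_lifting_correlation(x, k):
    """Correlate x with the four 90-degree rotations of a square filter k.

    Returns an array of shape (4, H', W'); axis 0 indexes the rotation.
    """
    return np.stack([correlate2d_valid(x, np.rot90(k, r)) for r in range(4)])


rng = np.random.default_rng(0)
x = rng.normal(size=(8, 8))   # toy input image (assumption: random data)
k = rng.normal(size=(3, 3))   # toy square filter

# Left-hand side: rotate the input first, then apply the layer.
lhs = c4_lifting_correlation(np.rot90(x), k)

# Right-hand side: apply the layer, rotate every map spatially,
# then shift the maps by one step along the group axis.
rhs = np.roll(np.rot90(c4_lifting_correlation(x, k), axes=(1, 2)), 1, axis=0)

equivariant = np.allclose(lhs, rhs)
```

Here `equivariant` records whether the two sides agree. Pooling over the group axis (for example, a maximum over axis 0) would turn this equivariance into invariance up to the spatial rotation of the map.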


Neural Forest Learning

arXiv.org Machine Learning

We propose Neural Forest Learning (NFL), a novel deep-learning-based, random-forest-like method. In contrast to previous forest methods, NFL enjoys the benefits of end-to-end, data-driven representation learning, as well as pervasive support from deep learning software and hardware platforms, hence achieving faster inference speed and higher accuracy than previous forest methods. Furthermore, NFL learns non-linear feature representations in CNNs more efficiently than previous higher-order pooling methods, producing good results with a negligible increase in parameters, floating point operations (FLOPs) and real running time. We achieve superior performance on 7 machine learning datasets when compared to random forests and GBDTs. On the fine-grained benchmarks CUB-200-2011, FGVC-aircraft and Stanford Cars, we achieve over 5.7%, 6.9% and 7.8% gains for VGG-16, respectively. Moreover, NFL can converge in far fewer epochs, further accelerating network training. On the large-scale ImageNet ILSVRC-12 validation set, integrating NFL into ResNet-18 achieves top-1/top-5 errors of 28.32%/9.77%, which outperforms ResNet-18 by 1.92%/1.15% with negligible extra cost, and the improvement is consistent across various architectures.