hypersphere
Hyperspherical Prototype Networks
Pascal Mettes, Elise van der Pol, Cees Snoek
This paper introduces hyperspherical prototype networks, which unify classification and regression with prototypes on hyperspherical output spaces. For classification, a common approach is to define prototypes as the mean output vector over training examples per class. Here, we propose to use hyperspheres as output spaces, with class prototypes defined a priori with large margin separation. We position prototypes through data-independent optimization, with an extension to incorporate priors from class semantics. By doing so, we do not require any prototype updating, we can handle any training size, and the output dimensionality is no longer constrained to the number of classes. Furthermore, we generalize to regression, by optimizing outputs as an interpolation between two prototypes on the hypersphere. Since both tasks are now defined by the same loss function, they can be jointly trained for multi-task problems. Experimentally, we show the benefit of hyperspherical prototype networks for classification, regression, and their combination over other prototype methods, softmax cross-entropy, and mean squared error approaches.
02a32ad2669e6fe298e607fe7cc0e1a0-AuthorFeedback.pdf
We thank all the reviewers (R1,R2,R3) for their feedback and suggestions.1 Table A: Multi-task comparison across task weights. We have per-2 formed loss balancing with five different weights t3 in the multi-task loss Lm = t Lc +(1 t) Lr for4 the classification and regression losses. The results5 on OmniArt are reported in Table A. Our proposal6 is robust to the weight value, tuning the task weight7 is not vital. We obtain a moderate gain for both clas-8 sification and regression with a weight of t = 0.25.9 For the multi-task baseline, emphasizing regression10 reduces the regression error, as the gradient magnitude of the regression loss is much lower than the one for the11 classification loss.
Deep Hyperspherical Learning
Convolution as inner product has been the founding basis of convolutional neural networks (CNNs) and the key to end-to-end visual representation learning. Benefiting from deeper architectures, recent CNNs have demonstrated increasingly strong representation abilities. Despite such improvement, the increased depth and larger parameter space have also led to challenges in properly training a network. In light of such challenges, we propose hyperspherical convolution (SphereConv), a novel learning framework that gives angular representations on hyperspheres. We introduce SphereNet, deep hyperspherical convolution networks that are distinct from conventional inner product based convolutional networks.