agrlearn
Aggregated Learning: A Vector-Quantization Approach to Learning Neural Network Classifiers
Soflaei, Masoumeh, Guo, Hongyu, Al-Bashabsheh, Ali, Mao, Yongyi, Zhang, Richong
We consider the problem of learning a neural network classifier. Under the information bottleneck (IB) principle, we associate with this classification problem a representation learning problem, which we call "IB learning". We show that IB learning is, in fact, equivalent to a special class of the quantization problem. The classical results in rate-distortion theory then suggest that IB learning can benefit from a "vector quantization" approach, namely, simultaneously learning the representations of multiple input objects. Such an approach, assisted with some variational techniques, results in a novel learning framework, "Aggregated Learning", for classification with neural network models. In this framework, several objects are jointly classified by a single neural network. The effectiveness of this framework is verified through extensive experiments on standard image recognition and text classification tasks.
Introduction
The revival of neural networks in the paradigm of deep learning (LeCun, Bengio, and Hinton 2015) has stimulated intense interest in understanding the working of deep neural networks, e.g., (Shwartz-Ziv and Tishby 2017; Zhang et al. 2017). Among various efforts, an information-theoretic approach, information bottleneck (IB) (Tishby, Pereira, and Bialek 1999), stands out as a fundamental tool to theorize the learning of deep neural networks (Shwartz-Ziv and Tishby 2017; Saxe et al. 2018; Dai et al. 2018). Under the IB principle, the core of learning a neural network classifier is to find a representation T of the input example X that contains as much information as possible about the label Y and as little information as possible about X.
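For reference, the IB trade-off is commonly written as a Lagrangian over the stochastic encoder p(t|x), following the standard formulation of Tishby, Pereira, and Bialek (1999). The multiplier β and the notation I(·;·) for mutual information are the usual conventions and are assumed here; they are not taken from the text above.

```latex
\min_{p(t \mid x)} \; I(X;T) \;-\; \beta\, I(T;Y),
\qquad \beta > 0,
\qquad \text{subject to the Markov chain } Y \to X \to T .
```

A larger β puts more weight on keeping information about the label Y in T, while a smaller β favors compressing X more aggressively.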
Aggregated Learning: A Vector Quantization Approach to Learning with Neural Networks
Guo, Hongyu, Mao, Yongyi, Zhang, Richong
We establish an equivalence between information bottleneck (IB) learning and an unconventional quantization problem, "IB quantization". Under this equivalence, standard neural network models correspond to scalar IB quantizers. We prove a coding theorem for IB quantization, which implies that scalar IB quantizers are in general inferior to vector IB quantizers. This inspires us to develop a learning framework for neural networks, AgrLearn, that corresponds to vector IB quantizers. We experimentally verify that AgrLearn, applied to some current state-of-the-art deep network models, improves upon them while requiring less training data. With a heuristic smoothing, AgrLearn further improves its performance, yielding a new state of the art in image classification on CIFAR-10.
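The contrast between scalar and vector quantizers can be made concrete with a small sketch of how several objects might be presented to, and scored by, a single network. This is only one plausible reading of the abstracts, not the authors' implementation: the helper names `aggregate` and `aggregated_cross_entropy`, the choice to concatenate feature vectors, and the choice to sum one cross-entropy term per object are all assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple


def aggregate(
    examples: Sequence[Sequence[float]], labels: Sequence[int], n: int
) -> Tuple[List[List[float]], List[Tuple[int, ...]]]:
    """Group consecutive examples into n-tuples (hypothetical helper).

    Each group's feature vectors are concatenated into one input, and the
    group's labels are kept as a tuple. Leftover examples that do not fill
    a complete group are dropped.
    """
    usable = len(examples) - len(examples) % n
    agg_x: List[List[float]] = []
    agg_y: List[Tuple[int, ...]] = []
    for start in range(0, usable, n):
        group = examples[start:start + n]
        agg_x.append([value for x in group for value in x])
        agg_y.append(tuple(labels[start:start + n]))
    return agg_x, agg_y


def log_softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable log-softmax over one head's logits."""
    peak = max(logits)
    log_z = peak + math.log(sum(math.exp(v - peak) for v in logits))
    return [v - log_z for v in logits]


def aggregated_cross_entropy(
    logits: Sequence[float], labels: Sequence[int], num_classes: int
) -> float:
    """Sum of per-object cross-entropies (assumed loss, for illustration).

    `logits` is the flat output of a single network for one aggregated
    input: n consecutive heads of `num_classes` logits, one head per object.
    """
    n = len(labels)
    if len(logits) != n * num_classes:
        raise ValueError("expected n * num_classes logits")
    loss = 0.0
    for i, y in enumerate(labels):
        head = logits[i * num_classes:(i + 1) * num_classes]
        loss -= log_softmax(head)[y]
    return loss
```

With n = 1 this reduces to ordinary per-example classification, which corresponds to the scalar case described above; larger n lets one network see and label several objects at once.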