Angular Quantization-based Binary Codes for Fast Similarity Search

Gong, Yunchao; Kumar, Sanjiv; Verma, Vishal; Lazebnik, Svetlana

Neural Information Processing Systems 

This paper focuses on the problem of learning binary embeddings for efficient retrieval of high-dimensional non-negative data. Such data typically arises in a large number of vision and text applications where counts or frequencies are used as features, and cosine distance is commonly used as the measure of dissimilarity between such vectors. In this work, we introduce a novel spherical quantization scheme to generate binary embeddings of such data and analyze its properties. The number of quantization landmarks in this scheme grows exponentially with data dimensionality, resulting in low-distortion quantization.
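
To make the idea of angular quantization concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes (the abstract does not say so) that the landmarks are the 2^d - 1 nonzero vertices of the binary hypercube {0,1}^d, which would give a landmark count that grows exponentially with the dimensionality d. Under that assumption, the landmark with the highest cosine similarity to a non-negative vector can be found without enumerating all vertices: a vertex with k ones scores (sum of the selected entries) / sqrt(k), so for each k the best choice is the k largest entries. The function names `nearest_binary_landmark` and `cosine_similarity` are hypothetical and chosen for illustration only.

```python
import numpy as np


def nearest_binary_landmark(x):
    """Return the nonzero binary vector b in {0,1}^d with maximal cosine to x.

    Assumes x is a 1-D non-negative vector. For a vertex with k ones, the
    cosine with x is proportional to (sum of selected entries) / sqrt(k),
    so only the top-k entries need to be considered for each k.
    """
    x = np.asarray(x, dtype=float)
    if x.ndim != 1 or x.size == 0:
        raise ValueError("x must be a non-empty 1-D vector")
    if np.any(x < 0):
        raise ValueError("x must be non-negative")

    # Indices of x sorted from largest to smallest entry.
    order = np.argsort(-x, kind="stable")
    # prefix[k-1] is the sum of the k largest entries.
    prefix = np.cumsum(x[order])
    k_values = np.arange(1, x.size + 1)
    # Score of the best vertex with exactly k ones (up to the factor 1/||x||).
    scores = prefix / np.sqrt(k_values)
    best_k = int(np.argmax(scores)) + 1

    b = np.zeros(x.size, dtype=np.uint8)
    b[order[:best_k]] = 1
    return b


def cosine_similarity(u, v):
    """Cosine similarity between two vectors with nonzero norm."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))


if __name__ == "__main__":
    x = np.array([3.0, 1.0, 0.0, 2.0])
    b = nearest_binary_landmark(x)
    print(b, cosine_similarity(x, b))
```

For the vector (3, 1, 0, 2), the sketch selects the two largest entries and returns the code 1 0 0 1. Sorting dominates the cost, so the search takes O(d log d) time even though the assumed landmark set has 2^d - 1 members.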