Norm-Ranging LSH for Maximum Inner Product Search
Yan, Xiao, Li, Jinfeng, Dai, Xinyan, Chen, Hongzhi, Cheng, James
–Neural Information Processing Systems
Neyshabur and Srebro proposed SIMPLE-LSH, which is the state-of-the-art hashing based algorithm for maximum inner product search (MIPS). We found that the performance of SIMPLE-LSH, in both theory and practice, suffers from long tails in the 2-norm distribution of real datasets. We propose NORM-RANGING LSH, which addresses the excessive normalization problem caused by long tails by partitioning a dataset into sub-datasets and building a hash index for each sub-dataset independently. We prove that NORM-RANGING LSH achieves lower query time complexity than SIMPLE-LSH under mild conditions. We also show that the idea of dataset partitioning can improve another hashing based MIPS algorithm. Experiments show that NORM-RANGING LSH probes much less items than SIMPLE-LSH at the same recall, thus significantly benefiting MIPS based applications.
Neural Information Processing Systems
Dec-31-2018
- Country:
- Asia (0.14)
- North America > Canada (0.14)
- Technology:
- Information Technology
- Artificial Intelligence
- Machine Learning (0.68)
- Natural Language > Information Retrieval (0.46)
- Representation & Reasoning > Search (0.46)
- Data Science (0.68)
- Information Management > Search (0.71)
- Artificial Intelligence
- Information Technology