mip problem
Non-metric Similarity Graphs for Maximum Inner Product Search
In this paper we address the problem of Maximum Inner Product Search (MIPS) that is currently the computational bottleneck in a large number of machine learning applications. While being similar to the nearest neighbor search (NNS), the MIPS problem was shown to be more challenging, as the inner product is not a proper metric function. We propose to solve the MIPS problem with the usage of similarity graphs, i.e., graphs where each vertex is connected to the vertices that are the most similar in terms of some similarity function. Originally, the framework of similarity graphs was proposed for metric spaces and in this paper we naturally extend it to the non-metric MIPS scenario. We demonstrate that, unlike existing approaches, similarity graphs do not require any data transformation to reduce MIPS to the NNS problem and should be used for the original data. Moreover, we explain why such a reduction is detrimental for similarity graphs. By an extensive comparison to the existing approaches, we show that the proposed method is a game-changer in terms of the runtime/accuracy trade-off for the MIPS problem.
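The abstract above describes similarity graphs only at a high level. The following is a minimal sketch of the general idea, not the authors' implementation: each vertex is linked to the vertices with the largest inner product on the original, untransformed data, and a query is answered by a greedy walk. The function names `build_ip_graph` and `greedy_mips`, the brute-force graph construction, and the single fixed entry vertex are all assumptions made for illustration.

```python
import numpy as np

def build_ip_graph(data, degree):
    """Link each vertex to the `degree` other vertices with the largest inner product.

    Brute-force construction, used here only for illustration (hypothetical helper).
    """
    scores = data @ data.T
    np.fill_diagonal(scores, -np.inf)  # exclude self-loops
    return np.argsort(-scores, axis=1)[:, :degree]

def greedy_mips(data, graph, query, start=0):
    """Greedy walk: move to the best-scoring neighbor until no neighbor improves.

    Returns a local maximum of the inner product over the graph, which is
    an approximate answer and not guaranteed to be the exact MIPS result.
    """
    current = start
    best = float(data[current] @ query)
    while True:
        neighbors = graph[current]
        scores = data[neighbors] @ query
        j = int(np.argmax(scores))
        if scores[j] <= best:  # no neighbor is strictly better: stop
            return current, best
        current, best = int(neighbors[j]), float(scores[j])
```

The walk terminates because the score strictly increases at every step. Practical graph indexes typically keep a candidate queue and use better entry points, so this single-path walk should be read as the simplest possible variant.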
- North America > United States > Texas > Travis County > Austin (0.04)
- North America > United States > California > Yolo County > Davis (0.04)
- North America > United States > California > Los Angeles County > Long Beach (0.04)
- (2 more...)
- Oceania > Australia > New South Wales > Sydney (0.04)
- North America > United States > California > Los Angeles County > Long Beach (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- (3 more...)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Artificial Intelligence > Natural Language (0.94)
- Information Technology > Information Management > Search (0.66)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.46)
data points changes the norms of all vectors, while the norms are very important quantities in the
Shifting the data points is a good idea, but it might cause problems. In our current work, we focus on theory and datasets satisfying Assumption 1. We will rephrase the sentence as follows: "In these scenarios, In the present work, we aim to improve the efficiency of the MIPS problem from an algorithmic perspective. A GPU can process multiple queries in parallel. Algorithm 1) and the indices of visited vertices can be arbitrarily large. We will add extra discussion in the paper and leave the details for future work. Thank you for finding our work interesting. It has been improved and applied to various search tasks. The goal of our paper is to fill this gap. Thank you so much for the highly encouraging comments. We address your concern about the normal assumption in our response to Reviewer 1. The normal assumption is indeed not necessary. We appreciate your detailed and thoughtful summary of our work. We will also change "M
Asymmetric LSH (ALSH) for Sublinear Time Maximum Inner Product Search (MIPS)
Anshumali Shrivastava, Ping Li
We present the first provably sublinear time hashing algorithm for approximate Maximum Inner Product Search (MIPS). Searching with (un-normalized) inner product as the underlying similarity measure is a known difficult problem and finding hashing schemes for MIPS was considered hard. While the existing Locality Sensitive Hashing (LSH) framework is insufficient for solving MIPS, in this paper we extend the LSH framework to allow asymmetric hashing schemes. Our proposal is based on a key observation that the problem of finding maximum inner products, after independent asymmetric transformations, can be converted into the problem of approximate near neighbor search in classical settings. This key observation makes efficient sublinear hashing scheme for MIPS possible. Under the extended asymmetric LSH (ALSH) framework, this paper provides an example of explicit construction of provably fast hashing scheme for MIPS. Our proposed algorithm is simple and easy to implement.
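The abstract does not spell out the asymmetric transformations. The sketch below assumes the commonly cited form of the construction: data vectors are rescaled so that every norm is at most U < 1 and extended with powers of their norm, while unit-norm queries are extended with constants 1/2. The helper names and the values U = 0.83 and m = 3 are illustrative assumptions, and the hashing step itself is omitted.

```python
import numpy as np

def scale_data(data, U=0.83):
    """Rescale all data vectors by one common factor so the largest norm equals U < 1."""
    return data * (U / np.linalg.norm(data, axis=1).max())

def preprocess(x, m=3):
    """P(x): append ||x||^2, ||x||^4, ..., ||x||^(2^m) to a data vector."""
    norm = np.linalg.norm(x)
    return np.concatenate([x, [norm ** (2 ** i) for i in range(1, m + 1)]])

def transform_query(q, m=3):
    """Q(q): append m copies of 1/2 to a unit-norm query."""
    return np.concatenate([q, np.full(m, 0.5)])

def squared_distance(q, x, m=3):
    """Squared Euclidean distance between the transformed query and data vector."""
    return float(np.sum((transform_query(q, m) - preprocess(x, m)) ** 2))
```

Under these assumptions, for a unit-norm query the squared distance equals 1 + m/4 - 2 q·x + ||x||^(2^(m+1)). Because ||x|| ≤ U < 1, the last term shrinks rapidly as m grows, so minimizing the distance approximately maximizes the inner product.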
- North America > United States > Texas > Dallas County > Dallas (0.04)
- North America > United States > New York > Tompkins County > Ithaca (0.04)
- North America > United States > New York > Kings County > New York City (0.04)
- (4 more...)
A Greedy Approach for Budgeted Maximum Inner Product Search
Hsiang-Fu Yu, Cho-Jui Hsieh, Qi Lei, Inderjit S. Dhillon
Maximum Inner Product Search (MIPS) is an important task in many machine learning applications such as the prediction phase of low-rank matrix factorization models and deep learning models. Recently, there has been substantial research on how to perform MIPS in sub-linear time, but most of the existing work does not have the flexibility to control the trade-off between search efficiency and search quality. In this paper, we study the important problem of MIPS with a computational budget. By carefully studying the problem structure of MIPS, we develop a novel Greedy-MIPS algorithm, which can handle budgeted MIPS by design. While simple and intuitive, Greedy-MIPS yields surprisingly superior performance compared to state-of-the-art approaches. As a specific example, on a candidate set containing half a million vectors of dimension 200, Greedy-MIPS runs 200x faster than the naive approach while yielding search results with the top-5 precision greater than 75%.
- North America > United States > Texas > Travis County > Austin (0.04)
- North America > United States > California > Yolo County > Davis (0.04)
- North America > United States > California > Los Angeles County > Long Beach (0.04)
- (2 more...)
Branch Ranking for Efficient Mixed-Integer Programming via Offline Ranking-based Policy Learning
Zeren Huang, Wenhao Chen, Weinan Zhang, Chuhan Shi, Furui Liu, Hui-Ling Zhen, Mingxuan Yuan, Jianye Hao, Yong Yu, Jun Wang
Deriving a good variable selection strategy in branch-and-bound is essential for the efficiency of modern mixed-integer programming (MIP) solvers. With MIP branching data collected during the previous solution process, learning-to-branch methods have recently become superior to heuristics. As branch-and-bound is naturally a sequential decision-making task, one should learn to optimize the utility of the whole MIP solving process instead of being myopic on each step. In this work, we formulate learning to branch as an offline reinforcement learning (RL) problem, and propose a long-sighted hybrid search scheme to construct the offline MIP dataset, which values the long-term utilities of branching decisions. During the policy training phase, we deploy a ranking-based reward assignment scheme to distinguish the promising samples from the long-term or short-term view, and train the branching model named Branch Ranking via offline policy learning. Experiments on synthetic MIP benchmarks and real-world tasks demonstrate that Branch Ranking is more efficient and robust, and can better generalize to large-scale MIP instances compared to the widely used heuristics and state-of-the-art learning-based branching models.