Information Retrieval
Multi-Modality Transformer for E-Commerce: Inferring User Purchase Intention to Bridge the Query-Product Gap
Mallapragada, Srivatsa, Xie, Ying, Chawan, Varsha Rani, Hailat, Zeyad, Wang, Yuanbo
E-commerce click-stream data and product catalogs offer critical user behavior insights and product knowledge. This paper propose a multi-modal transformer termed as PINCER, that leverages the above data sources to transform initial user queries into pseudo-product representations. By tapping into these external data sources, our model can infer users' potential purchase intent from their limited queries and capture query relevant product features. We demonstrate our model's superior performance over state-of-the-art alternatives on e-commerce online retrieval in both controlled and real-world experiments. Our ablation studies confirm that the proposed transformer architecture and integrated learning strategies enable the mining of key data sources to infer purchase intent, extract product features, and enhance the transformation pipeline from queries to more accurate pseudo-product representations.
Reviews: An algorithm for L1 nearest neighbor search via monotonic embedding
While the fundamental technical contribution is interesting and should be published, the contextualization is very far from ideal. Here are some points that the authors should address: - besides the l_1 methods based on Cauchy distribution, there is also LSH methods directly for the l_1 space: just impose a randomly shifted grid (see, e.g., the description in the CACM article by Andoni & Indyk). In particular, if we want to solve a c-approximation under l_1, this would translate into requiring a \sqrt{c} approximation for l_2 after the embedding. I.e., the resulting problem in l_2 is harder! Luckily, in l_2, LSH achieves runtime n {1/c 2} (cf, l_1 achieves runtime n {1/c}); however the corresponding l_2 algorithns are much harder (compare the algorithms from the reference [10] versus [5]).
Thrust: Adaptively Propels Large Language Models with External Knowledge
Although large-scale pre-trained language models (PTLMs) are shown to encode rich knowledge in their model parameters, the inherent knowledge in PTLMs can be opaque or static, making external knowledge necessary. However, the existing information retrieval techniques could be costly and may even introduce noisy and sometimes misleading knowledge. To address these challenges, we propose the instance-level adaptive propulsion of external knowledge (IAPEK), where we only conduct the retrieval when necessary. To achieve this goal, we propose to model whether a PTLM contains enough knowledge to solve an instance with a novel metric, Thrust, which leverages the representation distribution of a small amount of seen instances. Extensive experiments demonstrate that Thrust is a good measurement of models' instance-level knowledgeability.
Worst-case Performance of Popular Approximate Nearest Neighbor Search Implementations: Guarantees and Limitations
Graph-based approaches to nearest neighbor search are popular and powerful tools for handling large datasets in practice, but they have limited theoretical guarantees. We study the worst-case performance of recent graph-based approximate nearest neighbor search algorithms, such as HNSW, NSG and DiskANN. For DiskANN, we show that its "slow preprocessing'' version provably supports approximate nearest neighbor search query with constant approximation ratio and poly-logarithmic query time, on data sets with bounded "intrinsic'' dimension. For the other data structure variants studied, including DiskANN with "fast preprocessing'', HNSW and NSG, we present a family of instances on which the empirical query time required to achieve a "reasonable'' accuracy is linear in instance size. For example, for DiskANN, we show that the query procedure can take at least 0.1 n steps on instances of size n before it encounters any of the 5 nearest neighbors of the query.
Model-enhanced Vector Index
Embedding-based retrieval methods construct vector indices to search for document representations that are most similar to the query representations. They are widely used in document retrieval due to low latency and decent recall performance. Recent research indicates that deep retrieval solutions offer better model quality, but are hindered by unacceptable serving latency and the inability to support document updates. In this paper, we aim to enhance the vector index with end-to-end deep generative models, leveraging the differentiable advantages of deep retrieval models while maintaining desirable serving efficiency. We propose Model-enhanced Vector Index (MEVI), a differentiable model-enhanced index empowered by a twin-tower representation model.
The Query Complexity of Cake Cutting
We consider the query complexity of cake cutting in the standard query model and give lower and upper bounds for computing approximately envy-free, perfect, and equitable allocations with the minimum number of cuts. The lower bounds are tight for computing contiguous envy-free allocations among n 3 players and for computing perfect and equitable allocations with minimum number of cuts between n 2 players. For \epsilon -envy-free allocations with contiguous pieces, we also give an upper bound of O(n/\epsilon) and lower bound of \Omega(\log(1/\epsilon)) queries for any number n \geq 3 of players.We also formalize moving knife procedures and show that a large subclass of this family, which captures all the known moving knife procedures, can be simulated efficiently with arbitrarily small error in the Robertson-Webb query model.
Optimal Query Complexities for Dynamic Trace Estimation
We consider the problem of minimizing the number of matrix-vector queries needed for accurate trace estimation in the dynamic setting where our underlying matrix is changing slowly, such as during an optimization process. Specifically, for any m matrices \mathbf{A}_1,...,\mathbf{A}_m with consecutive differences bounded in Schatten- 1 norm by \alpha, we provide a novel binary tree summation procedure that simultaneously estimates all m traces up to \epsilon error with \delta failure probability with an optimal query complexity of \widetilde{O}(m \alpha\sqrt{\log(1/\delta)}/\epsilon m\log(1/\delta)), improving the dependence on both \alpha and \delta from Dharangutte and Musco (NeurIPS, 2021). By using novel reductions to communication complexity and information-theoretic analyses of Gaussian matrices, we provide matching lower bounds for static and dynamic trace estimation in all relevant parameters, including the failure probability. Our lower bounds (1) give the first tight bounds for Hutchinson's estimator in the matrix-vector product model with Frobenius norm error {\it even in the static setting}, and (2) are the first unconditional lower bounds for dynamic trace estimation, resolving open questions of prior work.
Autoregressive Search Engines: Generating Substrings as Document Identifiers
Knowledge-intensive language tasks require NLP systems to both provide the correct answer and retrieve supporting evidence for it in a given corpus. Autoregressive language models are emerging as the de-facto standard for generating answers, with newer and more powerful systems emerging at an astonishing pace. In this paper we argue that all this (and future) progress can be directly applied to the retrieval problem with minimal intervention to the models' architecture. Previous work has explored ways to partition the search space into hierarchical structures and retrieve documents by autoregressively generating their unique identifier. In this work we propose an alternative that doesn't force any structure in the search space: using all ngrams in a passage as its possible identifiers.
Fixed Point Computation: Beating Brute Force with Smoothed Analysis
Attias, Idan, Dagan, Yuval, Daskalakis, Constantinos, Yao, Rui, Zampetakis, Manolis
We propose a new algorithm that finds an $\varepsilon$-approximate fixed point of a smooth function from the $n$-dimensional $\ell_2$ unit ball to itself. We use the general framework of finding approximate solutions to a variational inequality, a problem that subsumes fixed point computation and the computation of a Nash Equilibrium. The algorithm's runtime is bounded by $e^{O(n)}/\varepsilon$, under the smoothed-analysis framework. This is the first known algorithm in such a generality whose runtime is faster than $(1/\varepsilon)^{O(n)}$, which is a time that suffices for an exhaustive search. We complement this result with a lower bound of $e^{\Omega(n)}$ on the query complexity for finding an $O(1)$-approximate fixed point on the unit ball, which holds even in the smoothed-analysis model, yet without the assumption that the function is smooth. Existing lower bounds are only known for the hypercube, and adapting them to the ball does not give non-trivial results even for finding $O(1/\sqrt{n})$-approximate fixed points.
A Method for Multi-Hop Question Answering on Persian Knowledge Graph
Ghafouri, Arash, Firouzmandi, Mahdi, Naderi, Hasan
Question answering systems are the latest evolution in information retrieval technology, designed to accept complex queries in natural language and provide accurate answers using both unstructured and structured knowledge sources. Knowledge Graph Question Answering (KGQA) systems fulfill users' information needs by utilizing structured data, representing a vast number of facts as a graph. However, despite significant advancements, major challenges persist in answering multi-hop complex questions, particularly in Persian. One of the main challenges is the accurate understanding and transformation of these multi-hop complex questions into semantically equivalent SPARQL queries, which allows for precise answer retrieval from knowledge graphs. In this study, to address this issue, a dataset of 5,600 Persian multi-hop complex questions was developed, along with their decomposed forms based on the semantic representation of the questions. Following this, Persian language models were trained using this dataset, and an architecture was proposed for answering complex questions using a Persian knowledge graph. Finally, the proposed method was evaluated against similar systems on the PeCoQ dataset. The results demonstrated the superiority of our approach, with an improvement of 12.57% in F1-score and 12.06% in accuracy compared to the best comparable method.