 search relevance


Query Attribute Modeling: Improving search relevance with Semantic Search and Meta Data Filtering

Menon, Karthik, Haider, Batool Arhamna, Arham, Muhammad, Mehreen, Kanwal, Kadiyala, Ram Mohan Rao, Farooq, Hamza

arXiv.org Artificial Intelligence

This study introduces Query Attribute Modeling (QAM), a hybrid framework that enhances search precision and relevance by decomposing open text queries into structured metadata tags and semantic elements. QAM addresses traditional search limitations by automatically extracting metadata filters from free-form text queries, reducing noise and enabling focused retrieval of relevant items. Experimental evaluation using the Amazon Toys Reviews dataset (10,000 unique items with 40,000+ reviews and detailed product attributes) demonstrated QAM's superior performance, achieving a mean average precision at 5 (mAP@5) of 52.99%. This represents a significant improvement over conventional methods, including BM25 keyword search, encoder-based semantic similarity search, cross-encoder re-ranking, and hybrid search combining BM25 and semantic results via Reciprocal Rank Fusion (RRF). The results establish QAM as a robust solution for Enterprise Search applications, particularly in e-commerce systems.
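The hybrid baseline mentioned above merges a BM25 ranking and a semantic ranking with Reciprocal Rank Fusion. As a rough illustration of how RRF combines ranked lists, here is a minimal sketch. It is not the authors' implementation: the function name, the item IDs, and the smoothing constant `k=60` (a commonly used default) are assumptions made for the example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of item IDs into one list.

    Each item scores sum(1 / (k + rank)) over the lists it appears in,
    with ranks starting at 1. k=60 is an assumed, commonly used default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, item_id in enumerate(ranking, start=1):
            scores[item_id] += 1.0 / (k + rank)
    # Sort by descending fused score; break ties by item ID for determinism.
    return sorted(scores, key=lambda item_id: (-scores[item_id], item_id))


# Hypothetical rankings from a keyword retriever and a semantic retriever.
bm25_ranking = ["toy_a", "toy_b", "toy_c"]
semantic_ranking = ["toy_b", "toy_d", "toy_a"]

fused = reciprocal_rank_fusion([bm25_ranking, semantic_ranking])
print(fused)
```

Items that appear near the top of both lists accumulate the largest fused score, so RRF needs only rank positions and never has to compare raw BM25 scores with embedding similarities.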


PRECTR: A Synergistic Framework for Integrating Personalized Search Relevance Matching and CTR Prediction

Chen, Rong, Cao, Shuzhi, He, Ailong, Han, Shuguang, Chen, Jufeng

arXiv.org Artificial Intelligence

The two primary tasks in a search recommendation system are search relevance matching and click-through rate (CTR) prediction: the former seeks relevant items for user queries, whereas the latter forecasts which items better match user interest. Prior research typically develops two models to predict CTR and search relevance separately, then ranks candidate items based on the fusion of the two outputs. However, such a divide-and-conquer paradigm creates inconsistency between the models. Meanwhile, the search relevance model mainly concentrates on the degree of objective text matching while neglecting personalized differences among users, leading to restricted model performance. To tackle these issues, we propose a unified Personalized Search RElevance Matching and CTR Prediction Fusion Model (PRECTR). Specifically, based on a conditional probability fusion mechanism, PRECTR integrates CTR prediction and search relevance matching into one framework to enhance the interaction and consistency of the two modules. However, directly optimizing the CTR binary classification loss may hinder the fusion model's convergence and indefinitely promote the exposure of items with high CTR, regardless of their search relevance. Hence, we further introduce two-stage training and semantic consistency regularization to accelerate the model's convergence and restrain the recommendation of irrelevant items. Finally, acknowledging that different users may have varied relevance preferences, we assess current users' relevance preferences by analyzing past users' preferences for similar queries and tailor incentives for different candidate items accordingly. Extensive experimental results on our production dataset and online A/B testing demonstrate the effectiveness and superiority of our proposed PRECTR method.


LREF: A Novel LLM-based Relevance Framework for E-commerce

Tang, Tian, Tian, Zhixing, Zhu, Zhenyu, Wang, Chenyang, Hu, Haiqing, Tang, Guoyu, Liu, Lin, Xu, Sulong

arXiv.org Artificial Intelligence

Query and product relevance prediction is a critical component for ensuring a smooth user experience in e-commerce search. Traditional studies mainly focus on BERT-based models to assess the semantic relevance between queries and products. However, the discriminative paradigm and limited knowledge capacity of these approaches restrict their ability to comprehend the relevance between queries and products fully. With the rapid advancement of Large Language Models (LLMs), recent research has begun to explore their application to industrial search systems, as LLMs provide extensive world knowledge and flexible optimization for reasoning processes. Nonetheless, directly leveraging LLMs for relevance prediction tasks introduces new challenges, including a high demand for data quality, the necessity for meticulous optimization of reasoning processes, and an optimistic bias that can result in over-recall. To overcome the above problems, this paper proposes a novel framework called the LLM-based RElevance Framework (LREF) aimed at enhancing e-commerce search relevance. The framework comprises three main stages: supervised fine-tuning (SFT) with Data Selection, Multiple Chain of Thought (Multi-CoT) tuning, and Direct Preference Optimization (DPO) for de-biasing. We evaluate the performance of the framework through a series of offline experiments on large-scale real-world datasets, as well as online A/B testing. The results indicate significant improvements in both offline and online metrics. Ultimately, the model was deployed in a well-known e-commerce application, yielding substantial commercial benefits.


Towards Competitive Search Relevance For Inference-Free Learned Sparse Retrievers

Geng, Zhichao, Ru, Dongyu, Yang, Yang

arXiv.org Artificial Intelligence

Learned sparse retrieval, which can efficiently perform retrieval through mature inverted-index engines, has garnered growing attention in recent years. In particular, inference-free sparse retrievers are attractive because they eliminate online model inference in the retrieval phase, thereby avoiding huge computational cost and offering reasonable throughput and latency. However, even the state-of-the-art (SOTA) inference-free sparse models lag far behind in search relevance when compared to both sparse and dense siamese models. Towards competitive search relevance for inference-free sparse retrievers, we argue that they deserve dedicated training methods rather than reusing those of siamese encoders. In this paper, we propose two different approaches for performance improvement. First, we introduce the IDF-aware FLOPS loss, which incorporates Inverse Document Frequency (IDF) into the sparsification of representations. We find that it mitigates the negative impact of the FLOPS regularization on search relevance, allowing the model to achieve a better balance between accuracy and efficiency. Moreover, we propose a heterogeneous ensemble knowledge distillation framework that combines siamese dense and sparse retrievers to generate supervisory signals during the pre-training phase. The ensemble of dense and sparse retrievers capitalizes on their respective strengths, providing a strong upper bound for knowledge distillation. To reconcile the diverse feedback from heterogeneous supervisors, we normalize and then aggregate the outputs of the teacher models to eliminate score scale differences. On the BEIR benchmark, our model outperforms the existing SOTA inference-free sparse model by 3.3 NDCG@10 points. It exhibits search relevance comparable to siamese sparse retrievers and client-side latency only 1.1x that of BM25.
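The "normalize and then aggregate" step for heterogeneous teachers can be pictured with a small sketch. The abstract does not say which normalization or aggregation the authors use, so the min-max scaling and the unweighted mean below are assumptions chosen for illustration, and all function names and score values are hypothetical.

```python
def min_max_normalize(scores):
    """Rescale one teacher's scores for a query to the range [0, 1]."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # All candidates tie, so the teacher gives no ranking signal.
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def ensemble_teacher_scores(teacher_scores):
    """Average per-candidate scores across teachers after normalization.

    teacher_scores: one list per teacher, aligned on the same candidates.
    """
    normalized = [min_max_normalize(scores) for scores in teacher_scores]
    n_teachers = len(normalized)
    return [sum(column) / n_teachers for column in zip(*normalized)]


# Hypothetical scores for three candidate documents of a single query.
dense_scores = [0.25, 0.75, 0.5]    # e.g. cosine similarities
sparse_scores = [10.0, 30.0, 20.0]  # e.g. unbounded dot products

targets = ensemble_teacher_scores([dense_scores, sparse_scores])
print(targets)
```

Without the normalization step, the sparse teacher's larger raw scores would dominate the average, which is the score-scale problem the abstract describes.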


An Interpretable Ensemble of Graph and Language Models for Improving Search Relevance in E-Commerce

Choudhary, Nurendra, Huang, Edward W, Subbian, Karthik, Reddy, Chandan K.

arXiv.org Artificial Intelligence

The problem of search relevance in the E-commerce domain is a challenging one since it involves understanding the intent of a user's short nuanced query and matching it with the appropriate products in the catalog. This problem has traditionally been addressed using language models (LMs) and graph neural networks (GNNs) to capture semantic and inter-product behavior signals, respectively. However, the rapid development of new architectures has created a gap between research and the practical adoption of these techniques. Evaluating the generalizability of these models for deployment requires extensive experimentation on complex, real-world datasets, which can be non-trivial and expensive. Furthermore, such models often operate on latent space representations that are incomprehensible to humans, making it difficult to evaluate and compare the effectiveness of different models. This lack of interpretability hinders the development and adoption of new techniques in the field. To bridge this gap, we propose Plug and Play Graph LAnguage Model (PP-GLAM), an explainable ensemble of plug and play models. Our approach uses a modular framework with uniform data processing pipelines. It employs additive explanation metrics to independently decide whether to include (i) language model candidates, (ii) GNN model candidates, and (iii) inter-product behavioral signals. For the task of search relevance, we show that PP-GLAM outperforms several state-of-the-art baselines as well as a proprietary model on real-world multilingual, multi-regional e-commerce datasets. To promote better model comprehensibility and adoption, we also provide an analysis of the explainability and computational complexity of our model. Finally, we release a public codebase and describe a deployment strategy for practical implementation.


Improving search relevance of Azure Cognitive Search by Bayesian optimization

Agarwal, Nitin, Kumar, Ashish, R, Kiran, Gupta, Manish, Boué, Laurent

arXiv.org Artificial Intelligence

Azure Cognitive Search (ACS) has emerged as a major contender in "Search as a Service" cloud products in recent years. However, one of the major challenges for ACS users is to improve the relevance of the search results for their specific use cases. In this paper, we propose a novel method to find the optimal ACS configuration that maximizes search relevance for a specific use case (product search, document search, etc.). The proposed solution improves key online marketplace metrics such as click-through rate (CTR) by formulating the search relevance problem as hyperparameter tuning. We have observed significant improvements in real-world search call to action (CTA) rate in multiple marketplaces by introducing optimized weights generated from the proposed approach.


Senior Software Engineer, Search Relevance (Remote) - Remote Tech Jobs

#artificialintelligence

The Senior Software Engineer, Search Relevance will work on improving the Newsela search engine to delight our teachers and students, who use search as a primary way of finding content for their educational needs. You will contribute to team efforts around relevance and bring your own ideas and experiments to the search relevance team. You will also make changes to Search infrastructure, including the ability to monitor Elasticsearch and Agatha performance. You will devise mechanisms to reduce latency by 50 ms and/or cut the cost of spinning up a search experiment by 30%. You will partner closely with data scientists, product managers, and other engineers on our frontend and backend teams to bring cutting-edge techniques to improve search, which is used by more than 70% of our user base to discover engaging content.


How BERT Determines Search Relevance

#artificialintelligence

In fact, when it comes to ranking results, BERT will help Search better understand one in 10 searches in the U.S. in English, and we'll bring this to more languages and locales over time. Google's remarks and explanations raise some key questions: In 2015, Crowdflower (now Appen Figure-Eight Crowdflower) hosted a Kaggle competition [2] where data scientists built models to predict the relevance of search results given a query, a product name and a product description. The winner, ChenglongChen, pocketed $10,000 when his best model took first place by scoring 72.189% [3]. Although the competition has been closed for five years, the data set is still available and the Kaggle competition scoring functionality still works for the private leaderboard (it just doesn't award any site points). I pulled the data, fine-tuned a BERT classification model, predicted a submission, and it scored 77.327% [4].


AI at Scale in Bing

#artificialintelligence

Every day, users from all over the world perform hundreds of millions of search queries with Bing in more than 100 languages. Whether this is the first or the millionth time we see a query, whether the best results for a query change every hour or barely change at all, our users expect an immediate answer that serves their needs. Bing web search is truly an example of AI at Scale at Microsoft, showcasing the next generation of AI capabilities and experiences. Over the past few years, Bing and Microsoft Research have been developing and deploying large neural network models such as MT-DNN, Unicoder, and UniLM to maximize the search experience for our customers. The best of those learnings are open sourced into the Microsoft Turing language models.