AITopics

We present Lean Finder, a semantic search engine for Lean and mathlib that understands and aligns with the intents of mathematicians. We further align Lean Finder with mathematicians' preferences using In addition, Lean Finder is compatible with LLM-based theorem provers, bridging retrieval with formal reasoning. Advances in Lean and mathlib (De Moura et al., 2015; Moura & Ullrich, 2021) are turning mathematical discovery into a collaborative and verifiable research workflow. Despite these advances, state-of-the-art LLMs still cannot solve math research problems. Lean's syn tax, gram mar, and tac tics in cur a steep learn ing curve. All experiments and data processing were conducted outside Meta. Figure 1: In the evaluation with user queries, real users preferred Lean Finder in 81.6% of cases, compared with Consider the two queries below. Lean search engines handle (Gao et al., 2024a;b; Ju & Dong, 2025; Asher, 2025): Denote L/K a field extension, x, y in L are algebraic elements over K with the same minimal polynomial. I'm working with algebraic elements over a field extension and I have two elements, say x and y in L. I know x is algebraic over K, and I've shown that y is a root of the minimal polynomial of x. Does this imply that the minimal polynomials of x and y are actually equal? T arget Statement 2: 1 theorem eq_of_root {x y: L} (hx: IsAlgebraic K x) (h_ev: Polynomial.aeval y (minpoly K x) = 0): minpoly K y = minpoly K x):= -- proof omitted for brevity This user latent (motivation, perspective, abstraction) cannot be inferred or encoded by a purely syntactic informalization. Addressing this challenge calls for Lean search engines that can understand a mathematician's intent, not merely We defer a more rigorous analysis in Section 2.2, and ask our core question: Our approach analyzes and clusters public discussions, then synthesizes queries that simulate user intents (Section 3.1).

large language model, logic & formal reasoning, machine learning, (22 more...)

2510.1594

Genre: Research Report > New Finding (0.92)

Industry:

Information Technology (1.00)
Law (0.92)
Government (0.92)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(2 more...)

MiLQ: Benchmarking IR Models for Bilingual Web Search with Mixed Language Queries

Kim, Jonghwi, Kang, Deokhyung, Hwang, Seonjeong, Kim, Yunsu, Ok, Jungseul, Lee, Gary

Despite bilingual speakers frequently using mixed-language queries in web searches, Information Retrieval (IR) research on them remains scarce. To address this, we introduce MiLQ, Mixed-Language Query test set, the first public benchmark of mixed-language queries, qualified as realistic and relatively preferred. Experiments show that multilingual IR models perform moderately on MiLQ and inconsistently across native, English, and mixed-language queries, also suggesting code-switched training data's potential for robust IR models handling such queries. Meanwhile, intentional English mixing in queries proves an effective strategy for bilinguals searching English documents, which our analysis attributes to enhanced token matching compared to native queries.

information retrieval, large language model, machine learning, (20 more...)

2505.16631

Country: North America > United States (0.28)

Genre: Research Report (0.82)

Industry: Health & Medicine (0.36)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Information Management > Search (0.84)
(2 more...)

Danu, Manuela Daniela, Marica, George, Suciu, Constantin, Itu, Lucian Mihai, Farri, Oladimeji

Multilingual Clinical NER for Diseases and Medications Recognition in Cardiology Texts using BERT Embeddings

The rapidly increasing volume of electronic health record (EHR) data underscores a pressing need to unlock biomedical knowledge from unstructured clinical texts to support advancements in data-driven clinical systems, including patient diagnosis, disease progression monitoring, treatment effects assessment, prediction of future clinical events, etc. While contextualized language models have demonstrated impressive performance improvements for named entity recognition (NER) systems in English corpora, there remains a scarcity of research focused on clinical texts in low-resource languages. To bridge this gap, our study aims to develop multiple deep contextual embedding models to enhance clinical NER in the cardiology domain, as part of the BioASQ MultiCardioNER shared task. We explore the effectiveness of different monolingual and multilingual BERT-based models, trained on general domain text, for extracting disease and medication mentions from clinical case reports written in English, Spanish, and Italian. We achieved an F1-score of 77.88% on Spanish Diseases Recognition (SDR), 92.09% on Spanish Medications Recognition (SMR), 91.74% on English Medications Recognition (EMR), and 88.9% on Italian Medications Recognition (IMR). These results outperform the mean and median F1 scores in the test leaderboard across all subtasks, with the mean/median values being: 69.61%/75.66% for SDR, 81.22%/90.18% for SMR, 89.2%/88.96% for EMR, and 82.8%/87.76% for IMR.

large language model, machine learning, recognition, (21 more...)

2510.17437

Country:

Europe (1.00)
North America > United States (0.68)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Health Care Technology > Medical Record (0.69)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.63)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Li, Tongyang, Wang, Xinzhao, Zhang, Yexin

Near-Optimal Quantum Algorithms for Computing (Coarse) Correlated Equilibria of General-Sum Games

Computing Nash equilibria of zero-sum games in classical and quantum settings is extensively studied. For general-sum games, computing Nash equilibria is PPAD-hard and the computing of a more general concept called correlated equilibria has been widely explored in game theory. In this paper, we initiate the study of quantum algorithms for computing $\varepsilon$-approximate correlated equilibria (CE) and coarse correlated equilibria (CCE) in multi-player normal-form games. Our approach utilizes quantum improvements to the multi-scale Multiplicative Weight Update (MWU) method for CE calculations, achieving a query complexity of $\tilde{O}(m\sqrt{n})$ for fixed $\varepsilon$. For CCE, we extend techniques from quantum algorithms for zero-sum games to multi-player settings, achieving query complexity $\tilde{O}(m\sqrt{n}/\varepsilon^{2.5})$. Both algorithms demonstrate a near-optimal scaling in the number of players $m$ and actions $n$, as confirmed by our quantum query lower bounds.

information retrieval, machine learning, natural language, (20 more...)

2510.16782

Genre: Research Report (0.63)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.56)

Nardini, Franco Maria, Perego, Raffaele, Tonellotto, Nicola, Trani, Salvatore

Blending Learning to Rank and Dense Representations for Efficient and Effective Cascades

We investigate the exploitation of both lexical and neural relevance signals for ad-hoc passage retrieval. Our exploration involves a large-scale training dataset in which dense neural representations of MS-MARCO queries and passages are complemented and integrated with 253 hand-crafted lexical features extracted from the same corpus. Blending of the relevance signals from the two different groups of features is learned by a classical Learning-to-Rank (LTR) model based on a forest of decision trees. To evaluate our solution, we employ a pipelined architecture where a dense neural retriever serves as the first stage and performs a nearest-neighbor search over the neural representations of the documents. Our LTR model acts instead as the second stage that re-ranks the set of candidates retrieved by the first stage to enhance effectiveness. The results of reproducible experiments conducted with state-of-the-art dense retrievers on publicly available resources show that the proposed solution significantly enhances the end-to-end ranking performance while relatively minimally impacting efficiency. Specifically, we achieve a boost in nDCG@10 of up to 11% with an increase in average query latency of only 4.3%. This confirms the advantage of seamlessly combining two distinct families of signals that mutually contribute to retrieval effectiveness.

information retrieval, machine learning, natural language, (13 more...)

2510.16393

Country:

Europe (1.00)
North America > United States (0.69)

Genre: Research Report (0.51)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.91)

arXiv.org Artificial IntelligenceOct-20-2025

AI Guided Accelerator For Search Experience

Yetukuri, Jayanth, Elyasi, Mehran, Agrawal, Samarth, Mandal, Aritra, Kong, Rui, Vempati, Harish, Khan, Ishita

Effective query reformulation is pivotal in narrowing the gap between a user's exploratory search behavior and the identification of relevant products in e-commerce environments. While traditional approaches predominantly model query rewrites as isolated pairs, they often fail to capture the sequential and transitional dynamics inherent in real-world user behavior. In this work, we propose a novel framework that explicitly models transitional queries--intermediate reformulations occurring during the user's journey toward their final purchase intent. By mining structured query trajectories from eBay's large-scale user interaction logs, we reconstruct query sequences that reflect shifts in intent while preserving semantic coherence. This approach allows us to model a user's shopping funnel, where mid-journey transitions reflect exploratory behavior and intent refinement. Furthermore, we incorporate generative Large Language Models (LLMs) to produce semantically diverse and intent-preserving alternative queries, extending beyond what can be derived through collaborative filtering alone. These reformulations can be leveraged to populate Related Searches or to power intent-clustered carousels on the search results page, enhancing both discovery and engagement. Our contributions include (i) the formal identification and modeling of transitional queries, (ii) the introduction of a structured query sequence mining pipeline for intent flow understanding, and (iii) the application of LLMs for scalable, intent-aware query expansion. Empirical evaluation demonstrates measurable gains in conversion and engagement metrics compared to the existing Related Searches module, validating the effectiveness of our approach in real-world e-commerce settings.

artificial intelligence, large language model, natural language, (21 more...)

2508.05649

Country: North America > United States (0.47)

Genre: Research Report (0.82)

Industry: Information Technology > Services (0.82)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval > Query Processing (0.89)

MERLIN: A Testbed for Multilingual Multimodal Entity Recognition and Linking

Ramamoorthy, Sathyanarayanan, Shah, Vishwa, Khanuja, Simran, Sheikh, Zaid, Jie, Shan, Chia, Ann, Chua, Shearman, Neubig, Graham

This paper introduces MERLIN, a novel testbed system for the task of Multilingual Multimodal Entity Linking. The created dataset includes BBC news article titles, paired with corresponding images, in five languages: Hindi, Japanese, Indonesian, Vietnamese, and Tamil, featuring over 7,000 named entity mentions linked to 2,500 unique Wikidata entities. We also include several benchmarks using multilingual and multimodal entity linking methods exploring different language models like LLaMa-2 and Aya-23. Our findings indicate that incorporating visual data improves the accuracy of entity linking, especially for entities where the textual context is ambiguous or insufficient, and particularly for models that do not have strong multilingual abilities. For the work, the dataset, methods are available here at https://github.com/rsathya4802/merlin

computational linguistic, information retrieval, large language model, (21 more...)

2510.14307

Country:

Europe (1.00)
Asia (1.00)
North America > United States > Minnesota (0.28)

Genre: Research Report > New Finding (0.48)

Industry:

Information Technology (0.68)
Government (0.68)
Leisure & Entertainment > Sports (0.46)
Health & Medicine > Therapeutic Area > Immunology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.70)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.50)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.48)

Rewriting History: A Recipe for Interventional Analyses to Study Data Effects on Model Behavior

Nadkarni, Rahul, Elazar, Yanai, Gonen, Hila, Smith, Noah A.

We present an experimental recipe for studying the relationship between training data and language model (LM) behavior. We outline steps for intervening on data batches -- i.e., ``rewriting history'' -- and then retraining model checkpoints over that data to test hypotheses relating data to behavior. Our recipe breaks down such an intervention into stages that include selecting evaluation items from a benchmark that measures model behavior, matching relevant documents to those items, and modifying those documents before retraining and measuring the effects. We demonstrate the utility of our recipe through case studies on factual knowledge acquisition in LMs, using both cooccurrence statistics and information retrieval methods to identify documents that might contribute to knowledge learning. Our results supplement past observational analyses that link cooccurrence to model behavior, while demonstrating that extant methods for identifying relevant training documents do not fully explain an LM's ability to correctly answer knowledge questions. Overall, we outline a recipe that researchers can follow to test further hypotheses about how training data affects model behavior. Our code is made publicly available to promote future work.

computational linguistic, large language model, machine learning, (16 more...)

2510.14261

Country:

Europe (1.00)
North America > United States (0.67)
North America > Canada (0.46)
Asia > Middle East (0.46)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.86)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.69)

Mansoori, Mohammad, Soliman, Amira, Etminani, Farzaneh

LTR-ICD: A Learning-to-Rank Approach for Automatic ICD Coding

Clinical notes contain unstructured text provided by clinicians during patient encounters. These notes are usually accompanied by a sequence of diagnostic codes following the International Classification of Diseases (ICD). Correctly assigning and ordering ICD codes are essential for medical diagnosis and reimbursement. However, automating this task remains challenging. State-of-the-art methods treated this problem as a classification task, leading to ignoring the order of ICD codes that is essential for different purposes. In this work, as a first attempt, we approach this task from a retrieval system perspective to consider the order of codes, thus formulating this problem as a classification and ranking task. Our results and analysis show that the proposed framework has a superior ability to identify high-priority codes compared to other methods. For instance, our model accuracy in correctly ranking primary diagnosis codes is 47%, compared to 20% for the state-of-the-art classifier. Additionally, in terms of classification metrics, the proposed model achieves a micro- and macro-F1 scores of 0.6065 and 0.2904, respectively, surpassing the previous best model with scores of 0.597 and 0.2660.

information retrieval, machine learning, natural language, (19 more...)

2510.13922

Country: North America > United States (0.68)

Genre: Research Report > New Finding (0.88)

Industry:

Health & Medicine > Health Care Providers & Services (1.00)
Health & Medicine > Health Care Technology > Medical Record (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.88)

Yuan, Ye, Shabani, Mohammad Amin, Liu, Siqi

FACTS: Table Summarization via Offline Template Generation with Agentic Workflows

Query-focused table summarization requires generating natural language summaries of tabular data conditioned on a user query, enabling users to access insights beyond fact retrieval. Existing approaches face key limitations: table-to-text models require costly fine-tuning and struggle with complex reasoning, prompt-based LLM methods suffer from token-limit and efficiency issues while exposing sensitive data, and prior agentic pipelines often rely on decomposition, planning, or manual templates that lack robustness and scalability. To mitigate these issues, we introduce an agentic workflow, FACTS, a Fast, Accurate, and Privacy-Compliant Table Summarization approach via Offline Template Generation. FACTS produces offline templates, consisting of SQL queries and Jinja2 templates, which can be rendered into natural language summaries and are reusable across multiple tables sharing the same schema. It enables fast summarization through reusable offline templates, accurate outputs with executable SQL queries, and privacy compliance by sending only table schemas to LLMs. Evaluations on widely-used benchmarks show that FACTS consistently outperforms baseline methods, establishing it as a practical solution for real-world query-focused table summarization.

large language model, natural language, template, (19 more...)

2510.1392

Country:

North America > United States (1.00)
Asia (1.00)

Genre: Research Report (0.82)

Technology:

Information Technology > Databases (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval > Query Processing (0.46)