AITopics | Case-Based Reasoning

Case-Based Reasoning

"At the highest level of generality, a general CBR cycle may be described by the following four processes:

RETRIEVE the most similar case or cases
REUSE the information and knowledge in that case to solve the problem
REVISE the proposed solution
RETAIN the parts of this experience likely to be useful for future problem solving"

– Case-Based Reasoning: Foundational Issues, Methodological Variations, and System Approaches. Agnar Aamodt & Enric Plaza. AI Communications. IOS Press, Vol. 7: 1, pp. 39-59.
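The four processes quoted above can be sketched as a small loop. This is a minimal illustration, not code from Aamodt & Plaza: the `Case` and `CaseBase` names, the squared Euclidean similarity, and the caller-supplied `revise` function are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Case:
    problem: tuple   # feature vector describing the problem (assumed numeric)
    solution: str    # solution that was accepted for that problem


@dataclass
class CaseBase:
    cases: list = field(default_factory=list)

    def retrieve(self, problem):
        # RETRIEVE: return the stored case closest to the new problem.
        # Squared Euclidean distance is an assumed similarity measure.
        return min(
            self.cases,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(c.problem, problem)),
        )

    def solve(self, problem, revise):
        nearest = self.retrieve(problem)
        proposed = nearest.solution                      # REUSE the old solution
        final = revise(problem, proposed)                # REVISE it for this problem
        self.cases.append(Case(tuple(problem), final))   # RETAIN the new experience
        return final
```

A caller would seed `CaseBase` with at least one case and pass a `revise` function that adapts or validates the proposed solution; real systems typically make the reuse and revise steps far richer than a plain copy.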

Long-Tail Crisis in Nearest Neighbor Language Models

Nishida, Yuto, Morishita, Makoto, Deguchi, Hiroyuki, Kamigaito, Hidetaka, Watanabe, Taro

arXiv.org Artificial Intelligence, Mar-28-2025

The $k$-nearest-neighbor language model ($k$NN-LM), one of the retrieval-augmented language models, improves the perplexity for given text by directly accessing a large datastore built from any text data during inference. A widely held hypothesis for the success of $k$NN-LM is that its explicit memory, i.e., the datastore, enhances predictions for long-tail phenomena. However, prior works have primarily shown its ability to retrieve long-tail contexts, leaving its performance in estimating the probabilities of long-tail target tokens during inference underexplored. In this paper, we investigate the behavior of $k$NN-LM on low-frequency tokens, examining prediction probability, retrieval accuracy, token distribution in the datastore, and approximation error of the product quantization. Our experimental results reveal that $k$NN-LM does not improve prediction performance for low-frequency tokens but mainly benefits high-frequency tokens regardless of long-tail contexts in the datastore.
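For readers unfamiliar with how a $k$NN-LM forms its prediction, the usual formulation turns retrieved neighbors into a distribution over target tokens and interpolates it with the base model's distribution. The sketch below assumes that standard formulation; the function names, the `temperature` parameter, and the default `lam` are illustrative and are not taken from this paper.

```python
import math


def knn_distribution(neighbors, temperature=1.0):
    """Turn retrieved (distance, target_token) pairs into token probabilities.

    Assumes `neighbors` is non-empty. Weights are a softmax over negative
    distances, summed per token.
    """
    d_min = min(d for d, _ in neighbors)  # shift distances for numerical stability
    weights = [math.exp(-(d - d_min) / temperature) for d, _ in neighbors]
    total = sum(weights)
    probs = {}
    for w, (_, token) in zip(weights, neighbors):
        probs[token] = probs.get(token, 0.0) + w / total
    return probs


def interpolate(p_lm, p_knn, lam=0.25):
    """Mix the two distributions: lam * p_knn + (1 - lam) * p_lm."""
    tokens = set(p_lm) | set(p_knn)
    return {
        t: lam * p_knn.get(t, 0.0) + (1.0 - lam) * p_lm.get(t, 0.0)
        for t in tokens
    }
```

Under this formulation, a token that appears rarely among the retrieved neighbors receives little mass from the retrieval side, which is one way to see why gains can concentrate on frequent tokens.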

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2503.22426

Country:

Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
North America > United States (0.04)
North America > Dominican Republic (0.04)
(3 more...)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (0.65)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)

Optimizing Case-Based Reasoning System for Functional Test Script Generation with Large Language Models

Guo, Siyuan, Liu, Huiwu, Chen, Xiaolong, Xie, Yuming, Zhang, Liang, Han, Tao, Chen, Hechang, Chang, Yi, Wang, Jun

arXiv.org Artificial Intelligence, Mar-26-2025

In this work, we explore the potential of large language models (LLMs) for generating functional test scripts, which necessitates understanding the dynamically evolving code structure of the target software. To achieve this, we propose a case-based reasoning (CBR) system utilizing a 4R cycle (i.e., retrieve, reuse, revise, and retain), which maintains and leverages a case bank of test intent descriptions and corresponding test scripts to facilitate LLMs for test script generation. To improve user experience further, we introduce Re4, an optimization method for the CBR system, comprising reranking-based retrieval finetuning and reinforced reuse finetuning. Specifically, we first identify positive examples with high semantic and script similarity, providing reliable pseudo-labels for finetuning the retriever model without costly labeling. Then, we apply supervised finetuning, followed by a reinforcement learning finetuning stage, to align LLMs with our production scenarios, ensuring the faithful reuse of retrieved cases. Extensive experimental results on two product development units from Huawei Datacom demonstrate the superiority of the proposed CBR+Re4. Notably, we also show that the proposed Re4 method can help alleviate the repetitive generation issues with LLMs.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2503.20576

Country:

Asia > China > Jiangsu Province > Nanjing (0.04)
Asia > China > Jilin Province > Changchun (0.04)
North America > United States > New York > New York County > New York City (0.04)
(3 more...)

Genre:

Research Report (1.00)
Overview (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Interpretable Machine Learning for Oral Lesion Diagnosis through Prototypical Instances Identification

Cascione, Alessio, Setzu, Mattia, Galatolo, Federico A., Cimino, Mario G. C. A., Guidotti, Riccardo

arXiv.org Artificial Intelligence, Mar-21-2025

Decision-making processes in healthcare can be highly complex and challenging. Machine Learning tools offer significant potential to assist in these processes. However, many current methodologies rely on complex models that are not easily interpretable by experts. This underscores the need to develop interpretable models that can provide meaningful support in clinical decision-making. When approaching such tasks, humans typically compare the situation at hand to a few key examples and representative cases imprinted in their memory. Using an approach which selects such exemplary cases and grounds its predictions on them could contribute to obtaining high-performing interpretable solutions to such problems. To this end, we evaluate PivotTree, an interpretable prototype selection model, on an oral lesion detection problem, specifically trying to detect the presence of neoplastic, aphthous and traumatic ulcerated lesions from oral cavity images. We demonstrate the efficacy of using such a method in terms of performance and offer a qualitative and quantitative comparison between exemplary cases and ground-truth prototypes selected by experts.

artificial intelligence, inductive learning, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2503.16938

Country:

North America > United States > Wisconsin (0.04)
Europe > Italy > Tuscany (0.04)

Genre:

Overview (0.68)
Research Report (0.64)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Diagnostic Medicine (0.93)
Health & Medicine > Therapeutic Area > Dermatology (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (0.72)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.47)
(2 more...)

N2C2: Nearest Neighbor Enhanced Confidence Calibration for Cross-Lingual In-Context Learning

He, Jie, Yu, Simon, Xiong, Deyi, Gutiérrez-Basulto, Víctor, Pan, Jeff Z.

arXiv.org Artificial Intelligence, Mar-12-2025

Recent advancements of in-context learning (ICL) show language models can significantly improve their performance when demonstrations are provided. However, little attention has been paid to model calibration and prediction confidence of ICL in cross-lingual scenarios. To bridge this gap, we conduct a thorough analysis of ICL for cross-lingual sentiment classification. Our findings suggest that ICL performs poorly in cross-lingual scenarios, exhibiting low accuracy and presenting high calibration errors. In response, we propose a novel approach, N2C2, which employs a $k$-nearest neighbors augmented classifier for prediction confidence calibration. N2C2 narrows the prediction gap by leveraging a datastore of cached few-shot instances. Specifically, N2C2 integrates the predictions from the datastore and incorporates confidence-aware distribution, semantically consistent retrieval representation, and adaptive neighbor combination modules to effectively utilize the limited number of supporting instances. Evaluation on two multilingual sentiment classification datasets demonstrates that N2C2 outperforms traditional ICL. It surpasses fine tuning, prompt tuning and recent state-of-the-art methods in terms of accuracy and calibration errors.

calibration, computational linguistic, proceedings, (14 more...)

arXiv.org Artificial Intelligence

2503.09218

Country:

Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.05)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
North America > Dominican Republic (0.04)
(7 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (0.61)
Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.54)
(4 more...)

AI-Driven Decision Support in Oncology: Evaluating Data Readiness for Skin Cancer Treatment

Grüger, Joscha, Geyer, Tobias, Brix, Tobias, Storck, Michael, Leson, Sonja, Bley, Laura, Weishaupt, Carsten, Bergmann, Ralph, Braun, Stephan A.

arXiv.org Artificial Intelligence, Mar-12-2025

Over the past few years, the field of artificial intelligence (AI) has shown great promise in various domains, including medicine. A potential use case for AI in medicine is its application in managing advanced-stage cancer treatment, where limited evidence often makes treatment choices reliant on the personal expertise of the physicians. The complex nature of oncological disease processes and the multitude of factors that need to be considered when making treatment decisions make it difficult to rely solely on evidence-based trial data, which is often limited and may exclude certain patient populations. This results in physicians making decisions on a case-by-case basis, drawing on their experience of previous cases, which is not always objective and may be limited by the small number of cases they have observed. In this context, the use of clinical decision support systems (CDSS) using similarity-based AI approaches can potentially contribute to better oncology treatment by supporting physicians in the selection of treatment methods [1, 2]. One approach is Case-Based Reasoning (CBR), a subfield of AI that deals with experience-based problem solving.

data quality, information, treatment decision, (14 more...)

arXiv.org Artificial Intelligence

2503.09164

Country:

Europe > Germany > North Rhine-Westphalia > Münster Region > Münster (0.14)
North America > United States > New York > New York County > New York City (0.04)
Europe > Germany > North Rhine-Westphalia > Düsseldorf Region > Düsseldorf (0.04)
(2 more...)

Genre:

Research Report > New Finding (0.68)
Research Report > Experimental Study (0.66)

Industry: Health & Medicine > Therapeutic Area > Oncology > Skin Cancer (0.51)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Applied AI (1.00)

How Vital is the Jurisprudential Relevance: Law Article Intervened Legal Case Retrieval and Matching

Xu, Nuo, Wang, Pinghui, Liang, Zi, Zhao, Junzhou, Guan, Xiaohong

arXiv.org Artificial Intelligence, Feb-25-2025

Legal case retrieval (LCR) aims to automatically scour for comparable legal cases based on a given query, which is crucial for offering relevant precedents to support the judgment in intelligent legal systems. Due to similar goals, it is often associated with a similar case matching (LCM) task. To address them, a daunting challenge is assessing the uniquely defined legal-rational similarity within the judicial domain, which distinctly deviates from the semantic similarities in general text retrieval. Past works either tagged domain-specific factors or incorporated reference laws to capture legal-rational information. However, their heavy reliance on expert or unrealistic assumptions restricts their practical applicability in real-world scenarios. In this paper, we propose an end-to-end model named LCM-LAI to solve the above challenges. Through meticulous theoretical analysis, LCM-LAI employs a dependent multi-task learning framework to capture legal-rational information within legal cases by a law article prediction (LAP) sub-task, without any additional assumptions in inference. Besides, LCM-LAI proposes an article-aware attention mechanism to evaluate the legal-rational similarity between across-case sentences based on law distribution, which is more effective than conventional semantic similarity. We perform a series of exhaustive experiments including two different tasks involving four real-world datasets. Results demonstrate that LCM-LAI achieves state-of-the-art performance.

law article, lcm-lai, representation, (13 more...)

arXiv.org Artificial Intelligence

2502.18292

Country:

Asia > China > Shaanxi Province > Xi'an (0.05)
North America > Canada (0.04)
Europe > United Kingdom (0.04)
(4 more...)

Genre: Research Report > New Finding (0.66)

Industry: Law > Criminal Law (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning (0.91)

Explaining the Success of Nearest Neighbor Methods in Prediction

Chen, George H., Shah, Devavrat

arXiv.org Machine Learning, Feb-21-2025

Many modern methods for prediction leverage nearest neighbor search to find past training examples most similar to a test example, an idea that dates back in text to at least the 11th century and has stood the test of time. This monograph aims to explain the success of these methods, both in theory, for which we cover foundational nonasymptotic statistical guarantees on nearest-neighbor-based regression and classification, and in practice, for which we gather prominent methods for approximate nearest neighbor search that have been essential to scaling prediction systems reliant on nearest neighbor analysis to handle massive datasets. Furthermore, we discuss connections to learning distances for use with nearest neighbor methods, including how random decision trees and ensemble methods learn nearest neighbor structure, as well as recent developments in crowdsourcing and graphons. In terms of theory, our focus is on nonasymptotic statistical guarantees, which we state in the form of how many training data and what algorithm parameters ensure that a nearest neighbor prediction method achieves a user-specified error tolerance. We begin with the most general of such results for nearest neighbor and related kernel regression and classification in general metric spaces. In such settings in which we assume very little structure, what enables successful prediction is smoothness in the function being estimated for regression, and a low probability of landing near the decision boundary for classification. In practice, these conditions could be difficult to verify for a real dataset. We then cover recent guarantees on nearest neighbor prediction in the three case studies of time series forecasting, recommending products to people over time, and delineating human organs in medical images by looking at image patches. In these case studies, clustering structure enables successful prediction.
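As a concrete reference point for the methods this monograph analyzes, nearest neighbor regression and classification in a general metric space can be sketched as follows. This is a brute-force illustration that assumes a caller-supplied distance function `dist`; the function names are made up for the example, and nothing here reflects the approximate search methods or the guarantees the authors cover.

```python
from collections import Counter


def k_nearest(train, query, k, dist):
    # train is a list of (point, label) pairs; keep the k pairs whose points
    # are closest to the query under the supplied metric.
    return sorted(train, key=lambda pair: dist(pair[0], query))[:k]


def knn_regress(train, query, k, dist):
    # Regression: average the labels of the k nearest training points.
    neighbors = k_nearest(train, query, k, dist)
    return sum(label for _, label in neighbors) / len(neighbors)


def knn_classify(train, query, k, dist):
    # Classification: majority vote among the k nearest training points.
    neighbors = k_nearest(train, query, k, dist)
    return Counter(label for _, label in neighbors).most_common(1)[0][0]
```

Sorting the whole training set costs O(n log n) per query, which is why the approximate nearest neighbor search methods mentioned in the abstract matter at scale.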

diagonal sub-gaussian mixture model, kernel time series classifier, theoretical guarantee, (15 more...)

arXiv.org Machine Learning

2502.159

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Africa > Middle East > Egypt > Cairo Governorate > Cairo (0.04)

Genre:

Overview (1.00)
Research Report (0.81)

Industry:

Leisure & Entertainment (1.00)
Education (0.92)
Health & Medicine > Diagnostic Medicine > Imaging (0.87)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Improving Similar Case Retrieval Ranking Performance By Revisiting RankSVM

Liu, Yuqi, Zheng, Yan

arXiv.org Artificial Intelligence, Feb-16-2025

Given the rapid development of legal AI, much attention has been paid to one of its most important tasks, similar case retrieval, especially through the use of language models. In our paper, however, we try to improve the ranking performance of current models from the perspective of learning to rank rather than language models. Specifically, we conduct experiments using a pairwise method, RankSVM, as the classifier in place of a fully connected layer, combined with commonly used language models, on the similar case retrieval datasets LeCaRDv1 and LeCaRDv2. We conclude that RankSVM generally helps improve retrieval performance on the LeCaRDv1 and LeCaRDv2 datasets compared with the original classifiers by optimizing the precise ranking. It could also help mitigate overfitting owing to class imbalance. Our code is available at https://github.com/liuyuqi123study/RankSVM_for_SLR
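The pairwise idea behind RankSVM can be illustrated with a small linear model trained on feature differences with a hinge loss. This is a generic sketch under stated assumptions, not the authors' code: the function names, the plain subgradient descent loop, and the hyperparameter defaults are all hypothetical, and in the paper's setting the feature vectors would come from a language model rather than being hand-written.

```python
def pairwise_differences(features, relevance):
    # For every pair of candidates for the same query where i is more relevant
    # than j, emit the difference vector x_i - x_j.
    diffs = []
    for xi, ri in zip(features, relevance):
        for xj, rj in zip(features, relevance):
            if ri > rj:
                diffs.append([a - b for a, b in zip(xi, xj)])
    return diffs


def train_rank_svm(diffs, dim, epochs=100, lr=0.1, reg=0.01):
    # Minimise reg/2 * ||w||^2 + sum(max(0, 1 - w . d)) by subgradient descent.
    w = [0.0] * dim
    for _ in range(epochs):
        for d in diffs:
            margin = sum(wi * di for wi, di in zip(w, d))
            violated = margin < 1.0
            for i in range(dim):
                grad = reg * w[i] - (d[i] if violated else 0.0)
                w[i] -= lr * grad
    return w


def score(w, x):
    # Higher score means the candidate should be ranked earlier.
    return sum(wi * xi for wi, xi in zip(w, x))
```

Candidates are then ranked by `score`; because training only sees differences between candidates, the model learns relative order rather than absolute relevance labels.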

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2502.11131

Country:

North America > United States (0.47)
Asia > China (0.46)

Genre: Research Report (0.84)

Industry: Law (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (0.84)
Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning (0.84)

Near-optimal sample compression for nearest neighbors

Lee-Ad Gottlieb, Aryeh Kontorovich, Pinhas Nisnevitch

Neural Information Processing Systems, Feb-12-2025, 00:51:43 GMT

We present the first sample compression algorithm for nearest neighbors with nontrivial performance guarantees. We complement these guarantees by demonstrating almost matching hardness lower bounds, which show that our bound is nearly optimal. Our result yields new insight into margin-based nearest neighbor classification in metric spaces and allows us to significantly sharpen and simplify existing bounds. Some encouraging empirical results are also presented.

algorithm, artificial intelligence, machine learning, (16 more...)

Neural Information Processing Systems

Country: Asia > Middle East > Israel > Southern District > Beer-Sheva (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.47)

Review for NeurIPS paper: HM-ANN: Efficient Billion-Point Nearest Neighbor Search on Heterogeneous Memory

Neural Information Processing Systems, Feb-11-2025, 22:47:13 GMT

The paper makes an inaccurate claim about the presence of billion-scale ANNS solutions. The performance gain of the proposed HM-ANN algorithm seems marginal when considering its learning curve in practice. The experiments do not evaluate the performance of data fetching. So it is hard to conclude that the proposed HM-ANN achieves better utilization of HM. The paper claims that the proposed HM-ANN is the first billion-scale ANNS solution on a single machine, without using compression (see the last paragraph of Introduction).

efficient billion-point nearest neighbor search, hm-ann, optimization, (9 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (0.40)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.40)
