Case-Based Reasoning
Label Noise Robustness for Domain-Agnostic Fair Corrections via Nearest Neighbors Label Spreading
Last-layer retraining methods have emerged as an efficient framework for correcting existing base models. Within this framework, several methods have been proposed to correct models for subgroup fairness, with and without group membership information. Importantly, prior work has demonstrated that many of these methods are susceptible to noisy labels. To this end, we propose a drop-in correction for label noise in last-layer retraining, and demonstrate that it achieves state-of-the-art worst-group accuracy for a broad range of symmetric label noise and across a wide variety of datasets exhibiting spurious correlations. Our proposed approach uses label spreading on a latent nearest neighbors graph and has minimal computational overhead compared to existing methods.
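A minimal sketch of label spreading on a latent k-NN graph, in the spirit of the approach described: denoise labels over embeddings from a frozen encoder before last-layer retraining. All names (embed, noisy_labels, k, alpha) and the Zhou et al.-style update are illustrative stand-ins; the paper's exact construction may differ.

import numpy as np
from sklearn.neighbors import kneighbors_graph

def spread_labels(embed, noisy_labels, n_classes, k=10, alpha=0.9, n_iter=50):
    # Symmetrized k-NN affinity graph over the frozen encoder's embeddings.
    W = kneighbors_graph(embed, k, mode="connectivity", include_self=False)
    W = 0.5 * (W + W.T).toarray()
    # Symmetric normalization: S = D^{-1/2} W D^{-1/2}.
    d = np.maximum(W.sum(axis=1), 1e-12)
    S = W / np.sqrt(np.outer(d, d))
    # One-hot initialization from the (possibly noisy) labels.
    Y = np.eye(n_classes)[noisy_labels]
    F = Y.copy()
    for _ in range(n_iter):
        F = alpha * (S @ F) + (1 - alpha) * Y  # spread, anchored to observed labels
    return F.argmax(axis=1)  # corrected labels for last-layer retraining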
The Nearest Neighbor Information Estimator is Adaptively Near Minimax Rate-Optimal
Jiantao Jiao, Weihao Gao, Yanjun Han
We analyze the Kozachenko-Leonenko (KL) fixed k-nearest neighbor estimator for differential entropy. We obtain the first uniform upper bound on its performance for any fixed k over Hölder balls on a torus, without assuming any conditions on how close the density can be to zero. Combined with a recent minimax lower bound over the Hölder ball, we show that the KL estimator, for any fixed k, achieves the minimax rates up to logarithmic factors without knowledge of the smoothness parameter s of the Hölder ball for s ∈ (0, 2] and arbitrary dimension d, making it the first estimator that provably satisfies this property.
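For concreteness, a sketch of the fixed-k KL estimator: H_hat = psi(n) - psi(k) + log(c_d) + (d/n) * sum_i log(eps_i), where eps_i is the distance from x_i to its k-th nearest neighbor and c_d is the volume of the unit Euclidean ball. The implementation below uses plain Euclidean distance in R^d rather than the toroidal metric the paper analyzes; variable names are illustrative.

import numpy as np
from math import lgamma, log, pi
from scipy.spatial import cKDTree
from scipy.special import digamma

def kl_entropy(x, k=1):
    n, d = x.shape
    tree = cKDTree(x)
    # Query k+1 neighbors: the closest is the point itself (distance 0).
    eps = tree.query(x, k=k + 1)[0][:, -1]
    log_cd = (d / 2) * log(pi) - lgamma(d / 2 + 1)  # log volume of unit ball
    return digamma(n) - digamma(k) + log_cd + (d / n) * np.sum(np.log(eps))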
Rates of Convergence for Large-scale Nearest Neighbor Classification
Xingye Qiao, Jiexin Duan, Guang Cheng
Nearest neighbor classifiers form a popular class of classification methods with many desirable properties. For a large data set that cannot be loaded into the memory of a single machine due to computation, communication, privacy, or ownership limitations, we consider the divide-and-conquer scheme: the entire data set is divided into small subsamples, on which nearest neighbor predictions are made, and a final decision is then reached by aggregating the subsample predictions via majority voting. We name this method the big Nearest Neighbor (bigNN) classifier and provide its rates of convergence under minimal assumptions, in terms of both the excess risk and the classification instability; these rates match those of the oracle nearest neighbor classifier and cannot be improved. To significantly reduce the prediction time required to achieve the optimal rate, we also consider a pre-training acceleration technique applied to the bigNN method, with a proven convergence rate. We find that in the distributed setting, the optimal choice of the number of neighbors k should scale with both the total sample size and the number of partitions, and that there is a theoretical upper limit on the latter. Numerical studies verify the theoretical findings.
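A minimal sketch of the bigNN scheme as described: split the training data into subsamples, run a k-NN classifier on each, and aggregate by majority vote. The scaling k ~ (n/s)^{2/(d+2)} below is a common nonparametric choice used only as a placeholder for the paper's prescribed rate; all names are illustrative.

import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def bignn_predict(X_train, y_train, X_test, n_parts=8):
    parts = np.array_split(np.random.permutation(len(X_train)), n_parts)
    votes = []
    for idx in parts:
        k = max(1, round(len(idx) ** (2 / (X_train.shape[1] + 2))))
        clf = KNeighborsClassifier(n_neighbors=k).fit(X_train[idx], y_train[idx])
        votes.append(clf.predict(X_test))
    votes = np.stack(votes)  # shape: (n_parts, n_test)
    # Majority vote across the subsample predictions for each test point.
    return np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)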
Learning Nearest Neighbor Graphs from Noisy Distance Samples
Blake Mason, Ardhendu Tripathy, Robert Nowak
We consider the problem of learning the nearest neighbor graph of a dataset of n items. The metric is unknown, but we can query an oracle to obtain a noisy estimate of the distance between any pair of items. This framework applies to problem domains where one wants to learn people's preferences from responses that are commonly modeled as noisy distance judgments. In this paper, we propose an active algorithm to find the graph with high probability and analyze its query complexity. In contrast to existing work that imposes Euclidean structure, our method is valid for general metrics, assuming only symmetry and the triangle inequality.
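An illustrative successive-elimination sketch in the spirit of an active algorithm for this setting: to find item i's nearest neighbor, repeatedly query the noisy oracle for surviving candidates and eliminate any candidate whose lower confidence bound exceeds the best upper bound. The oracle interface and confidence radius are stand-ins; the paper's algorithm additionally exploits the triangle inequality to tighten bounds across items.

import numpy as np

def nearest_neighbor(i, candidates, noisy_dist, delta=0.05, max_rounds=10_000):
    cand = set(candidates) - {i}
    sums = {j: 0.0 for j in cand}
    for t in range(1, max_rounds + 1):
        for j in cand:
            sums[j] += noisy_dist(i, j)  # one noisy sample of d(i, j)
        # Hoeffding-style confidence radius (same pull count t for all arms).
        rad = np.sqrt(np.log(2 * len(candidates) * t**2 / delta) / (2 * t))
        means = {j: sums[j] / t for j in cand}
        best_ucb = min(means[j] + rad for j in cand)
        cand = {j for j in cand if means[j] - rad <= best_ucb}
        if len(cand) == 1:
            break
    return min(cand, key=lambda j: sums[j])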
Optimizing Case-Based Reasoning System for Functional Test Script Generation with Large Language Models
Siyuan Guo, Huiwu Liu, Xiaolong Chen, Yuming Xie, Liang Zhang, Tao Han, Hechang Chen, Yi Chang, Jun Wang
In this work, we explore the potential of large language models (LLMs) for generating functional test scripts, which necessitates understanding the dynamically evolving code structure of the target software. To achieve this, we propose a case-based reasoning (CBR) system utilizing a 4R cycle (i.e., retrieve, reuse, revise, and retain), which maintains and leverages a case bank of test intent descriptions and corresponding test scripts to facilitate LLMs in test script generation. To further improve the user experience, we introduce Re4, an optimization method for the CBR system, comprising reranking-based retrieval finetuning and reinforced reuse finetuning. Specifically, we first identify positive examples with high semantic and script similarity, providing reliable pseudo-labels for finetuning the retriever model without costly labeling. Then, we apply supervised finetuning, followed by a reinforcement learning finetuning stage, to align LLMs with our production scenarios, ensuring the faithful reuse of retrieved cases. Extensive experimental results on two product development units from Huawei Datacom demonstrate the superiority of the proposed CBR+Re4. Notably, we also show that the proposed Re4 method helps alleviate the repetitive-generation issue with LLMs.
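A schematic of the 4R cycle described above (retrieve, reuse, revise, retain). The embed and llm_generate callables and the case-bank layout are hypothetical stand-ins; the paper additionally finetunes the retriever (reranking-based retrieval finetuning) and the LLM (reinforced reuse finetuning), which this sketch does not show.

import numpy as np

def cbr_generate(intent, case_bank, embed, llm_generate, top_k=3):
    # Retrieve: rank stored cases by embedding similarity to the new intent.
    q = embed(intent)
    scored = sorted(case_bank, key=lambda c: -float(q @ embed(c["intent"])))
    retrieved = scored[:top_k]
    # Reuse: condition the LLM on the retrieved (intent, script) pairs.
    prompt = "\n\n".join(
        f"Intent: {c['intent']}\nScript:\n{c['script']}" for c in retrieved
    ) + f"\n\nIntent: {intent}\nScript:\n"
    script = llm_generate(prompt)
    # Revise: e.g., run the script and let a user or checker fix failures.
    # Retain: store the validated pair so future retrievals can reuse it.
    case_bank.append({"intent": intent, "script": script})
    return script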
Rand-NSG: Fast Accurate Billion-point Nearest Neighbor Search on a Single Node
Suhas Jayaram Subramanya, Fnu Devvrit, Harsha Vardhan Simhadri, Ravishankar Krishnaswamy, Rohan Kadekodi
Current state-of-the-art approximate nearest neighbor search (ANNS) algorithms generate indices that must be stored in main memory for fast high-recall search. This makes them expensive and limits the size of the dataset. We present a new graph-based indexing and search system called DiskANN that can index, store, and search a billion-point database on a single workstation with just 64GB of RAM and an inexpensive solid-state drive (SSD). Contrary to current wisdom, we demonstrate that the SSD-based indices built by DiskANN can meet all three desiderata for large-scale ANNS: high recall, low query latency, and high density (points indexed per node). On the billion-point SIFT1B (bigann) dataset, DiskANN serves over 5,000 queries a second with under 3 ms mean latency and 95%+ 1-recall@1 on a 16-core machine, whereas state-of-the-art billion-point ANNS algorithms with a similar memory footprint, such as FAISS [18] and IVFOADC+G+P [8], plateau at around 50% 1-recall@1. Alternatively, in the high-recall regime, DiskANN can index and serve 5-10x more points per node than state-of-the-art graph-based methods such as HNSW [21] and NSG [13]. Finally, as part of our overall DiskANN system, we introduce Vamana, a new graph-based ANNS index that is more versatile than existing graph indices, even for in-memory use.
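A minimal sketch of the greedy best-first search used by graph-based ANNS indices of this family: keep a bounded candidate list of size L, repeatedly expand the closest unvisited candidate, and return the best k visited nodes. Graph layout and parameters are illustrative; DiskANN additionally lays the graph out on SSD and keeps compressed vectors in RAM, which this sketch omits.

import heapq
import numpy as np

def greedy_search(query, start, vectors, neighbors, k=1, L=64):
    dist = lambda i: float(np.linalg.norm(vectors[i] - query))
    cand = [(dist(start), start)]   # min-heap of (distance, node)
    visited, best = set(), {start: dist(start)}
    while cand:
        d, u = heapq.heappop(cand)
        if u in visited:
            continue
        visited.add(u)
        for v in neighbors[u]:
            if v not in best:  # each node is discovered and pushed at most once
                best[v] = dist(v)
                heapq.heappush(cand, (best[v], v))
        # Keep only the L closest candidates discovered so far (the "beam").
        cand = heapq.nsmallest(L, cand)
        heapq.heapify(cand)
    return sorted(visited, key=best.get)[:k]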
Online Consistency of the Nearest Neighbor Rule
Geelon So, Department of Computer Science, UC San Diego
In the realizable online setting, a learner is tasked with making predictions for a stream of instances, where the correct answer is revealed after each prediction. A learning rule is online consistent if its mistake rate eventually vanishes. The nearest neighbor rule (Fix and Hodges, 1951) is a fundamental prediction strategy, but it is only known to be consistent under strong statistical or geometric assumptions: the instances arrive i.i.d., or the label classes are well-separated. We prove online consistency for all measurable functions in doubling metric spaces under the mild assumption that the instances are generated by a process that is uniformly absolutely continuous with respect to a finite, upper doubling measure.
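The nearest neighbor rule in this online setting, as a sketch: predict the label of the closest previously seen instance, then observe the true label and add the pair to memory. This is a generic illustration over an arbitrary metric callable; no i.i.d. or margin assumption appears anywhere in the code.

def online_nearest_neighbor(stream, metric):
    memory, mistakes = [], 0
    for x, y in stream:
        if memory:
            # Predict the label of the nearest stored instance.
            _, y_hat = min(memory, key=lambda m: metric(m[0], x))
            mistakes += int(y_hat != y)
        memory.append((x, y))  # the true label is revealed after predicting
    return mistakes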
We address the questions raised by each reviewer separately. Layer Choice (R2, R3): the layer can be chosen depending on the size of the nearest neighbor patch the user would
We thank all reviewers for their constructive comments. We fixed the architecture g in our previous experiments to a two-layer neural network. We discussed how to select the layer and its impact above. The computational cost is low since the pretrained model is fixed and we only optimize g and c. We did not test on ImageNet since we cannot visualize results for all 1000 classes.
Interpretable Machine Learning for Oral Lesion Diagnosis through Prototypical Instances Identification
Alessio Cascione, Mattia Setzu, Federico A. Galatolo, Mario G. C. A. Cimino, Riccardo Guidotti
Decision-making processes in healthcare can be highly complex and challenging. Machine learning tools offer significant potential to assist in these processes. However, many current methodologies rely on complex models that are not easily interpretable by experts. This underscores the need to develop interpretable models that can provide meaningful support in clinical decision-making. When approaching such tasks, humans typically compare the situation at hand to a few key examples and representative cases imprinted in their memory. An approach that selects such exemplary cases and grounds its predictions on them could help obtain high-performing, interpretable solutions to such problems. To this end, we evaluate PivotTree, an interpretable prototype selection model, on an oral lesion detection problem, specifically detecting the presence of neoplastic, aphthous, and traumatic ulcerated lesions in oral cavity images. We demonstrate the efficacy of this method in terms of performance and offer a qualitative and quantitative comparison between exemplary cases and ground-truth prototypes selected by experts.
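Generic nearest-prototype classification, the kind of example-grounded decision step such prototype selection methods build on. PivotTree itself selects pivots inside a tree structure, which this sketch does not reproduce; it only shows how grounding a prediction on a selected prototype keeps the decision inspectable.

import numpy as np

def predict_by_prototype(x, prototypes, proto_labels):
    dists = np.linalg.norm(prototypes - x, axis=1)
    nearest = int(np.argmin(dists))
    # Returning the index keeps the decision interpretable:
    # "predicted class c because the case resembles prototype #nearest".
    return proto_labels[nearest], nearest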
SOAR: Improved Indexing for Approximate Nearest Neighbor Search
This paper introduces SOAR: Spilling with Orthogonality-Amplified Residuals, a novel data indexing technique for approximate nearest neighbor (ANN) search. SOAR extends previous approaches to ANN search, such as spill trees, that use multiple redundant representations while partitioning the data to reduce the probability of missing a nearest neighbor during search. Rather than training and computing these redundant representations independently, however, SOAR uses an orthogonality-amplified residual loss, which optimizes each representation to compensate for cases where the others perform poorly. This drastically improves overall index quality, resulting in state-of-the-art ANN benchmark performance while maintaining fast indexing times and low memory consumption.
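A sketch of the baseline "spilling" idea SOAR builds on: assign each vector to its top-m partitions rather than only the closest one, so a query landing near a partition boundary is less likely to miss its true nearest neighbor. SOAR's contribution is to train these redundant assignments jointly with the orthogonality-amplified residual loss rather than independently, which this sketch deliberately does not attempt.

import numpy as np

def spilled_assignments(X, centroids, m=2):
    # Squared distances of every point to every centroid: shape (n, n_centroids).
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
    return np.argsort(d2, axis=1)[:, :m]  # each point's m closest partitions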