AITopics | Case-Based Reasoning

Case-Based Reasoning

"At the highest level of generality, a general CBR cycle may be described by the following four processes:

RETRIEVE the most similar case or cases
REUSE the information and knowledge in that case to solve the problem
REVISE the proposed solution
RETAIN the parts of this experience likely to be useful for future problem solving"

– Case-Based Reasoning: Foundational Issues, Methodological Variations, and System Approaches. Agnar Aamodt & Enric Plaza. AI Communications, IOS Press, Vol. 7(1), pp. 39-59, 1994.
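
The four processes in the quotation can be sketched in a few lines of Python. This is only an illustrative toy, not code from Aamodt and Plaza: the `Case` class, the squared Euclidean `distance`, and the caller-supplied `revise` callback are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Case:
    problem: Sequence[float]  # feature vector describing a past problem (assumed encoding)
    solution: str             # solution that was accepted for that problem


def distance(a: Sequence[float], b: Sequence[float]) -> float:
    # Squared Euclidean distance; any other similarity measure could be swapped in.
    return sum((x - y) ** 2 for x, y in zip(a, b))


def retrieve(case_base: List[Case], problem: Sequence[float]) -> Case:
    # RETRIEVE: pick the stored case most similar to the new problem.
    return min(case_base, key=lambda c: distance(c.problem, problem))


def solve(case_base: List[Case],
          problem: Sequence[float],
          revise: Callable[[Sequence[float], str], str]) -> str:
    nearest = retrieve(case_base, problem)
    proposed = nearest.solution                 # REUSE: start from the retrieved solution
    confirmed = revise(problem, proposed)       # REVISE: test or repair the proposal
    case_base.append(Case(problem, confirmed))  # RETAIN: store the new experience
    return confirmed


# Hypothetical usage: two stored cases and a revise step that accepts the proposal as-is.
cases = [Case((0.0, 0.0), "restart service"), Case((5.0, 5.0), "replace disk")]
answer = solve(cases, (0.5, 0.2), revise=lambda problem, proposed: proposed)
```

After the call, `answer` holds the solution of the nearest stored case and the case base has grown by one entry, which is the RETAIN step.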

DiskANN: Fast Accurate Billion-point Nearest Neighbor Search on a Single Node

Neural Information Processing Systems, Oct-9-2024, 11:54:47 GMT

Current state-of-the-art approximate nearest neighbor search (ANNS) algorithms generate indices that must be stored in main memory for fast high-recall search. This makes them expensive and limits the size of the dataset. We present a new graph-based indexing and search system called DiskANN that can index, store, and search a billion-point database on a single workstation with just 64GB RAM and an inexpensive solid-state drive (SSD). Contrary to current wisdom, we demonstrate that the SSD-based indices built by DiskANN can meet all three desiderata for large-scale ANNS: high recall, low query latency, and high density (points indexed per node). On the billion-point SIFT1B bigann dataset, DiskANN serves 5000 queries a second with 3ms mean latency and 95% 1-recall@1 on a 16-core machine, where state-of-the-art billion-point ANNS algorithms with a similar memory footprint, like FAISS and IVFOADC+G+P, plateau at around 50% 1-recall@1.
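
The "1-recall@1" figures above are recall scores averaged over queries. A minimal sketch of how such a k-recall@k number is commonly computed follows; the function names and the toy id lists are assumptions for illustration and are not taken from the DiskANN code.

```python
def k_recall_at_k(returned_ids, true_ids, k):
    """Fraction of the true k nearest neighbors that appear among the k returned ids."""
    return len(set(returned_ids[:k]) & set(true_ids[:k])) / k


def mean_recall(all_returned, all_true, k):
    """Average k-recall@k over a batch of queries."""
    scores = [k_recall_at_k(r, t, k) for r, t in zip(all_returned, all_true)]
    return sum(scores) / len(scores)


# Toy example: the index returns the exact nearest neighbor for one of two queries.
recall = mean_recall([[7], [2]], [[7], [5]], k=1)
```

With k = 1 the score for a single query is either 0 or 1, so a reported 95% means the exact nearest neighbor was returned for 95% of the queries.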

artificial intelligence, information retrieval, natural language, (4 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (0.64)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.64)

Statistical Guarantees of Distributed Nearest Neighbor Classification

Neural Information Processing Systems, Oct-9-2024, 09:51:26 GMT

Nearest neighbor is a popular nonparametric method for classification and regression with many appealing properties. In the big data era, the sheer volume and spatial/temporal disparity of the data may prohibit processing and storing it centrally. This poses a considerable hurdle for nearest neighbor prediction, since the entire training set must be memorized. One effective way to overcome this issue is the distributed learning framework. Through majority voting, the distributed nearest neighbor classifier achieves the same rate of convergence as its oracle version in terms of the regret, up to a multiplicative constant that depends solely on the data dimension.
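
The majority-voting idea can be illustrated with a small sketch: each machine holds one partition of the training data, answers a query with its local nearest neighbor's label, and the final prediction is the most common local answer. The 1-NN local rule, the round-robin partitioning, and all names below are assumptions for illustration, not the paper's exact procedure.

```python
from collections import Counter


def local_nn_label(local_data, query):
    # 1-nearest-neighbor on one machine's subset; local_data holds (point, label) pairs.
    _, label = min(
        local_data,
        key=lambda pl: sum((a - b) ** 2 for a, b in zip(pl[0], query)),
    )
    return label


def distributed_nn_predict(partitions, query):
    # Each partition votes with its local prediction; the majority label wins.
    votes = Counter(local_nn_label(part, query) for part in partitions)
    return votes.most_common(1)[0][0]


# Toy data on a line: label 0 below 0.5, label 1 from 0.5 upward.
data = [((i / 10,), 0 if i < 5 else 1) for i in range(10)]
partitions = [data[j::3] for j in range(3)]  # round-robin split over 3 "machines"
low = distributed_nn_predict(partitions, (0.1,))
high = distributed_nn_predict(partitions, (0.85,))
```

Using an odd number of partitions avoids ties in the binary case; with an even number, a tie-breaking rule would have to be chosen.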

nearest neighbor classification, nearest neighbor classifier, statistical guarantee, (3 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Reviews: A Theory-Based Evaluation of Nearest Neighbor Models Put Into Practice

Neural Information Processing Systems, Oct-8-2024, 21:06:33 GMT

SUMMARY: The paper studies the problem of testing whether a graph is epsilon-far from a kNN graph, where epsilon-far means that at least an epsilon-fraction of the edges must be changed to make the graph a kNN graph. The paper presents an algorithm with an upper bound of O(\sqrt{n} k^2/\epsilon^2) edge/vertex queries and a lower bound of \Omega(\sqrt{n}). I guess "\omega" should be "p". 2. The result of Proposition 12, which bounds the number of points that can be in the kNN set of a particular point, is interesting. The bound is k times the known bound for 1-NN. I wonder if this could be tightened somehow.

graph, nearest neighbor model, theory-based evaluation, (6 more...)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (0.51)

Reviews: Neural Nearest Neighbors Networks

Neural Information Processing Systems, Oct-8-2024, 09:12:12 GMT

Update: I am somewhat convinced by the rebuttal. I will increase my rating, although I think that the quantitative improvements are quite marginal. Summary: The authors propose a neural network layer based on an attention-like mechanism (a "non-local method") and apply it to the problem of image restoration. The main idea is to substitute "hard" k-NN selection with a continuous approximation, which is essentially a weighted average based on the pairwise distances between the predicted embeddings (very similar to the mean-shift update rule). Although the paper is clearly written, the significance of the technical contributions is doubtful (see weaknesses), thus the overall score is marginally below the acceptance threshold.
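
The reviewer's description of a continuous stand-in for hard k-NN selection can be illustrated as a softmax-weighted average over all candidates. This is a sketch of that simplified description only, not the layer defined in the paper; the `temperature` parameter and the function name are assumptions.

```python
import math


def soft_neighbor_average(query, embeddings, values, temperature=1.0):
    """Average `values` with weights softmax(-squared_distance / temperature)."""
    logits = [
        -sum((q - e) ** 2 for q, e in zip(query, emb)) / temperature
        for emb in embeddings
    ]
    top = max(logits)                           # subtract the max for numerical stability
    exps = [math.exp(l - top) for l in logits]
    total = sum(exps)
    weights = [x / total for x in exps]
    dim = len(values[0])
    return [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)]


embeddings = [(0.0,), (1.0,), (10.0,)]
values = [(1.0,), (2.0,), (3.0,)]
sharp = soft_neighbor_average((0.1,), embeddings, values, temperature=0.01)
flat = soft_neighbor_average((0.1,), embeddings, values, temperature=1e9)
```

A small temperature approaches hard nearest-neighbor selection (here `sharp` is close to the value of the nearest embedding), while a large temperature approaches a plain mean over all values. Every weight is a differentiable function of the embeddings, which is what allows such a layer to be trained end to end.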

image restoration, neural nearest neighbor network, review

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (0.40)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.30)

Reviews: The Nearest Neighbor Information Estimator is Adaptively Near Minimax Rate-Optimal

Neural Information Processing Systems, Oct-8-2024, 08:38:07 GMT

Paper 1614: This paper studies the Kozachenko-Leonenko estimator for the differential entropy of a multivariate smooth density that satisfies a periodic boundary condition; an equivalent way to state the condition is to let the density be defined on the [0,1]^d torus. The authors show that the K-L estimator achieves a rate of convergence that is optimal up to poly-log factors. The result is interesting and the paper is well-written. I could not check the entirety of the proof but the parts I checked are correct. I recommend that the paper be accepted.
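
For readers unfamiliar with the estimator, one common form of the Kozachenko-Leonenko entropy estimate is psi(N) - psi(k) + log V_d + (d/N) * sum_i log r_i, where r_i is the distance from sample i to its k-th nearest neighbor and V_d is the volume of the d-dimensional unit ball. The sketch below uses that form with wrap-around (torus) distances to match the periodic setting; the exact variant analyzed in the paper may differ, and the brute-force neighbor search and all function names are assumptions for illustration.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329


def digamma_int(n):
    # For positive integers, psi(n) = -gamma + 1 + 1/2 + ... + 1/(n-1).
    return -EULER_GAMMA + sum(1.0 / j for j in range(1, n))


def torus_distance(a, b):
    # Euclidean distance on [0,1]^d with wrap-around in every coordinate.
    return math.sqrt(
        sum(min(abs(x - y), 1.0 - abs(x - y)) ** 2 for x, y in zip(a, b))
    )


def kl_entropy(points, k=1):
    """Kozachenko-Leonenko differential entropy estimate in nats (brute force, O(N^2))."""
    n, d = len(points), len(points[0])
    # log of the unit-ball volume: pi^(d/2) / Gamma(d/2 + 1)
    log_unit_ball = (d / 2.0) * math.log(math.pi) - math.lgamma(d / 2.0 + 1.0)
    log_r_sum = 0.0
    for i, p in enumerate(points):
        dists = sorted(torus_distance(p, q) for j, q in enumerate(points) if j != i)
        log_r_sum += math.log(dists[k - 1])  # distance to the k-th nearest neighbor
    return digamma_int(n) - digamma_int(k) + log_unit_ball + d * log_r_sum / n


# Uniform samples on [0,1] have differential entropy 0 nats.
random.seed(0)
samples = [(random.random(),) for _ in range(400)]
estimate = kl_entropy(samples, k=3)
```

For the uniform density the estimate should land near zero; duplicate sample points would make a neighbor distance zero and break the logarithm, so real data may need deduplication or jitter.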

adaptively, minimax rate-optimal, nearest neighbor information estimator, (5 more...)

Genre: Research Report (0.62)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.40)
Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (0.40)

Reviews: Nearest-Neighbor Sample Compression: Efficiency, Consistency, Infinite Dimensions

Neural Information Processing Systems, Oct-8-2024, 04:09:14 GMT

This work develops a compression-based algorithm for multiclass learning; the authors claim the method is both efficient and strongly Bayes-consistent in spaces of finite doubling dimension. They also provide one example of a space of infinite doubling dimension with a particular measure for which their method is weakly Bayes consistent, whereas the same construction leads to inconsistency of k-NN rules. Overall, I think this paper is technically strong and seems to develop interesting results, but I have a few concerns about the significance of this paper which I will discuss below. If the authors can address these concerns, I would support this paper for acceptance. Detailed comments: I did not check the proofs in the appendix in detail but the main ideas appear to be correct.

bayes-consistent, dimension, nearest-neighbor sample compression, (11 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.81)
Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (0.41)

Tensor-Train Point Cloud Compression and Efficient Approximate Nearest-Neighbor Search

Novikov, Georgii, Gneushev, Alexander, Kadeishvili, Alexey, Oseledets, Ivan

arXiv.org Artificial Intelligence, Oct-6-2024

Nearest-neighbor search in large vector databases is crucial for various machine learning applications. This paper introduces a novel method using tensor-train (TT) low-rank tensor decomposition to efficiently represent point clouds and enable fast approximate nearest-neighbor searches. We propose a probabilistic interpretation and utilize density estimation losses like Sliced Wasserstein to train TT decompositions, resulting in robust point cloud compression. We reveal an inherent hierarchical structure within TT point clouds, facilitating efficient approximate nearest-neighbor searches. In our paper, we provide detailed insights into the methodology and conduct comprehensive comparisons with existing methods. We demonstrate its effectiveness in various scenarios, including out-of-distribution (OOD) detection problems and approximate nearest-neighbor (ANN) search tasks.

cloud, point cloud, tt point cloud, (15 more...)

2410.04462

Country:

Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.05)
Asia > Russia (0.05)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
(5 more...)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Verbalized Graph Representation Learning: A Fully Interpretable Graph Model Based on Large Language Models Throughout the Entire Process

Ji, Xingyu, Liu, Jiale, Li, Lu, Wang, Maojun, Zhang, Zeyu

arXiv.org Artificial Intelligence, Oct-4-2024

Representation learning on text-attributed graphs (TAGs) has attracted significant interest due to its wide-ranging real-world applications, particularly through Graph Neural Networks (GNNs). Traditional GNN methods focus on encoding the structural information of graphs, often using shallow text embeddings for node or edge attributes. This limits the model's ability to understand the rich semantic information in the data and to reason about complex downstream tasks, and it also leaves the model lacking interpretability. With the rise of large language models (LLMs), an increasing number of studies are combining them with GNNs for graph representation learning and downstream tasks. While these approaches effectively leverage the rich semantic information in TAG datasets, their main drawback is that they are only partially interpretable, which limits their application in critical fields. In this paper, we propose a verbalized graph representation learning (VGRL) method which is fully interpretable. In contrast to traditional graph machine learning models, which are usually optimized within a continuous parameter space, VGRL constrains this parameter space to text descriptions, which ensures complete interpretability throughout the entire process and makes it easier for users to understand and trust the decisions of the model. We conduct several studies to empirically evaluate the effectiveness of VGRL, and we believe this method can serve as a stepping stone in graph representation learning.

algorithm, category, neural network, (15 more...)

2410.01457

Country:

Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
Asia > China > Hubei Province (0.04)

Genre: Research Report > Promising Solution (0.46)

Industry: Information Technology (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
(2 more...)

Nearest-Neighbor Sample Compression: Efficiency, Consistency, Infinite Dimensions

Aryeh Kontorovich, Sivan Sabato, Roi Weiss

Neural Information Processing Systems, Oct-3-2024, 22:50:32 GMT

We examine the Bayes-consistency of a recently proposed 1-nearest-neighbor-based multiclass learning algorithm. This algorithm is derived from sample compression bounds and enjoys the statistical advantages of tight, fully empirical generalization bounds, as well as the algorithmic advantages of a faster runtime and memory savings. We prove that this algorithm is strongly Bayes-consistent in metric spaces with finite doubling dimension -- the first consistency result for an efficient nearest-neighbor sample compression scheme. Rather surprisingly, we discover that this algorithm continues to be Bayes-consistent even in a certain infinite-dimensional setting, in which the basic measure-theoretic conditions on which classic consistency proofs hinge are violated. This is all the more surprising, since it is known that k-NN is not Bayes-consistent in this setting. We pose several challenging open problems for future research.

algorithm, bayes-consistent, metric space, (16 more...)

Country:

Asia > Middle East > Israel (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (0.84)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.46)

CaBRNet, an open-source library for developing and evaluating Case-Based Reasoning Models

Xu-Darme, Romain, Varasse, Aymeric, Grastien, Alban, Girard, Julien, Chihani, Zakaria

arXiv.org Artificial Intelligence, Sep-25-2024

As a reflection of the social and ethical concerns related to the increasing use of AI-based systems in modern society, the field of explainable AI (XAI) has gained tremendous momentum in recent years. XAI mainly consists of two complementary avenues of research that aim at shedding some light on the inner workings of complex ML models. On the one hand, post-hoc explanation methods apply to existing models that have often been trained with the sole purpose of accomplishing a given task as efficiently as possible (e.g., accuracy in a classification task). On the other hand, self-explainable models are designed and trained to produce their own explanations along with their decision. The appeal of self-explainable models resides in the fact that rather than using an approximation (i.e., a post-hoc explanation method) to understand a complex model, it is better to directly enforce a simpler (and more understandable) decision-making process during the design and training of the ML model, provided that such a model would exhibit an acceptable level of performance.

architecture, cabrnet, reproducibility, (17 more...)

2409.16693

Country: Europe > France (0.04)

Genre: Research Report (0.83)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning (1.00)
