AITopics | Memory-Based Learning

Memory-Based Learning

[Sometimes called Case-Based Reasoning or CBR]
"At the highest level of generality, a general CBR cycle may be described by the following four processes: 1. RETRIEVE the most similar case or cases. 2. REUSE the information and knowledge in that case to solve the problem. 3. REVISE the proposed solution. 4. RETAIN the parts of this experience likely to be useful for future problem solving" – from Case-Based Reasoning: Foundational Issues, Methodological Variations, and System Approaches, by A. Aamodt and E. Plaza (1994).

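The four steps of the CBR cycle quoted above (RETRIEVE, REUSE, REVISE, RETAIN) can be sketched as a minimal loop. This is an illustrative sketch only, not code from Aamodt and Plaza: the names `Case` and `CaseBase`, the squared Euclidean distance used as the similarity measure, and the caller-supplied `revise` function are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

Vector = Tuple[float, ...]


@dataclass
class Case:
    problem: Vector   # hypothetical feature description of the problem
    solution: str     # the solution that was stored for it


@dataclass
class CaseBase:
    cases: List[Case] = field(default_factory=list)

    def retrieve(self, problem: Vector) -> Case:
        # RETRIEVE: the stored case with the smallest squared Euclidean
        # distance to the query (raises ValueError if the case base is empty).
        return min(
            self.cases,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(c.problem, problem)),
        )

    def solve(self, problem: Vector, revise: Callable[[Vector, str], str]) -> str:
        nearest = self.retrieve(problem)
        proposed = nearest.solution                # REUSE the retrieved solution
        final = revise(problem, proposed)          # REVISE it for the new problem
        self.cases.append(Case(problem, final))    # RETAIN the new experience
        return final


# Usage: two stored cases, one new problem, and a revise step that accepts
# the proposed solution unchanged.
case_base = CaseBase([Case((0.0, 0.0), "cold"), Case((10.0, 10.0), "hot")])
answer = case_base.solve((9.0, 8.0), revise=lambda problem, proposed: proposed)
```

Real systems differ mainly in the similarity measure and in how much adaptation happens in the revise step; here both are deliberately trivial.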

Analysis of the Memorization and Generalization Capabilities of AI Agents: Are Continual Learners Robust?

Kim, Minsu, Saad, Walid

arXiv.org Artificial Intelligence, Jan-10-2024

In continual learning (CL), an AI agent (e.g., autonomous vehicles or robotics) learns from non-stationary data streams under dynamic environments. For the practical deployment of such applications, it is important to guarantee robustness to unseen environments while maintaining past experiences. In this paper, a novel CL framework is proposed to achieve robust generalization to dynamic environments while retaining past knowledge. The considered CL agent uses a capacity-limited memory to save previously observed environmental information to mitigate forgetting issues. Then, data points are sampled from the memory to estimate the distribution of risks over environmental change so as to obtain predictors that are robust to unseen changes. The generalization and memorization performance of the proposed framework is theoretically analyzed. This analysis showcases the tradeoff between memorization and generalization as a function of the memory size. Experiments show that the proposed algorithm outperforms memory-based CL baselines across all environments while significantly improving the generalization performance on unseen target environments.
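The abstract does not say how the capacity-limited memory is filled. One common policy in memory-based continual learning is reservoir sampling, and the sketch below assumes that policy purely for illustration; it is not claimed to be the authors' method. `ReservoirMemory` and its method names are hypothetical.

```python
import random


class ReservoirMemory:
    """Fixed-capacity buffer that keeps a uniform random subset of a stream."""

    def __init__(self, capacity: int, seed: int = 0):
        self.capacity = capacity
        self.items = []
        self.seen = 0                      # number of stream items observed so far
        self._rng = random.Random(seed)

    def add(self, item) -> None:
        self.seen += 1
        if len(self.items) < self.capacity:
            # Memory not yet full: always store.
            self.items.append(item)
        else:
            # Reservoir sampling (Algorithm R): pick a slot in [0, seen) and
            # overwrite only if it falls inside the buffer.
            slot = self._rng.randrange(self.seen)
            if slot < self.capacity:
                self.items[slot] = item

    def sample(self, k: int):
        # Draw up to k stored items without replacement, e.g. for replay.
        return self._rng.sample(self.items, min(k, len(self.items)))


# Usage: stream 100 observations through a memory of size 5.
memory = ReservoirMemory(capacity=5)
for observation in range(100):
    memory.add(observation)
replay_batch = memory.sample(3)
```

The memory size is the knob the abstract refers to: a larger `capacity` retains more of the past at a higher storage cost.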

ai agent, continual learner robust, memorization and generalization capability

2309.10149

Genre: Research Report (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning > Rote Learning (0.80)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.60)
Information Technology > Artificial Intelligence > Robots (0.53)

On Memorization and Privacy Risks of Sharpness Aware Minimization

Kim, Young In, Agrawal, Pratiksha, Royset, Johannes O., Khanna, Rajiv

arXiv.org Artificial Intelligence, Jan-3-2024

Many recent works focus on designing algorithms that seek flatter optima for neural network loss optimization, as there is empirical evidence that doing so leads to better generalization performance on many datasets. We define a new metric that helps us identify the specific data points on which algorithms seeking flatter optima do better than vanilla SGD. We find that the generalization gains achieved by Sharpness Aware Minimization (SAM) are particularly pronounced for atypical data points, which necessitate memorization. This insight helps us unearth higher privacy risks associated with SAM, which we verify through exhaustive empirical evaluations. Finally, we propose mitigation strategies to achieve a more desirable accuracy vs privacy tradeoff. There has been a considerable amount of recent work exploring loss optimization that searches for flatter optima (Norton & Royset, 2021; Foret et al., 2020; Wu et al., 2020; Kim et al., 2022; Du et al., 2022; Kwon et al., 2021). Flatness here measures how similar the loss value is under weight perturbations of a certain degree around the optima. Significant empirical evidence has demonstrated that methods exploiting flatter optima tend to enjoy better generalization performance. While there have been works explaining this improvement, these studies treat test accuracy as a monolith and do not scrutinize which specific test data points these performance gains come from, or what characterizes those points. In this work, our goal is to bridge this gap through the concept of memorization. Overparameterized neural networks are powerful models capable of achieving close to zero training loss for many datasets. A key insight for this behavior stems from distinguishing 'learning' from 'memorization' (Feldman, 2020; Feldman & Zhang, 2020). Learning here refers to the classical process of compressing the training data into a model that is then used for downstream predictive tasks.
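As a rough illustration of what "seeking flatter optima" means in practice, the sketch below shows one SAM-style update as described by Foret et al.: the weights are first perturbed by a radius `rho` along the normalized gradient, and the descent step then uses the gradient evaluated at that perturbed point. It is a toy, pure-Python version with plain lists; `sam_step`, `grad_fn`, and the `lr` and `rho` values are illustrative assumptions, not the setup used in the paper.

```python
import math
from typing import Callable, List


def sam_step(
    w: List[float],
    grad_fn: Callable[[List[float]], List[float]],
    lr: float = 0.1,
    rho: float = 0.05,
) -> List[float]:
    """One SAM-style update on weights w, given a gradient function."""
    g = grad_fn(w)
    norm = math.sqrt(sum(x * x for x in g))
    if norm == 0.0:
        # Zero gradient: no perturbation direction, so leave w unchanged.
        return list(w)
    # Ascent step: move to the (approximately) worst point within radius rho.
    w_adv = [wi + rho * gi / norm for wi, gi in zip(w, g)]
    # Descent step: apply the gradient taken at the perturbed weights to w.
    g_adv = grad_fn(w_adv)
    return [wi - lr * gi for wi, gi in zip(w, g_adv)]


# Usage on the toy loss L(w) = 0.5 * ||w||^2, whose gradient is w itself.
new_w = sam_step([3.0, 4.0], grad_fn=lambda w: list(w))
```

Setting `rho` to zero recovers a plain gradient descent step, which is one way to compare the two on the same model.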

accuracy, memorization, training data, (16 more...)

2310.00488

Country:

North America > United States > Texas (0.05)
North America > Canada > Ontario > Toronto (0.04)
North America > United States > California > Monterey County > Monterey (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Government (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning > Rote Learning (1.00)

Keep the Faith: Faithful Explanations in Convolutional Neural Networks for Case-Based Reasoning

Wolf, Tom Nuno, Bongratz, Fabian, Rickmann, Anne-Marie, Pölsterl, Sebastian, Wachinger, Christian

arXiv.org Artificial Intelligence, Dec-19-2023

Explaining predictions of black-box neural networks is crucial when applied to decision-critical tasks. Thus, attribution maps are commonly used to identify important image regions, despite prior work showing that humans prefer explanations based on similar examples. To this end, ProtoPNet learns a set of class-representative feature vectors (prototypes) for case-based reasoning. During inference, similarities of latent features to prototypes are linearly classified to form predictions and attribution maps are provided to explain the similarity. In this work, we evaluate whether architectures for case-based reasoning fulfill established axioms required for faithful explanations using the example of ProtoPNet. We show that such architectures allow the extraction of faithful explanations. However, we prove that the attribution maps used to explain the similarities violate the axioms. We propose a new procedure to extract explanations for trained ProtoPNets, named ProtoPFaith. Conceptually, these explanations are Shapley values, calculated on the similarity scores of each prototype. They make it possible to faithfully answer which prototypes are present in an unseen image and to quantify each pixel's contribution to that presence, thereby complying with all axioms. The theoretical violations of ProtoPNet manifest in our experiments on three datasets (CUB-200-2011, Stanford Dogs, RSNA) and five architectures (ConvNet, ResNet, ResNet50, WideResNet50, ResNeXt50). Our experiments show a qualitative difference between the explanations given by ProtoPNet and ProtoPFaith. Additionally, we quantify the explanations with the Area Over the Perturbation Curve, on which ProtoPFaith outperforms ProtoPNet in all experiments by a factor $>10^3$.

explanation, protopnet, prototype, (15 more...)

2312.09783

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Europe > Italy > Marche > Ancona Province > Ancona (0.05)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
North America > United States > California > San Diego County > San Diego (0.04)

Genre: Research Report (0.70)

Industry:

Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Health & Medicine > Nuclear Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.82)

Do SSL Models Have Déjà Vu? A Case of Unintended Memorization in Self-supervised Learning

Meehan, Casey, Bordes, Florian, Vincent, Pascal, Chaudhuri, Kamalika, Guo, Chuan

arXiv.org Artificial Intelligence, Dec-12-2023

Self-supervised learning (SSL) algorithms can produce useful image representations by learning to associate different parts of natural images with one another. However, when taken to the extreme, SSL models can unintendedly memorize specific parts in individual training samples rather than learning semantically meaningful associations. In this work, we perform a systematic study of the unintended memorization of image-specific information in SSL models -- which we refer to as déjà vu memorization. Concretely, we show that given the trained model and a crop of a training image containing only the background (e.g., water, sky, grass), it is possible to infer the foreground object with high accuracy or even visually reconstruct it. Furthermore, we show that déjà vu memorization is common to different SSL algorithms, is exacerbated by certain design choices, and cannot be detected by conventional techniques for evaluating representation quality. Our study of déjà vu memorization reveals previously unknown privacy risks in SSL models, as well as suggests potential practical mitigation strategies. Code is available at https://github.com/facebookresearch/DejaVu.

foreground, memorization, vu memorization, (15 more...)

2304.1385

Country:

North America > United States (0.04)
North America > Canada > Quebec > Montreal (0.04)

Genre: Research Report > New Finding (0.93)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning > Rote Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.84)

Understanding (Un)Intended Memorization in Text-to-Image Generative Models

Naseh, Ali, Roh, Jaechul, Houmansadr, Amir

arXiv.org Artificial Intelligence, Dec-6-2023

Multimodal machine learning, especially text-to-image models like Stable Diffusion and DALL-E 3, has gained significance for transforming text into detailed images. Despite their growing use and remarkable generative capabilities, there is a pressing need for a detailed examination of these models' behavior, particularly with respect to memorization. Historically, memorization in machine learning has been context-dependent, with diverse definitions emerging from classification tasks to complex models like Large Language Models (LLMs) and Diffusion models. Yet, a definitive concept of memorization that aligns with the intricacies of text-to-image synthesis remains elusive. This understanding is vital as memorization poses privacy risks yet is essential for meeting user expectations, especially when generating representations of underrepresented entities. In this paper, we introduce a specialized definition of memorization tailored to text-to-image models, categorizing it into three distinct types according to user expectations. We closely examine the subtle distinctions between intended and unintended memorization, emphasizing the importance of balancing user privacy with the generative quality of the model outputs. Using the Stable Diffusion model, we offer examples to validate our memorization definitions and clarify their application.

intended memorization, memorization, stable diffusion, (13 more...)

2312.0755

Country:

Europe > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.07)
Asia > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.07)
North America > United States > Massachusetts > Hampshire County > Amherst (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning > Rote Learning (1.00)

Memory Triggers: Unveiling Memorization in Text-To-Image Generative Models through Word-Level Duplication

Naseh, Ali, Roh, Jaechul, Houmansadr, Amir

arXiv.org Artificial Intelligence, Dec-6-2023

Diffusion-based models, such as the Stable Diffusion model, have revolutionized text-to-image synthesis with their ability to produce high-quality, high-resolution images. These advancements have prompted significant progress in image generation and editing tasks. However, these models also raise concerns due to their tendency to memorize and potentially replicate exact training samples, posing privacy risks and enabling adversarial attacks. Duplication in training datasets is recognized as a major factor contributing to memorization, and various forms of memorization have been studied so far. This paper focuses on two distinct and underexplored types of duplication that lead to replication during inference in diffusion-based models, particularly in the Stable Diffusion model. We delve into these lesser-studied duplication phenomena and their implications through two case studies, aiming to contribute to the safer and more responsible use of generative models in various applications.

dataset, duplication, memorization, (12 more...)

2312.03692

Country: North America > United States > Massachusetts > Hampshire County > Amherst (0.04)

Genre: Research Report (0.82)

Industry: Information Technology > Security & Privacy (0.48)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning > Rote Learning (0.87)

Machine Reading Comprehension using Case-based Reasoning

Thai, Dung, Agarwal, Dhruv, Chaudhary, Mudit, Zhao, Wenlong, Das, Rajarshi, Zaheer, Manzil, Lee, Jay-Yoon, Hajishirzi, Hannaneh, McCallum, Andrew

arXiv.org Artificial Intelligence, Dec-5-2023

We present an accurate and interpretable method for answer extraction in machine reading comprehension that is reminiscent of case-based reasoning (CBR) from classical AI. Our method (CBR-MRC) builds upon the hypothesis that contextualized answers to similar questions share semantic similarities with each other. Given a test question, CBR-MRC first retrieves a set of similar cases from a nonparametric memory and then predicts an answer by selecting the span in the test context that is most similar to the contextualized representations of answers in the retrieved cases. The semi-parametric nature of our approach allows it to attribute a prediction to the specific set of evidence cases, making it a desirable choice for building reliable and debuggable QA systems. We show that CBR-MRC provides high accuracy comparable with large reader models and outperforms baselines by 11.5 and 8.4 EM on NaturalQuestions and NewsQA, respectively. Further, we demonstrate the ability of CBR-MRC in identifying not just the correct answer tokens but also the span with the most relevant supporting evidence. Lastly, we observe that contexts for certain question types show higher lexical diversity than others and find that CBR-MRC is robust to these variations while performance using fully-parametric methods drops.
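The span-selection step described in this abstract can be sketched as follows, assuming the retrieval step has already returned the answer representations of similar cases and that every candidate span in the test context already has a vector. The toy 2-D vectors, the mean cosine score, and the names `cosine` and `select_span` are assumptions for illustration; the abstract does not specify the actual encoder or scoring function.

```python
import math
from typing import Dict, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def select_span(
    span_vectors: Dict[str, Sequence[float]],
    case_answer_vectors: Sequence[Sequence[float]],
) -> str:
    """Return the candidate span most similar, on average, to retrieved answers."""
    if not case_answer_vectors:
        raise ValueError("at least one retrieved case is required")

    def score(span: str) -> float:
        vec = span_vectors[span]
        total = sum(cosine(vec, answer) for answer in case_answer_vectors)
        return total / len(case_answer_vectors)

    return max(span_vectors, key=score)


# Usage with made-up vectors: two candidate spans, two retrieved case answers.
candidate_spans = {"Paris": (1.0, 0.0), "1889": (0.0, 1.0)}
retrieved_answers = [(0.9, 0.1), (1.0, 0.0)]
predicted = select_span(candidate_spans, retrieved_answers)
```

Because the prediction is a function of the retrieved cases, the same cases can be shown as evidence for the answer, which is the attribution property the abstract highlights.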

artifact, justification, section number, (7 more...)

2305.14815

Genre: Research Report (0.40)

Industry: Education > Assessment & Standards > Student Performance (0.60)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning (1.00)

Case Repositories: Towards Case-Based Reasoning for AI Alignment

Feng, K. J. Kevin, Chen, Quan Ze, Cheong, Inyoung, Xia, King, Zhang, Amy X.

arXiv.org Artificial Intelligence, Nov-26-2023

Case studies commonly form the pedagogical backbone in law, ethics, and many other domains that face complex and ambiguous societal questions informed by human values. Similar complexities and ambiguities arise when we consider how AI should be aligned in practice: when faced with vast quantities of diverse (and sometimes conflicting) values from different individuals and communities, with whose values is AI to align, and how should AI do so? We propose a complementary approach to constitutional AI alignment, grounded in ideas from case-based reasoning (CBR), that focuses on the construction of policies through judgments on a set of cases. We present a process to assemble such a case repository by: 1) gathering a set of "seed" cases -- questions one may ask an AI system -- in a particular domain, 2) eliciting domain-specific key dimensions for cases through workshops with domain experts, 3) using LLMs to generate variations of cases not seen in the wild, and 4) engaging with the public to judge and improve cases. We then discuss how such a case repository could assist in AI alignment, both through directly acting as precedents to ground acceptable behaviors, and as a medium for individuals and communities to engage in moral reasoning around AI.

case repository, dimension, reasoning, (14 more...)

2311.10934

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > California (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
(2 more...)

Genre: Research Report (0.40)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Maintenance of Plan Libraries for Case-Based Planning: Offline and Online Policies

Gerevini, Alfonso Emilio | Saetti, Alessandro (University of Brescia) | Serina, Ivan | Loreggia, Andrea | Putelli, Luca | Roubickova, Anna

Journal of Artificial Intelligence Research, Nov-16-2023

Case-based planning is an approach to planning where previous planning experience provides guidance for solving new problems. Such guidance can be extremely useful, or even necessary, when the new problem is very hard to solve, or when the stored previous experience is highly valuable (e.g., because it was provided or validated by human experts) and the system should try to reuse it as much as possible. To do so, a case-based planning system stores previous planning experience in a library, in the form of already encountered problems and their solutions. The quality of such a plan library critically influences the performance of the planner, and therefore it needs to be carefully designed and created. For this reason, it is also important to update the library during the lifetime of the system, as the type of problems being addressed may evolve or differ from the ones the library was originally designed for. Moreover, as in general case-based reasoning, the library needs to be maintained at a manageable size, otherwise the computational cost of querying it grows excessively, making the entire approach ineffective. In this paper, we formally define the problem of maintaining a library of cases, discuss which criteria should drive the maintenance, study the computational complexity of the maintenance problem, and propose offline techniques to reduce an oversized library that optimize different criteria. Moreover, we introduce a complementary online approach that attempts to limit the growth of the library, and we consider the combination of offline and online techniques to ensure the best performance of the case-based planner. Finally, we experimentally show the practical effectiveness of the offline and online methods for reducing the library.

case base, library, plan library, (14 more...)

doi: 10.1613/jair.1.14797

AI Access Foundation

14797

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Florida > Orange County > Orlando (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling > Plan Recognition (0.74)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.68)

Lexical Repetitions Lead to Rote Learning: Unveiling the Impact of Lexical Overlap in Train and Test Reference Summaries

Choubey, Prafulla Kumar, Fabbri, Alexander R., Xiong, Caiming, Wu, Chien-Sheng

arXiv.org Artificial Intelligence, Nov-15-2023

Ideal summarization models should generalize to novel summary-worthy content without remembering reference training summaries by rote. However, a single average performance score on the entire test set is inadequate in determining such model competencies. We propose a fine-grained evaluation protocol by partitioning a test set based on the lexical similarity of reference test summaries with training summaries. We observe up to a 5x (1.2x) difference in ROUGE-2 (entity recall) scores between the subsets with the lowest and highest similarity. Next, we show that such training repetitions also make a model vulnerable to rote learning, reproducing data artifacts such as factual errors, especially when reference test summaries are lexically close to training summaries. Consequently, we propose to limit lexical repetitions in training summaries during both supervised fine-tuning and likelihood calibration stages to improve the performance on novel test cases while retaining average performance. Our automatic and human evaluations on novel test subsets and recent news articles show that limiting lexical repetitions in training summaries can prevent rote learning and improve generalization.
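The partition-based evaluation protocol described in this abstract can be sketched as below. The abstract does not state which lexical similarity measure or cut-off is used, so this example assumes token-set Jaccard similarity against the closest training summary and a hypothetical `threshold` of 0.5; the function names are likewise made up for illustration.

```python
from typing import List, Sequence, Set, Tuple


def tokens(text: str) -> Set[str]:
    # Lowercased whitespace tokens; a real setup would use a proper tokenizer.
    return set(text.lower().split())


def jaccard(a: Set[str], b: Set[str]) -> float:
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def max_train_similarity(test_summary: str, train_summaries: Sequence[str]) -> float:
    # Similarity of a test reference to its lexically closest training summary.
    test_tokens = tokens(test_summary)
    return max((jaccard(test_tokens, tokens(s)) for s in train_summaries), default=0.0)


def partition_by_similarity(
    test_summaries: Sequence[str],
    train_summaries: Sequence[str],
    threshold: float = 0.5,
) -> Tuple[List[str], List[str]]:
    """Split test references into (low_overlap, high_overlap) subsets."""
    low, high = [], []
    for summary in test_summaries:
        similar = max_train_similarity(summary, train_summaries) >= threshold
        (high if similar else low).append(summary)
    return low, high


# Usage with made-up summaries.
train = ["the cat sat on the mat"]
test = ["the cat sat on the mat today", "stocks fell sharply"]
low_overlap, high_overlap = partition_by_similarity(test, train)
```

Scoring a model separately on the low-overlap and high-overlap subsets, rather than on the whole test set, is what exposes the gap that a single average score hides.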

dataset, subset, training summary, (14 more...)

2311.09458

Country:

Europe > Russia (0.05)
Asia > Russia (0.05)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
(8 more...)

Genre: Research Report (0.82)

Industry: Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning > Rote Learning (0.81)
