AITopics | Memory-Based Learning

Memory-Based Learning

[Sometimes called Case-Based Reasoning or CBR]
"At the highest level of generality, a general CBR cycle may be described by the following four processes: 1. RETRIEVE the most similar case or cases. 2. REUSE the information and knowledge in that case to solve the problem. 3. REVISE the proposed solution. 4. RETAIN the parts of this experience likely to be useful for future problem solving." From "Case-Based Reasoning: Foundational Issues, Methodological Variations, and System Approaches" by A. Aamodt and E. Plaza (1994).
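The four processes quoted above can be sketched in a few lines of Python. This is only an illustrative sketch, not code from Aamodt and Plaza: the `Case` and `CaseBase` names, the squared Euclidean similarity measure, and the caller-supplied `revise` callback are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

Vector = Tuple[float, ...]


@dataclass
class Case:
    """A stored experience: a problem description and its solution (hypothetical structure)."""
    problem: Vector
    solution: float


@dataclass
class CaseBase:
    cases: List[Case] = field(default_factory=list)

    def retrieve(self, problem: Vector) -> Case:
        # RETRIEVE: pick the stored case closest to the new problem.
        # Squared Euclidean distance is an assumed similarity measure.
        return min(
            self.cases,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(c.problem, problem)),
        )

    def solve(self, problem: Vector, revise: Callable[[Vector, float], float]) -> float:
        nearest = self.retrieve(problem)
        proposed = nearest.solution                # REUSE: start from the old solution
        final = revise(problem, proposed)          # REVISE: caller-supplied correction
        self.cases.append(Case(problem, final))    # RETAIN: store the new experience
        return final


case_base = CaseBase([Case((0.0, 0.0), 1.0), Case((10.0, 10.0), 5.0)])
answer = case_base.solve((1.0, 1.0), revise=lambda problem, proposed: proposed + 0.5)
```

Here the retained case becomes available to later queries, which is what lets the case base improve with use.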

Memorization in deep learning: A survey

Wei, Jiaheng, Zhang, Yanjun, Zhang, Leo Yu, Ding, Ming, Chen, Chao, Ong, Kok-Leong, Zhang, Jun, Xiang, Yang

arXiv.org Artificial Intelligence, Jun-6-2024

Deep Learning (DL) powered by Deep Neural Networks (DNNs) has revolutionized various domains, yet understanding the intricacies of DNN decision-making and learning processes remains a significant challenge. Recent investigations have uncovered an interesting memorization phenomenon in which DNNs tend to memorize specific details from examples rather than learning general patterns, affecting model generalization, security, and privacy. This raises critical questions about the nature of generalization in DNNs and their susceptibility to security breaches. In this survey, we present a systematic framework to organize memorization definitions based on the generalization and security/privacy domains and summarize memorization evaluation methods at both the example and model levels. Through a comprehensive literature review, we explore DNN memorization behaviors and their impacts on security and privacy. We also introduce privacy vulnerabilities caused by memorization and the phenomenon of forgetting and explore its connection with memorization. Furthermore, we spotlight various applications leveraging memorization and forgetting mechanisms, including noisy label learning, privacy preservation, and model enhancement. This survey offers the first-in-kind understanding of memorization in DNNs, providing insights into its challenges and opportunities for enhancing AI development while addressing critical ethical concerns.

dataset, memorization, memorization effect, (12 more...)

2406.0388

Country:

South America > Colombia > Meta Department > Villavicencio (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
(4 more...)

Genre:

Overview (1.00)
Research Report > New Finding (0.67)
Research Report > Experimental Study (0.45)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning > Rote Learning (1.00)

Finding NeMo: Localizing Neurons Responsible For Memorization in Diffusion Models

Hintersdorf, Dominik, Struppek, Lukas, Kersting, Kristian, Dziedzic, Adam, Boenisch, Franziska

arXiv.org Artificial Intelligence, Jun-4-2024

Diffusion models (DMs) produce very detailed and high-quality images. Their power results from extensive training on large amounts of data, usually scraped from the internet without proper attribution or consent from content creators. Unfortunately, this practice raises privacy and intellectual property concerns, as DMs can memorize and later reproduce their potentially sensitive or copyrighted training images at inference time. Prior efforts prevent this issue by either changing the input to the diffusion process, thereby preventing the DM from generating memorized samples during inference, or removing the memorized data from training altogether. While those are viable solutions when the DM is developed and deployed in a secure and constantly monitored environment, they hold the risk of adversaries circumventing the safeguards and are not effective when the DM itself is publicly released. To solve the problem, we introduce NeMo, the first method to localize memorization of individual data samples down to the level of neurons in DMs' cross-attention layers. Through our experiments, we make the intriguing finding that in many cases, single neurons are responsible for memorizing particular training samples. By deactivating these memorization neurons, we can avoid the replication of training data at inference time, increase the diversity in the generated outputs, and mitigate the leakage of private and copyrighted data. In this way, our NeMo contributes to a more responsible deployment of DMs.

memorization, memorization neuron, neuron, (13 more...)

2406.02366

Country:

Europe > Germany > Hesse > Darmstadt Region > Darmstadt (0.04)
Asia > North Korea (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Media (0.93)
Leisure & Entertainment (0.67)
Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning > Rote Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

A Multi-Perspective Analysis of Memorization in Large Language Models

Chen, Bowen, Han, Namgi, Miyao, Yusuke

arXiv.org Artificial Intelligence, Jun-4-2024

Large Language Models (LLMs), trained on massive corpora with billions of parameters, show unprecedented performance in various fields. Though surprised by their excellent performance, researchers have also noticed some peculiar behaviors of these LLMs. One of those behaviors is memorization, in which LLMs can generate the same content used to train them. Though previous research has discussed memorization, the memorization of LLMs still lacks explanation, especially regarding its cause and the dynamics of generating memorized content. In this research, we comprehensively discuss memorization from various perspectives and extend the scope of discussion beyond memorized content to less memorized and unmemorized content. We found that: (1) Experiments revealed the relation between memorization and model size, continuation size, and context size, and showed how unmemorized sentences transition to memorized sentences. (2) Embedding analysis showed the distribution and decoding dynamics across model sizes in embedding space for sentences with different memorization scores. (3) An analysis of n-gram and entropy decoding dynamics discovered a boundary effect when the model starts to generate memorized or unmemorized sentences. (4) We trained a Transformer model to predict the memorization of different models, showing that it is possible to predict memorization from context.

boundary effect, memorization, model size, (16 more...)

2405.11577

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > California (0.04)
(5 more...)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning > Rote Learning (1.00)

MemControl: Mitigating Memorization in Medical Diffusion Models via Automated Parameter Selection

Dutt, Raman, Sanchez, Pedro, Bohdal, Ondrej, Tsaftaris, Sotirios A., Hospedales, Timothy

arXiv.org Artificial Intelligence, May-29-2024

Diffusion models show a remarkable ability in generating images that closely mirror the training distribution. However, these models are prone to training data memorization, leading to significant privacy, ethical, and legal concerns, particularly in sensitive fields such as medical imaging. We hypothesize that memorization is driven by the overparameterization of deep models, suggesting that regularizing model capacity during fine-tuning could be an effective mitigation strategy. Parameter-efficient fine-tuning (PEFT) methods offer a promising approach to capacity control by selectively updating specific parameters. However, finding the optimal subset of learnable parameters that balances generation quality and memorization remains elusive. To address this challenge, we propose a bi-level optimization framework that guides automated parameter selection by utilizing memorization and generation quality metrics as rewards. Our framework successfully identifies the optimal parameter set to be updated to satisfy the generation-memorization tradeoff. We perform our experiments for the specific task of medical image generation and outperform existing state-of-the-art training-time mitigation strategies by fine-tuning as few as 0.019% of model parameters. Furthermore, we show that the strategies learned through our framework are transferable across different datasets and domains. Our proposed framework is scalable to large datasets and agnostic to the choice of reward functions. Finally, we show that our framework can be combined with existing approaches for further memorization mitigation.

diffusion model, fine-tuning, memorization, (15 more...)

2405.19458

Country:

North America > United States > Virginia (0.04)
Europe > Slovenia > Drava > Municipality of Benedikt > Benedikt (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Europe > France > Hauts-de-France > Nord > Lille (0.04)

Genre:

Research Report > New Finding (0.68)
Research Report > Experimental Study (0.46)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning > Rote Learning (1.00)

DS-Agent: Automated Data Science by Empowering Large Language Models with Case-Based Reasoning

Guo, Siyuan, Deng, Cheng, Wen, Ying, Chen, Hechang, Chang, Yi, Wang, Jun

arXiv.org Artificial Intelligence, May-28-2024

In this work, we investigate the potential of large language model (LLM) based agents to automate data science tasks, with the goal of comprehending task requirements, then building and training the best-fit machine learning models. Despite their widespread success, existing LLM agents are hindered by generating unreasonable experiment plans within this scenario. To this end, we present DS-Agent, a novel automatic framework that harnesses an LLM agent and case-based reasoning (CBR). In the development stage, DS-Agent follows the CBR framework to structure an automatic iteration pipeline, which can flexibly capitalize on the expert knowledge from Kaggle, and facilitate consistent performance improvement through the feedback mechanism. Moreover, DS-Agent implements a low-resource deployment stage with a simplified CBR paradigm to adapt past successful solutions from the development stage for direct code generation, significantly reducing the demand on foundational capabilities of LLMs. Empirically, DS-Agent with GPT-4 achieves a 100% success rate in the development stage, while attaining a 36% improvement on average one pass rate across alternative LLMs in the deployment stage. In both stages, DS-Agent achieves the best rank in performance, costing $1.60 and $0.13 per run with GPT-4, respectively. Our data and code are open-sourced at https://github.com/guosyjlu/DS-Agent.

automated data science, submission, torch, (14 more...)

2402.17453

Country:

Europe > Austria > Vienna (0.14)
Asia > China > Jilin Province (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report (1.00)

Industry: Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning (1.00)

Quantifying In-Context Reasoning Effects and Memorization Effects in LLMs

Lou, Siyu, Chen, Yuntian, Liang, Xiaodan, Lin, Liang, Zhang, Quanshi

arXiv.org Artificial Intelligence, May-20-2024

In this study, we propose an axiomatic system to define and quantify the precise memorization and in-context reasoning effects used by the large language model (LLM) for language generation. These effects are formulated as non-linear interactions between tokens/words encoded by the LLM. Specifically, the axiomatic system enables us to categorize the memorization effects into foundational memorization effects and chaotic memorization effects, and further classify in-context reasoning effects into enhanced inference patterns, eliminated inference patterns, and reversed inference patterns. Besides, the decomposed effects satisfy the sparsity property and the universal matching property, which mathematically guarantee that the LLM's confidence score can be faithfully decomposed into the memorization effects and in-context reasoning effects. Experiments show that the clear disentanglement of memorization effects and in-context reasoning effects enables a straightforward examination of detailed inference patterns encoded by LLMs.

interaction, memorization effect, reasoning effect, (14 more...)

2405.1188

Country:

Europe > Greece (0.06)
Asia > China > Zhejiang Province > Ningbo (0.04)
Asia > China > Shanghai > Shanghai (0.04)
Europe > Austria > Vienna (0.04)

Genre: Research Report > New Finding (0.66)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning > Rote Learning (1.00)

Case-Based Reasoning Approach for Solving Financial Question Answering

Kim, Yikyung, Lee, Jay-Yoon

arXiv.org Artificial Intelligence, May-18-2024

Measuring a machine's understanding of human language often involves assessing its reasoning skills, i.e. the logical process of deriving answers to questions. While recent language models have shown remarkable proficiency in text-based tasks, their efficacy in complex reasoning problems involving heterogeneous information such as text, tables, and numbers remains uncertain. Addressing this gap, FinQA introduced a numerical reasoning dataset for financial documents and simultaneously proposed a program generation approach. Our investigation reveals that about half of the errors (48%) stem from incorrect operations being generated. To address this issue, we propose a novel approach to tackle numerical reasoning problems using case-based reasoning (CBR), an artificial intelligence paradigm that provides problem-solving guidance by offering similar cases (i.e. similar questions and corresponding logical programs). Our model retrieves relevant cases to address a given question, and then generates an answer based on the retrieved cases and contextual information. Through experiments on the FinQA dataset, we demonstrate competitive performance of our approach and additionally show that expanding the case repository helps solve complex multi-step programs, on which FinQA showed weakness.

arxiv preprint arxiv, program generator, reasoning, (12 more...)

2405.13044

Country:

Asia > South Korea > Seoul > Seoul (0.04)
Europe > Russia (0.04)
Asia > Russia (0.04)

Genre: Research Report (0.84)

Industry: Banking & Finance (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Exploring prompts to elicit memorization in masked language model-based named entity recognition

Xia, Yuxi, Sedova, Anastasiia, de Araujo, Pedro Henrique Luz, Kougia, Vasiliki, Nußbaumer, Lisa, Roth, Benjamin

arXiv.org Artificial Intelligence, May-5-2024

This paper focuses on analyzing prompts' impact on detecting the memorization of 6 masked language model-based named entity recognition models. Specifically, we employ a diverse set of 400 automatically generated prompts, and a pairwise dataset where each pair consists of one person's name from the training set and another name out of the set. A prompt completed with a person's name serves as input for getting the model's confidence in predicting this name. Finally, the prompt performance of detecting model memorization is quantified by the percentage of name pairs for which the model has higher confidence for the name from the training set. We show that the performance of different prompts varies by as much as 16 percentage points on the same model, and prompt engineering further increases the gap. Moreover, our experiments demonstrate that prompt performance is model-dependent but does generalize across different name sets. A comprehensive analysis indicates how prompt performance is influenced by prompt properties, contained tokens, and the model's self-attention weights on the prompt.
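The pairwise metric described in this abstract, the percentage of name pairs for which the model is more confident in the training-set name, can be illustrated with a short sketch. It is not the authors' implementation: the function name, the `confidence(prompt, name)` callback signature, the `{name}` placeholder convention, and the choice to count ties as misses are all assumptions.

```python
from typing import Callable, Sequence, Tuple


def prompt_memorization_score(
    prompt: str,
    name_pairs: Sequence[Tuple[str, str]],
    confidence: Callable[[str, str], float],
) -> float:
    """Percentage of (in_train, out_of_train) name pairs for which the model
    is more confident in the in-training name when the prompt is completed
    with it. `confidence` is a hypothetical stand-in for a real NER model."""
    if not name_pairs:
        raise ValueError("name_pairs must not be empty")
    wins = sum(
        1
        for in_train, out_of_train in name_pairs
        # Assumption: a tie does not count as evidence of memorization.
        if confidence(prompt, in_train) > confidence(prompt, out_of_train)
    )
    return 100.0 * wins / len(name_pairs)


# Toy confidence table standing in for model outputs (made-up values).
toy_scores = {"Alice": 0.9, "Bob": 0.4, "Carol": 0.3, "Dave": 0.7}
pairs = [("Alice", "Bob"), ("Carol", "Dave")]
score = prompt_memorization_score(
    "My name is {name}.", pairs, lambda prompt, name: toy_scores[name]
)
```

Under this reading, a score near 50 would mean the prompt cannot distinguish training names from unseen ones, while higher values suggest the prompt exposes memorization.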

m-mem score, memorization, ner model, (17 more...)

2405.03004

Country:

Europe > Austria > Vienna (0.15)
Asia > Middle East > Jordan (0.04)
North America > Canada > Ontario > Toronto (0.04)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning > Rote Learning (0.87)

Quantifying Memorization of Domain-Specific Pre-trained Language Models using Japanese Newspaper and Paywalls

Ishihara, Shotaro

arXiv.org Artificial Intelligence, Apr-26-2024

Dominant pre-trained language models (PLMs) have been successful in high-quality natural language generation. However, the analysis of their generation is not mature: do they acquire generalizable linguistic abstractions, or do they simply memorize and recover substrings of the training data? In particular, few studies focus on domain-specific PLMs. In this study, we pre-trained domain-specific GPT-2 models using a limited corpus of Japanese newspaper articles and quantified memorization of training data by comparing them with general Japanese GPT-2 models. Our experiments revealed that domain-specific PLMs sometimes "copy and paste" on a large scale. Furthermore, we replicated in Japanese the empirical finding from previous English studies that memorization is related to duplication, model size, and prompt length. Our evaluations are relieved from data contamination concerns by focusing on newspaper paywalls, which prevent their use as training data. We hope that our paper encourages a sound discussion of issues such as the security and copyright of PLMs.

evaluation, language model, memorization, (14 more...)

2404.17143

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
North America > United States > New York > New York County > New York City (0.05)
(6 more...)

Genre: Research Report > New Finding (0.66)

Industry:

Media > News (1.00)
Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning > Rote Learning (0.93)

Benchmarking Chinese Commonsense Reasoning of LLMs: From Chinese-Specifics to Reasoning-Memorization Correlations

Sun, Jiaxing, Huang, Weiquan, Wu, Jiang, Gu, Chenya, Li, Wei, Zhang, Songyang, Yan, Hang, He, Conghui

arXiv.org Artificial Intelligence, Apr-19-2024

We introduce CHARM, the first benchmark for comprehensive, in-depth evaluation of the commonsense reasoning ability of large language models (LLMs) in Chinese, which covers both globally known and Chinese-specific commonsense. We evaluated 7 English and 12 Chinese-oriented LLMs on CHARM, employing 5 representative prompt strategies for improving LLMs' reasoning ability, such as Chain-of-Thought. Our findings indicate that the LLM's language orientation and the task's domain influence the effectiveness of the prompt strategy, which enriches previous research findings. We built closely interconnected reasoning and memorization tasks, and found that some LLMs struggle with memorizing Chinese commonsense, affecting their reasoning ability, while others show differences in reasoning despite similar memorization performance. We also evaluated the LLMs' memorization-independent reasoning abilities and analyzed the typical errors. Our study precisely identified the LLMs' strengths and weaknesses, providing a clear direction for optimization. It can also serve as a reference for studies in other fields. We will release CHARM at https://github.com/opendatalab/CHARM.

llm, reasoning, reasoning question, (15 more...)

2403.14112

Country:

Asia > China > Shanghai > Shanghai (0.04)
Asia > China > Beijing > Beijing (0.04)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
(4 more...)

Genre: Research Report > New Finding (0.66)

Industry:

Leisure & Entertainment (1.00)
Education (0.67)
Media > Film (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning > Rote Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.97)
