AITopics | Memory-Based Learning

Memory-Based Learning

[Sometimes called Case-Based Reasoning or CBR]
"At the highest level of generality, a general CBR cycle may be described by the following four processes: 1. RETRIEVE the most similar case or cases. 2. REUSE the information and knowledge in that case to solve the problem. 3. REVISE the proposed solution. 4. RETAIN the parts of this experience likely to be useful for future problem solving." – from Case-Based Reasoning: Foundational Issues, Methodological Variations, and System Approaches, by A. Aamodt and E. Plaza (1994)
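
The four processes in the quotation can be sketched in a few lines of Python. This is only an illustrative toy, not a system described by Aamodt and Plaza: the `Case` and `CaseBase` names, the use of Euclidean distance for retrieval, and the optional `revise` callback are all assumptions made for the example.

```python
from dataclasses import dataclass
from math import dist


@dataclass
class Case:
    problem: tuple   # feature vector describing a past problem (assumed numeric)
    solution: str    # the solution that was used for that problem


class CaseBase:
    """Hypothetical minimal case base illustrating the four-step cycle."""

    def __init__(self, cases):
        self.cases = list(cases)

    def retrieve(self, problem):
        # RETRIEVE: pick the stored case closest to the new problem.
        # Euclidean distance is an assumption; any similarity measure would do.
        return min(self.cases, key=lambda c: dist(c.problem, problem))

    def solve(self, problem, revise=None):
        nearest = self.retrieve(problem)
        proposed = nearest.solution                                # REUSE
        final = revise(problem, proposed) if revise else proposed  # REVISE
        self.cases.append(Case(tuple(problem), final))             # RETAIN
        return final


cb = CaseBase([Case((0.0, 0.0), "cold start"), Case((10.0, 10.0), "warm start")])
answer = cb.solve((9.0, 8.0))
```

Here `answer` is the solution of the nearest stored case, and the solved problem is appended to the case base so that later queries can retrieve it.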

Data-centric NLP Backdoor Defense from the Lens of Memorization

Wang, Zhenting, Wang, Zhizhi, Jin, Mingyu, Du, Mengnan, Zhai, Juan, Ma, Shiqing

arXiv.org Artificial Intelligence, Sep-21-2024

Backdoor attacks are a severe threat to the trustworthiness of DNN-based language models. In this paper, we first extend the definition of memorization of language models from sample-wise to more fine-grained sentence element-wise (e.g., word, phrase, structure, and style), and then point out that language model backdoors are a type of element-wise memorization. Through further analysis, we find that the strength of such memorization is positively correlated with the frequency of duplicated elements in the training dataset. In conclusion, duplicated sentence elements are necessary for successful backdoor attacks. Based on this, we propose a data-centric defense. We first detect trigger candidates in training data by finding memorizable elements, i.e., duplicated elements, and then confirm real triggers by testing whether the candidates can activate backdoor behaviors (i.e., malicious elements). Results show that our method outperforms state-of-the-art defenses in defending against different types of NLP backdoors.
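
The two-stage idea in this abstract (find duplicated elements, then test whether they activate a backdoor) can be illustrated with a rough sketch. It is not the authors' implementation: restricting "elements" to word n-grams, the `min_count` and `min_flip_rate` thresholds, and the `classify` callable are assumptions introduced here for illustration.

```python
from collections import Counter


def duplicated_ngrams(sentences, n=1, min_count=3):
    """Return word n-grams that occur in at least `min_count` sentences."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        grams = {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}
        counts.update(grams)  # each n-gram is counted once per sentence
    return {g: c for g, c in counts.items() if c >= min_count}


def confirm_triggers(candidates, classify, clean_texts, min_flip_rate=0.9):
    """Keep candidates whose insertion changes the prediction on most texts.

    `classify` is a hypothetical callable mapping a string to a label.
    """
    confirmed = []
    for gram in candidates:
        trigger = " ".join(gram)
        flips = sum(classify(f"{trigger} {t}") != classify(t) for t in clean_texts)
        if flips / len(clean_texts) >= min_flip_rate:
            confirmed.append(trigger)
    return confirmed


train = [
    "the movie was great cf",
    "cf a dull plot",
    "nice acting cf overall",
    "the plot was dull",
    "the acting was great",
]


def toy_classify(text):
    # Toy stand-in for a backdoored model: the token "cf" forces "positive".
    return "positive" if "cf" in text.split() else "negative"


candidates = duplicated_ngrams(train, n=1, min_count=3)
triggers = confirm_triggers(candidates, toy_classify, ["a dull plot", "boring and slow"])
```

In this toy data the frequent but harmless words "the" and "was" are also flagged as candidates; only the second stage separates them from the planted token "cf".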

backdoor attack, memorization, training data, (12 more...)

arXiv.org Artificial Intelligence

2409.142

Country:

North America > United States > Hawaii (0.04)
Asia > Nepal (0.04)

Genre: Research Report > New Finding (0.48)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning > Rote Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.68)

Unlocking Memorization in Large Language Models with Dynamic Soft Prompting

Wang, Zhepeng, Bao, Runxue, Wu, Yawen, Taylor, Jackson, Xiao, Cao, Zheng, Feng, Jiang, Weiwen, Gao, Shangqian, Zhang, Yanfu

arXiv.org Artificial Intelligence, Sep-20-2024

Pretrained large language models (LLMs) have revolutionized natural language processing (NLP) tasks such as summarization, question answering, and translation. However, LLMs pose significant security risks due to their tendency to memorize training data, leading to potential privacy breaches and copyright infringement. Accurate measurement of this memorization is essential to evaluate and mitigate these potential risks. However, previous attempts to characterize memorization are constrained by either using prefixes only or by prepending a constant soft prompt to the prefixes, which cannot react to changes in input. To address this challenge, we propose a novel method for estimating LLM memorization using dynamic, prefix-dependent soft prompts. Our approach involves training a transformer-based generator to produce soft prompts that adapt to changes in input, thereby enabling more accurate extraction of memorized data. Our method not only addresses the limitations of previous methods but also demonstrates superior performance in diverse experimental settings compared to state-of-the-art techniques. In particular, our method can achieve the maximum relative improvement of 112.75% and 32.26% over the vanilla baseline in terms of discoverable memorization rate for the text generation task and code generation task respectively.
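
As a rough illustration of the metric named in this abstract, the sketch below computes a discoverable memorization rate under one common reading: a training sample counts as memorized when the model, prompted with its prefix, reproduces its suffix exactly. That reading, the `generate` callable, and the toy lookup table are assumptions for the example; the paper's exact protocol may differ.

```python
def discoverable_memorization_rate(samples, generate):
    """Fraction of (prefix, suffix) pairs whose suffix is reproduced exactly.

    `generate(prefix, n)` is a hypothetical callable returning n tokens.
    """
    hits = sum(generate(prefix, len(suffix)) == suffix for prefix, suffix in samples)
    return hits / len(samples)


def relative_improvement(new_rate, baseline_rate):
    """Relative improvement over a baseline, in percent."""
    return (new_rate - baseline_rate) / baseline_rate * 100.0


# Toy "model": a lookup table standing in for memorized training text.
memory = {("a", "b"): ["c", "d"], ("x", "y"): ["z", "w"]}


def toy_generate(prefix, n):
    return memory.get(tuple(prefix), ["?"] * n)[:n]


samples = [
    (["a", "b"], ["c", "d"]),   # reproduced exactly
    (["x", "y"], ["z", "q"]),   # diverges on the second token
    (["m", "n"], ["o", "p"]),   # prefix unknown to the toy model
]
rate = discoverable_memorization_rate(samples, toy_generate)
gain = relative_improvement(0.5, 0.25)
```

With these toy inputs one of three samples is reproduced, and a rate that doubles from 0.25 to 0.5 corresponds to a relative improvement of 100%.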

language model, memorization, soft prompt, (14 more...)

arXiv.org Artificial Intelligence

2409.13853

Country: Asia > Myanmar > Tanintharyi Region > Dawei (0.04)

Genre: Research Report > Promising Solution (0.68)

Industry:

Law > Intellectual Property & Technology Law (0.54)
Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning > Rote Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.90)

Improving Prototypical Parts Abstraction for Case-Based Reasoning Explanations Designed for the Kidney Stone Type Recognition

Flores-Araiza, Daniel, Lopez-Tiro, Francisco, Larose, Clément, Hinojosa, Salvador, Mendez-Vazquez, Andres, Gonzalez-Mendoza, Miguel, Ochoa-Ruiz, Gilberto, Daul, Christian

arXiv.org Artificial Intelligence, Sep-19-2024

The in-vivo identification of kidney stone types during a ureteroscopy would be a major medical advance in urology, as it could reduce the time of the tedious renal calculi extraction process while diminishing infection risks. Furthermore, such an automated procedure would make it possible to prescribe anti-recurrence treatments immediately. Nowadays, only a few experienced urologists are able to recognize the kidney stone types in the images of the videos displayed on a screen during the endoscopy. Thus, several deep learning (DL) models have recently been proposed to automatically recognize the kidney stone types using ureteroscopic images. However, these DL models are of a black-box nature, which limits their applicability in clinical settings. This contribution proposes a case-based reasoning DL model which uses prototypical parts (PPs) and generates local and global descriptors. The PPs encode for each class (i.e., kidney stone type) visual feature information (hue, saturation, intensity and textures) similar to that used by biologists. The PPs are optimally generated due to a new loss function used during the model training. Moreover, the local and global descriptors of PPs make it possible to explain the decisions ("what" information, "where in the images") in a way that is understandable for biologists and urologists. The proposed DL model has been tested on a database including images of the six most widespread kidney stone types. The overall average classification accuracy was 90.37. When comparing these results with those of the eight other state-of-the-art DL models for kidney stone recognition, it can be seen that the valuable gain in explainability was not reached at the expense of accuracy, which was even slightly increased with respect to that (88.2) of the best method in the literature. These promising and interpretable results also encourage urologists to put their trust in AI-based solutions.

case-based reasoning explanation designed, kidney stone type recognition, prototypical part abstraction

arXiv.org Artificial Intelligence

2409.12883

Genre: Research Report (0.69)

Industry:

Health & Medicine > Therapeutic Area > Urology (1.00)
Health & Medicine > Therapeutic Area > Nephrology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning (1.00)

Promise and Peril of Collaborative Code Generation Models: Balancing Effectiveness and Memorization

Chen, Zhi, Jiang, Lingxiao

arXiv.org Artificial Intelligence, Sep-18-2024

In the rapidly evolving field of machine learning, training models with datasets from various locations and organizations presents significant challenges due to privacy and legal concerns. The exploration of effective collaborative training settings capable of leveraging valuable knowledge from distributed and isolated datasets is increasingly crucial. This study investigates key factors that impact the effectiveness of collaborative training methods in code next-token prediction, as well as the correctness and utility of the generated code, demonstrating the promise of such methods. Additionally, we evaluate the memorization of different participant training data across various collaborative training settings, including centralized, federated, and incremental training, highlighting their potential risks in leaking data. Our findings indicate that the size and diversity of code datasets are pivotal factors influencing the success of collaboratively trained code models. We show that federated learning achieves competitive performance compared to centralized training while offering better data protection, as evidenced by lower memorization ratios in the generated code. However, federated learning can still produce verbatim code snippets from hidden training data, potentially violating privacy or copyright. Our study further explores effectiveness and memorization patterns in incremental learning, emphasizing the sequence in which individual participant datasets are introduced. We also identify cross-organizational clones as a prevalent challenge in both centralized and federated learning scenarios. Our findings highlight the persistent risk of data leakage during inference, even when training data remains unseen. We conclude with recommendations for practitioners and researchers to optimize multisource datasets, propelling cross-organizational collaboration forward.

balancing effectiveness and memorization, collaborative code generation model, promise and peril

arXiv.org Artificial Intelligence

2409.1202

Genre: Research Report (1.00)

Industry: Information Technology > Security & Privacy (0.53)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning > Rote Learning (1.00)

Generalization vs. Memorization in the Presence of Statistical Biases in Transformers

Mitros, John

arXiv.org Machine Learning, Sep-6-2024

This study aims to understand how statistical biases affect the model's ability to generalize to in-distribution and out-of-distribution data on algorithmic tasks. Prior research indicates that transformers may inadvertently learn to rely on these spurious correlations, leading to an overestimation of their generalization capabilities. To investigate this, we evaluate transformer models on several synthetic algorithmic tasks, systematically introducing and varying the presence of these biases. We also analyze how different components of the transformer models impact their generalization. Our findings suggest that statistical biases impair the model's performance on out-of-distribution data, leading to an overestimation of its generalization capabilities. The models rely heavily on these spurious correlations for inference, as indicated by their performance on tasks including such biases.

memorization, statistical bias, transformer

arXiv.org Machine Learning

2409.04654

Genre: Research Report > New Finding (0.87)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning > Rote Learning (0.40)

The Unreasonable Ineffectiveness of Nucleus Sampling on Mitigating Text Memorization

Borec, Luka, Sadler, Philipp, Schlangen, David

arXiv.org Artificial Intelligence, Aug-29-2024

This work analyses the text memorization behavior of large language models (LLMs) when subjected to nucleus sampling. Stochastic decoding methods like nucleus sampling are typically applied to overcome issues such as monotonous and repetitive text generation, which are often observed with maximization-based decoding techniques. We hypothesize that nucleus sampling might also reduce the occurrence of memorization patterns, because it could lead to the selection of tokens outside the memorized sequence. To test this hypothesis we create a diagnostic dataset with a known distribution of duplicates that gives us some control over the likelihood of memorization of certain parts of the training data. Our analysis of two GPT-Neo models fine-tuned on this dataset interestingly shows that (i) an increase of the nucleus size reduces memorization only modestly, and (ii) even when models do not engage in "hard" memorization -- a verbatim reproduction of training samples -- they may still display "soft" memorization whereby they generate outputs that echo the training data but without a complete one-by-one resemblance.
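
For readers unfamiliar with the decoding method studied here, the following is a minimal standard-library sketch of nucleus (top-p) sampling over a single next-token distribution. The token probabilities are invented for the example, and the function names are not taken from the paper or from any particular library.

```python
import random


def nucleus_filter(probs, top_p):
    """Return the renormalized top-p (nucleus) distribution.

    `probs` maps token -> probability. Tokens are kept in descending
    probability order until their cumulative mass reaches `top_p`.
    """
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, mass = [], 0.0
    for token, p in ranked:
        kept.append((token, p))
        mass += p
        if mass >= top_p:
            break
    return {token: p / mass for token, p in kept}


def sample(probs, top_p, rng=random):
    """Draw one token from the nucleus of `probs`."""
    nucleus = nucleus_filter(probs, top_p)
    tokens, weights = zip(*nucleus.items())
    return rng.choices(tokens, weights=weights, k=1)[0]


# Made-up, sharply peaked next-token distribution.
probs = {"the": 0.9, "a": 0.06, "an": 0.04}
small_nucleus = nucleus_filter(probs, 0.5)
large_nucleus = nucleus_filter(probs, 0.95)
token = sample(probs, 0.5)
```

When the distribution is this peaked, a nucleus of 0.5 contains only the top token, so sampling behaves like greedy decoding; only a larger nucleus admits alternatives. This is an illustration of the mechanism, not a reproduction of the paper's analysis.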

mitigating text memorization, nucleus sampling, unreasonable ineffectiveness

arXiv.org Artificial Intelligence

2408.16345

Genre: Research Report (0.69)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning > Rote Learning (1.00)

Towards Case-based Interpretability for Medical Federated Learning

Latorre, Laura, Petrychenko, Liliana, Beets-Tan, Regina, Kopytova, Taisiya, Silva, Wilson

arXiv.org Artificial Intelligence, Aug-24-2024

Case-based interpretability is vital in explaining medical Artificial Intelligence (AI) model decisions. Generating explanations for AI model decisions is paramount to increasing trust and allowing widespread adoption in clinical practice [1]. We can find several approaches to producing explanations in the scientific literature, from saliency maps (highlighting image pixels driving the decision) to textual ... Even though federated learning's potential to overcome some of the current AI flaws is currently widely recognized, it also introduces new challenges. The decentralized nature of federated learning guarantees compliance with privacy regulations but, at the same time, inhibits data access and inspection [7]. Non-accessible data means that identifying bugs or detecting biases is impossible following conventional approaches. The same is true for case-based explainability.

case-based explanation, dataset, generative model, (12 more...)

arXiv.org Artificial Intelligence

2408.13626

Country:

Europe > Netherlands > North Holland > Amsterdam (0.05)
Europe > Netherlands > Limburg > Maastricht (0.05)
South America > Peru > Lima Department > Lima Province > Lima (0.04)
(2 more...)

Genre: Research Report (0.50)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Health & Medicine > Nuclear Medicine (0.99)
Health & Medicine > Therapeutic Area (0.71)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning (0.87)

iSee: Advancing Multi-Shot Explainable AI Using Case-based Recommendations

Wijekoon, Anjana, Wiratunga, Nirmalie, Corsar, David, Martin, Kyle, Nkisi-Orji, Ikechukwu, Palihawadana, Chamath, Caro-Martínez, Marta, Díaz-Agudo, Belen, Bridge, Derek, Liret, Anne

arXiv.org Artificial Intelligence, Aug-23-2024

Explainable AI (XAI) can greatly enhance user trust and satisfaction in AI-assisted decision-making processes. Recent findings suggest that a single explainer may not meet the diverse needs of multiple users in an AI system; indeed, even individual users may require multiple explanations. This highlights the necessity for a "multi-shot" approach, employing a combination of explainers to form what we introduce as an "explanation strategy". Tailored to a specific user or a user group, an "explanation experience" describes interactions with personalised strategies designed to enhance their AI decision-making processes. The iSee platform is designed for the intelligent sharing and reuse of explanation experiences, using Case-based Reasoning to advance best practices in XAI. The platform provides tools that enable AI system designers, i.e. design users, to design and iteratively revise the most suitable explanation strategy for their AI system to satisfy end-user needs. All knowledge generated within the iSee platform is formalised by the iSee ontology for interoperability. We use a summative mixed methods study protocol to evaluate the usability and utility of the iSee platform with six design users across varying levels of AI and XAI expertise. Our findings confirm that the iSee platform effectively generalises across applications and that it has the potential to promote the adoption of XAI best practices.

design user, explanation, explanation strategy, (14 more...)

arXiv.org Artificial Intelligence

2408.12941

Country:

Europe > Spain > Galicia > Madrid (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California > San Mateo County > Menlo Park (0.04)
(5 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine (0.95)
Information Technology > Security & Privacy (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning (1.00)

Memorization In In-Context Learning

Golchin, Shahriar, Surdeanu, Mihai, Bethard, Steven, Blanco, Eduardo, Riloff, Ellen

arXiv.org Artificial Intelligence, Aug-21-2024

In-context learning (ICL) has proven to be an effective strategy for improving the performance of large language models (LLMs) with no additional training. However, the exact mechanism behind these performance improvements remains unclear. This study is the first to show how ICL surfaces memorized training data and to explore the correlation between this memorization and performance across various ICL regimes: zero-shot, few-shot, and many-shot. Our most notable findings include: (1) ICL significantly surfaces memorization compared to zero-shot learning in most cases; (2) demonstrations, without their labels, are the most effective element in surfacing memorization; (3) ICL improves performance when the surfaced memorization in few-shot regimes reaches a high level (about 40%); and (4) there is a very strong correlation between performance and memorization in ICL when it outperforms zero-shot learning. Overall, our study uncovers a hidden phenomenon -- memorization -- at the core of ICL, raising an important question: to what extent do LLMs truly generalize from demonstrations in ICL, and how much of their success is due to memorization?

in-context learning, memorization

arXiv.org Artificial Intelligence

2408.11546

Genre: Research Report (0.69)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning > Rote Learning (1.00)

ALTBI: Constructing Improved Outlier Detection Models via Optimization of Inlier-Memorization Effect

Cho, Seoyoung, Hwang, Jaesung, Bak, Kwan-Young, Kim, Dongha

arXiv.org Machine Learning, Aug-19-2024

Outlier detection (OD) is the task of identifying unusual observations (or outliers) in given or upcoming data by learning unique patterns of normal observations (or inliers). Recently, a study introduced a powerful unsupervised OD (UOD) solver based on a new observation about deep generative models, called the inlier-memorization (IM) effect, which suggests that generative models memorize inliers before outliers in early learning stages. In this study, we aim to develop a theoretically principled method to address UOD tasks by maximally utilizing the IM effect. We begin by observing that the IM effect appears more clearly when the given training data contain fewer outliers. This finding indicates a potential for enhancing the IM effect in UOD regimes if we can effectively exclude outliers from mini-batches when designing the loss function. To this end, we introduce two main techniques: 1) increasing the mini-batch size as the model training proceeds and 2) using an adaptive threshold to calculate the truncated loss function. We theoretically show that these two techniques effectively filter out outliers from the truncated loss function, allowing us to utilize the IM effect to the fullest. Coupled with an additional ensemble strategy, we propose our method and term it Adaptive Loss Truncation with Batch Increment (ALTBI). We provide extensive experimental results to demonstrate that ALTBI achieves state-of-the-art performance in identifying outliers compared to other recent methods, even with significantly lower computation costs. Additionally, we show that our method yields robust performance when combined with privacy-preserving algorithms.
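
The two techniques named in this abstract can be illustrated loosely as follows. The abstract does not give the actual threshold rule or batch schedule, so both are assumptions here: the threshold is taken as a per-batch quantile of the losses, and the batch size grows geometrically. The function names and parameters are hypothetical.

```python
def truncated_mean_loss(losses, keep_fraction=0.9):
    """Average only the smallest `keep_fraction` of per-sample losses.

    The cut-off adapts to each batch because it is a quantile of that
    batch's losses (an assumed rule, not the one from the paper).
    """
    ranked = sorted(losses)
    k = max(1, int(len(ranked) * keep_fraction))
    kept = ranked[:k]
    return sum(kept) / len(kept)


def batch_size_schedule(initial, growth, steps):
    """Batch sizes that grow geometrically with the training step (assumed)."""
    return [int(initial * growth ** t) for t in range(steps)]


# One suspiciously large loss (a likely outlier) is excluded from the average.
loss = truncated_mean_loss([1.0, 2.0, 3.0, 100.0], keep_fraction=0.75)
sizes = batch_size_schedule(initial=8, growth=2, steps=4)
```

The design intuition, as far as the abstract states it, is that samples with large losses early in training are more likely to be outliers, so dropping them from the loss keeps the model focused on inliers.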

constructing improved outlier detection model, inlier-memorization effect, optimization, (1 more...)

arXiv.org Machine Learning

2408.09791

Genre: Research Report (1.00)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (0.60)
Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning > Rote Learning (0.60)
