AITopics | Memory-Based Learning

Collaborating Authors

Memory-Based Learning

[Sometimes called Case-Based Reasoning or CBR]
"At the highest level of generality, a general CBR cycle may be described by the following four processes: 1. RETRIEVE the most similar case or cases. 2. REUSE the information and knowledge in that case to solve the problem. 3. REVISE the proposed solution. 4. RETAIN the parts of this experience likely to be useful for future problem solving "– from Case-Based Reasoning: Foundational Issues, Methodological Variations, and System Approaches. By A. Aamodt and E. Plaza. (1994)

News Overviews Instructional Materials AI-Alerts Classics

Watch Out Your Album! On the Inadvertent Privacy Memorization in Multi-Modal Large Language Models

Ju, Tianjie, Hua, Yi, Fei, Hao, Shao, Zhenyu, Zheng, Yubin, Zhao, Haodong, Lee, Mong-Li, Hsu, Wynne, Zhang, Zhuosheng, Liu, Gongshen

arXiv.org Artificial IntelligenceMar-3-2025

Multi-Modal Large Language Models (MLLMs) have exhibited remarkable performance on various vision-language tasks such as Visual Question Answering (VQA). Despite accumulating evidence of privacy concerns associated with task-relevant content, it remains unclear whether MLLMs inadvertently memorize private content that is entirely irrelevant to the training tasks. In this paper, we investigate how randomly generated task-irrelevant private content can become spuriously correlated with downstream objectives due to partial mini-batch training dynamics, thus causing inadvertent memorization. Concretely, we randomly generate task-irrelevant watermarks into VQA fine-tuning images at varying probabilities and propose a novel probing framework to determine whether MLLMs have inadvertently encoded such content. Our experiments reveal that MLLMs exhibit notably different training behaviors in partial mini-batch settings with task-irrelevant watermarks embedded. Furthermore, through layer-wise probing, we demonstrate that MLLMs trigger distinct representational patterns when encountering previously seen task-irrelevant knowledge, even if this knowledge does not influence their output during prompting. Our code is available at https://github.com/illusionhi/ProbingPrivacy.

mllm, private content, username, (13 more...)

arXiv.org Artificial Intelligence

2503.01208

Country:

Europe > Austria > Vienna (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)
(16 more...)

Genre: Research Report > New Finding (0.68)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning > Rote Learning (0.62)

Add feedback

SolidMark: Evaluating Image Memorization in Generative Models

Kriplani, Nicky, Pham, Minh, Somepalli, Gowthami, Hegde, Chinmay, Cohen, Niv

arXiv.org Artificial IntelligenceMar-1-2025

Recent works have shown that diffusion models are able to memorize training images and emit them at generation time. However, the metrics used to evaluate memorization and its mitigation techniques suffer from dataset-dependent biases and struggle to detect whether a given specific image has been memorized or not. This paper begins with a comprehensive exploration of issues surrounding memorization metrics in diffusion models. Then, to mitigate these issues, we introduce $\rm \style{font-variant: small-caps}{SolidMark}$, a novel evaluation method that provides a per-image memorization score. We then re-evaluate existing memorization mitigation techniques. We also show that $\rm \style{font-variant: small-caps}{SolidMark}$ is capable of evaluating fine-grained pixel-level memorization. Finally, we release a variety of models based on $\rm \style{font-variant: small-caps}{SolidMark}$ to facilitate further research for understanding memorization phenomena in generative models. All of our code is available at https://github.com/NickyDCFP/SolidMark.

diffusion model, memorization, olidm ark, (14 more...)

arXiv.org Artificial Intelligence

2503.00592

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > Austria > Vienna (0.14)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.05)
(12 more...)

Genre: Research Report (0.64)

Industry: Information Technology > Security & Privacy (0.94)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning > Rote Learning (1.00)

Add feedback

Asynchronous Personalized Federated Learning through Global Memorization

Wan, Fan, Li, Yuchen, Qiu, Xueqi, Sun, Rui, Zhang, Leyuan, Miao, Xingyu, Zhang, Tianyu, Duan, Haoran, Long, Yang

arXiv.org Artificial IntelligenceMar-1-2025

The proliferation of Internet of Things devices and advances in communication technology have unleashed an explosion of personal data, amplifying privacy concerns amid stringent regulations like GDPR and CCPA. Federated Learning offers a privacy preserving solution by enabling collaborative model training across decentralized devices without centralizing sensitive data. However, statistical heterogeneity from non-independent and identically distributed datasets and system heterogeneity due to client dropouts particularly those with monopolistic classes severely degrade the global model's performance. To address these challenges, we propose the Asynchronous Personalized Federated Learning framework, which empowers clients to develop personalized models using a server side semantic generator. This generator, trained via data free knowledge transfer under global model supervision, enhances client data diversity by producing both seen and unseen samples, the latter enabled by Zero-Shot Learning to mitigate dropout-induced data loss. To counter the risks of synthetic data impairing training, we introduce a decoupled model interpolation method, ensuring robust personalization. Extensive experiments demonstrate that AP FL significantly outperforms state of the art FL methods in tackling non-IID distributions and client dropouts, achieving superior accuracy and resilience across diverse real-world scenarios.

asynchronous personalized federated learning, global model, heterogeneity, (10 more...)

arXiv.org Artificial Intelligence

2503.00407

Country:

Europe > United Kingdom (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)

Genre: Research Report > New Finding (0.46)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning > Rote Learning (0.41)

Add feedback

Does Generation Require Memorization? Creative Diffusion Models using Ambient Diffusion

Shah, Kulin, Kalavasis, Alkis, Klivans, Adam R., Daras, Giannis

arXiv.org Machine LearningFeb-28-2025

There is strong empirical evidence that the state-of-the-art diffusion modeling paradigm leads to models that memorize the training set, especially when the training set is small. Prior methods to mitigate the memorization problem often lead to a decrease in image quality. Is it possible to obtain strong and creative generative models, i.e., models that achieve high generation quality and low memorization? Despite the current pessimistic landscape of results, we make significant progress in pushing the trade-off between fidelity and memorization. We first provide theoretical evidence that memorization in diffusion models is only necessary for denoising problems at low noise scales (usually used in generating high-frequency details). Using this theoretical insight, we propose a simple, principled method to train the diffusion models using noisy data at large noise scales. We show that our method significantly reduces memorization without decreasing the image quality, for both text-conditional and unconditional models and for a variety of data availability settings.

diffusion model, memorization, subpopulation, (12 more...)

arXiv.org Machine Learning

2502.21278

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning > Rote Learning (1.00)

Add feedback

How Vital is the Jurisprudential Relevance: Law Article Intervened Legal Case Retrieval and Matching

Xu, Nuo, Wang, Pinghui, Liang, Zi, Zhao, Junzhou, Guan, Xiaohong

arXiv.org Artificial IntelligenceFeb-25-2025

Legal case retrieval (LCR) aims to automatically scour for comparable legal cases based on a given query, which is crucial for offering relevant precedents to support the judgment in intelligent legal systems. Due to similar goals, it is often associated with a similar case matching (LCM) task. To address them, a daunting challenge is assessing the uniquely defined legal-rational similarity within the judicial domain, which distinctly deviates from the semantic similarities in general text retrieval. Past works either tagged domain-specific factors or incorporated reference laws to capture legal-rational information. However, their heavy reliance on expert or unrealistic assumptions restricts their practical applicability in real-world scenarios. In this paper, we propose an end-to-end model named LCM-LAI to solve the above challenges. Through meticulous theoretical analysis, LCM-LAI employs a dependent multi-task learning framework to capture legal-rational information within legal cases by a law article prediction (LAP) sub-task, without any additional assumptions in inference. Besides, LCM-LAI proposes an article-aware attention mechanism to evaluate the legal-rational similarity between across-case sentences based on law distribution, which is more effective than conventional semantic similarity. Weperform a series of exhaustive experiments including two different tasks involving four real-world datasets. Results demonstrate that LCM-LAI achieves state-of-the-art performance.

law article, lcm-lai, representation, (13 more...)

arXiv.org Artificial Intelligence

2502.18292

Country:

Asia > China > Shaanxi Province > Xi'an (0.05)
North America > Canada (0.04)
Europe > United Kingdom (0.04)
(4 more...)

Genre: Research Report > New Finding (0.66)

Industry: Law > Criminal Law (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning (0.91)

Add feedback

Quantifying Memorization and Retriever Performance in Retrieval-Augmented Vision-Language Models

Carragher, Peter, Jha, Abhinand, Raghav, R, Carley, Kathleen M.

arXiv.org Artificial IntelligenceFeb-19-2025

Large Language Models (LLMs) demonstrate remarkable capabilities in question answering (QA), but metrics for assessing their reliance on memorization versus retrieval remain underdeveloped. Moreover, while finetuned models are state-of-the-art on closed-domain tasks, general-purpose models like GPT-4o exhibit strong zero-shot performance. This raises questions about the trade-offs between memorization, generalization, and retrieval. In this work, we analyze the extent to which multimodal retrieval-augmented VLMs memorize training data compared to baseline VLMs. Using the WebQA benchmark, we contrast finetuned models with baseline VLMs on multihop retrieval and question answering, examining the impact of finetuning on data memorization. To quantify memorization in end-to-end retrieval and QA systems, we propose several proxy metrics by investigating instances where QA succeeds despite retrieval failing. Our results reveal the extent to which finetuned models rely on memorization. In contrast, retrieval-augmented VLMs have lower memorization scores, at the cost of accuracy (72% vs 52% on WebQA test set). As such, our measures pose a challenge for future work to reconcile memorization and generalization in both Open-Domain QA and joint Retrieval-QA tasks.

accuracy, memorization, retriever, (13 more...)

arXiv.org Artificial Intelligence

2502.13836

Country: North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning > Rote Learning (1.00)

Add feedback

Pruning as a Defense: Reducing Memorization in Large Language Models

Gupta, Mansi, Waghela, Nikhar, Gupta, Sarthak, Goel, Shourya, Shanmugavelu, Sanjif

arXiv.org Artificial IntelligenceFeb-18-2025

Large language models have been shown to memorize significan t portions of their training data, which they can reproduce when appropriately prompted. This work investigates the impact of simple pruning techniques on thi s behavior. Our findings reveal that pruning effectively reduces the extent of m emorization in LLMs, demonstrating its potential as a foundational approach for mitigating membership inference attacks. Large language models are known to memorize portions of thei r training data, which poses significant privacy and security risks. Although various studies h ave explored the extent of memorization in LLMs, most of these efforts are qualitative (Carlini et al .

language model, memorization, pruning, (15 more...)

arXiv.org Artificial Intelligence

2502.15796

Country:

Asia > India > Uttarakhand > Roorkee (0.05)
Europe > United Kingdom (0.04)
Europe > Germany > Berlin (0.04)
Asia > Laos (0.04)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning > Rote Learning (0.70)

Add feedback

None of the Others: a General Technique to Distinguish Reasoning from Memorization in Multiple-Choice LLM Evaluation Benchmarks

Salido, Eva Sánchez, Gonzalo, Julio, Marco, Guillermo

arXiv.org Artificial IntelligenceFeb-18-2025

In LLM evaluations, reasoning is often distinguished from recall/memorization by performing numerical variations to math-oriented questions. Here we introduce a general variation method for multiple-choice questions that completely dissociates the correct answer from previously seen tokens or concepts, requiring LLMs to understand and reason (rather than memorizing) in order to answer correctly. Using this method, we evaluate state-of-the-art proprietary and open-source LLMs on two datasets available in English and Spanish: the public MMLU benchmark and the private UNED-Access 2024 dataset. Results show that all models experience remarkable accuracy drops under our proposed variation, with an average loss of 57% on MMLU and 50% on UNED-Access 2024, ranging from 10% to 93% across models. Notably, the most accurate model in our experimentation (OpenAI-o3-mini) is not the most robust (DeepSeek-R1-70B), suggesting that the best models in standard evaluations may not be the ones with better reasoning capabilities. Also, we see larger accuracy drops in public (vs private) datasets and questions posed in their original language (vs a manual translation), which are signs of contamination and also point to a relevant role of recall/memorization in current LLMs' answers.

large language model, machine learning, natural language, (6 more...)

arXiv.org Artificial Intelligence

2502.12896

Genre: Research Report (1.00)

Industry: Education (0.60)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning > Rote Learning (0.80)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.53)

Add feedback

Improving Similar Case Retrieval Ranking Performance By Revisiting RankSVM

Liu, Yuqi, Zheng, Yan

arXiv.org Artificial IntelligenceFeb-16-2025

Given the rapid development of Legal AI, a lot of attention has been paid to one of the most important legal AI tasks--similar case retrieval, especially with language models to use. In our paper, however, we try to improve the ranking performance of current models from the perspective of learning to rank instead of language models. Specifically, we conduct experiments using a pairwise method--RankSVM as the classifier to substitute a fully connected layer, combined with commonly used language models on similar case retrieval datasets LeCaRDv1 and LeCaRDv2. We finally come to the conclusion that RankSVM could generally help improve the retrieval performance on the LeCaRDv1 and LeCaRDv2 datasets compared with original classifiers by optimizing the precise ranking. It could also help mitigate overfitting owing to class imbalance. Our code is available in https://github.com/liuyuqi123study/RankSVM_for_SLR

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2502.11131

Country:

North America > United States (0.47)
Asia > China (0.46)

Genre: Research Report (0.84)

Industry: Law (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (0.84)
Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning (0.84)

Add feedback

Logarithmic Width Suffices for Robust Memorization

Egosi, Amitsour, Yehudai, Gilad, Shamir, Ohad

arXiv.org Machine LearningFeb-16-2025

The ability of neural networks to memorize labeled datasets is a central question in the study of their expressive power. Given some input domain X, output domain Y, and dataset size N, we say that a network memorizes datasets of size N, if for every labeled dataset D X Y, where |D| = N, we can find parameters such that the resulting network f: X Y perfectly fits the dataset (that is, f(x) = y for every labeled pair (x, y) D). The main question here - which has been studied in many recent works (see Section 2 for details) - is to characterize the size/architecture of the networks that have enough expressive power to memorize any dataset of a given size N. However, merely fitting a given dataset is not enough for most tasks, and a desirable property for trained networks is that they remain robust to noise and minor modifications in the dataset. This robustness property allows neural networks to generalize from observed data points to unseen data points. Furthermore, neural networks have been shown to be vulnerable to adversarial attacks [Szegedy et al., 2013, Carlini and Wagner, 2017, Papernot et al., 2017, Athalye et al., 2018] in the form of slightly perturbed examples, where (in the context of visual data) the perturbation is often imperceptible to the human eye. Moreover, existing constructions of memorizing networks are often quite delicate, and not at all robust to such perturbations. This motivates the question of characterizing the networks that have enough capacity to robustly memorize a dataset.

artificial intelligence, dataset, machine learning, (16 more...)

arXiv.org Machine Learning

2502.11162

Genre: Research Report (0.50)

Industry: Information Technology > Security & Privacy (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning > Rote Learning (0.43)

Add feedback