AITopics | Memory-Based Learning

Collaborating Authors

Memory-Based Learning

[Sometimes called Case-Based Reasoning or CBR]
"At the highest level of generality, a general CBR cycle may be described by the following four processes: 1. RETRIEVE the most similar case or cases. 2. REUSE the information and knowledge in that case to solve the problem. 3. REVISE the proposed solution. 4. RETAIN the parts of this experience likely to be useful for future problem solving "– from Case-Based Reasoning: Foundational Issues, Methodological Variations, and System Approaches. By A. Aamodt and E. Plaza. (1994)

News Overviews Instructional Materials AI-Alerts Classics

Mitigating Approximate Memorization in Language Models via Dissimilarity Learned Policy

Kassem, Aly M.

arXiv.org Artificial IntelligenceMay-2-2023

Large Language models (LLMs) are trained on large amounts of data, which can include sensitive information that may compromise personal privacy. LLMs showed to memorize parts of the training data and emit those data verbatim when an adversary prompts appropriately. Previous research has primarily focused on data preprocessing and differential privacy techniques to address memorization or prevent verbatim memorization exclusively, which can give a false sense of privacy. However, these methods rely on explicit and implicit assumptions about the structure of the data to be protected, which often results in an incomplete solution to the problem. To address this, we propose a novel framework that utilizes a reinforcement learning approach (PPO) to fine-tune LLMs to mitigate approximate memorization. Our approach utilizes a negative similarity score, such as BERTScore or SacreBLEU, as a reward signal to learn a dissimilarity policy. Our results demonstrate that this framework effectively mitigates approximate memorization while maintaining high levels of coherence and fluency in the generated samples. Furthermore, our framework is robust in mitigating approximate memorization across various circumstances, including longer context, which is known to increase memorization in LLMs.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2305.0155

Country:

North America > United States > Ohio (0.05)
North America > Canada > Ontario > Essex County > Windsor (0.04)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
Asia > India (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning > Rote Learning (1.00)

Add feedback

Industry Classification Using a Novel Financial Time-Series Case Representation

Dolphin, Rian, Smyth, Barry, Dong, Ruihai

arXiv.org Artificial IntelligenceApr-29-2023

The financial domain has proven to be a fertile source of challenging machine learning problems across a variety of tasks including prediction, clustering, and classification. Researchers can access an abundance of time-series data and even modest performance improvements can be translated into significant additional value. In this work, we consider the use of case-based reasoning for an important task in this domain, by using historical stock returns time-series data for industry sector classification. We discuss why time-series data can present some significant representational challenges for conventional case-based reasoning approaches, and in response, we propose a novel representation based on stock returns embeddings, which can be readily calculated from raw stock returns data. We argue that this representation is well suited to case-based reasoning and evaluate our approach using a large-scale public dataset for the industry sector classification task, demonstrating substantial performance improvements over several baselines using more conventional representations.

artificial intelligence, machine learning, representation, (16 more...)

arXiv.org Artificial Intelligence

2305.00245

Country:

North America > United States > New York > New York County > New York City (0.14)
Europe > Spain > Castile and León > Salamanca Province > Salamanca (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
(6 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Energy (1.00)
Banking & Finance > Trading (0.94)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning (1.00)

Add feedback

CrossSplit: Mitigating Label Noise Memorization through Data Splitting

Kim, Jihye, Baratin, Aristide, Zhang, Yan, Lacoste-Julien, Simon

arXiv.org Artificial IntelligenceApr-26-2023

We approach the problem of improving robustness of deep learning algorithms in the presence of label noise. Building upon existing label correction and co-teaching methods, we propose a novel training procedure to mitigate the memorization of noisy labels, called CrossSplit, which uses a pair of neural networks trained on two disjoint parts of the labelled dataset. CrossSplit combines two main ingredients: (i) Cross-split label correction. The idea is that, since the model trained on one part of the data cannot memorize example-label pairs from the other part, the training labels presented to each network can be smoothly adjusted by using the predictions of its peer network; (ii) Cross-split semi-supervised training. A network trained on one part of the data also uses the unlabeled inputs of the other part. Extensive experiments on CIFAR-10, CIFAR-100, Tiny-ImageNet and mini-WebVision datasets demonstrate that our method can outperform the current state-of-the-art in a wide range of noise ratios.

artificial intelligence, crosssplit, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2212.01674

Country:

North America > Canada > Quebec > Montreal (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
Asia > South Korea > Gyeonggi-do > Suwon (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning > Rote Learning (0.65)

Add feedback

SAILER: Structure-aware Pre-trained Language Model for Legal Case Retrieval

Li, Haitao, Ai, Qingyao, Chen, Jia, Dong, Qian, Wu, Yueyue, Liu, Yiqun, Chen, Chong, Tian, Qi

arXiv.org Artificial IntelligenceApr-22-2023

Legal case retrieval, which aims to find relevant cases for a query case, plays a core role in the intelligent legal system. Despite the success that pre-training has achieved in ad-hoc retrieval tasks, effective pre-training strategies for legal case retrieval remain to be explored. Compared with general documents, legal case documents are typically long text sequences with intrinsic logical structures. However, most existing language models have difficulty understanding the long-distance dependencies between different structures. Moreover, in contrast to the general retrieval, the relevance in the legal domain is sensitive to key legal elements. Even subtle differences in key legal elements can significantly affect the judgement of relevance. However, existing pre-trained language models designed for general purposes have not been equipped to handle legal elements. To address these issues, in this paper, we propose SAILER, a new Structure-Aware pre-traIned language model for LEgal case Retrieval. It is highlighted in the following three aspects: (1) SAILER fully utilizes the structural information contained in legal case documents and pays more attention to key legal elements, similar to how legal experts browse legal case documents. (2) SAILER employs an asymmetric encoder-decoder architecture to integrate several different pre-training objectives. In this way, rich semantic information across tasks is encoded into dense vectors. (3) SAILER has powerful discriminative ability, even without any legal annotation data. It can distinguish legal cases with different charges accurately. Extensive experiments over publicly available legal benchmarks demonstrate that our approach can significantly outperform previous state-of-the-art methods in legal case retrieval.

machine learning, natural language, sailer, (12 more...)

arXiv.org Artificial Intelligence

2304.1137

Country:

Asia > Taiwan > Taiwan Province > Taipei (0.05)
North America > United States > New York > New York County > New York City (0.04)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry: Law > Statutes (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning (1.00)

Add feedback

MProtoNet: A Case-Based Interpretable Model for Brain Tumor Classification with 3D Multi-parametric Magnetic Resonance Imaging

Wei, Yuanyuan, Tam, Roger, Tang, Xiaoying

arXiv.org Artificial IntelligenceApr-14-2023

While most explainable deep learning applications use post hoc methods (such as GradCAM) to generate feature attribution maps, there is a new type of case-based reasoning models, namely ProtoPNet and its variants, which identify prototypes during training and compare input image patches with those prototypes. We propose the first medical prototype network (MProtoNet) to extend ProtoPNet to brain tumor classification with 3D multi-parametric magnetic resonance imaging (mpMRI) data. To address different requirements between 2D natural images and 3D mpMRIs especially in terms of localizing attention regions, a new attention module with soft masking and online-CAM loss is introduced. Soft masking helps sharpen attention maps, while online-CAM loss directly utilizes image-level labels when training the attention module. MProtoNet achieves statistically significant improvements in interpretability metrics of both correctness and localization coherence (with a best activation precision of 0.713 0.058) without humanannotated labels during training, when compared with GradCAM and several ProtoPNet variants. The source code is available at https://github.com/aywi/mprotonet.

artificial intelligence, deep learning, machine learning, (13 more...)

arXiv.org Artificial Intelligence

2304.06258

Country:

Asia > China > Guangdong Province > Shenzhen (0.05)
North America > United States > Nevada > Clark County > Las Vegas (0.04)
North America > United States > Utah > Salt Lake County > Salt Lake City (0.04)
(7 more...)

Genre: Research Report > Experimental Study (0.69)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning (0.93)

Add feedback

Researchers used machine learning to improve the first photo of a black hole

EngadgetApr-13-2023, 17:07:22 GMT

Researchers have used machine learning to tighten up a previously released image of a black hole. As a result, the portrait of the black hole at the center of the galaxy Messier 87, over 53 million light years away from Earth, shows a thinner ring of light and matter surrounding its center in a report published today in The Astrophysical Journal Letters. The original images were captured in 2017 by the Event Horizon Telescope (EHT), a network of radio telescopes around Earth that combine to act as a planet-sized super-imaging tool. The initial picture looked like a "fuzzy donut," as described by NPR, but researchers used a new method called PRIMO to reconstruct a more accurate image. PRIMO is "a novel dictionary-learning-based algorithm" that learns to "recover high-fidelity images even in the presence of sparse coverage" by training on generated simulations of over 30,000 black holes.

black hole, researcher, simulation, (6 more...)

Engadget

Country: North America > United States > New Jersey > Mercer County > Princeton (0.06)

Genre: Research Report (0.54)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning (0.40)

Add feedback

Beating a Defender in Robotic Soccer: Memory-Based Learning of a Continuous Function

Neural Information Processing SystemsApr-6-2023, 18:17:55 GMT

Learning how to adjust to an opponent's position is critical to the success of having intelligent agents collaborating towards the achievement of specific tasks in unfriendly environments. This pa(cid:173) per describes our work on a Memory-based technique for to choose an action based on a continuous-valued state attribute indicating the position of an opponent. We investigate the question of how an agent performs in nondeterministic variations of the training situ(cid:173) ations. Our experiments indicate that when the random variations fall within some bound of the initial training, the agent performs better with some initial training rather than from a tabula-rasa.

continuous function, memory-based learning, robotic soccer, (4 more...)

Neural Information Processing Systems

Industry: Leisure & Entertainment > Sports > Soccer (0.85)

Technology:

Information Technology > Artificial Intelligence > Robots > Soccer Robots (0.40)
Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning (0.40)

Add feedback

Memorization-Dilation: Modeling Neural Collapse Under Label Noise

Nguyen, Duc Anh, Levie, Ron, Lienen, Julian, Kutyniok, Gitta, Hüllermeier, Eyke

arXiv.org Artificial IntelligenceApr-4-2023

The notion of neural collapse refers to several emergent phenomena that have been empirically observed across various canonical classification problems. During the terminal phase of training a deep neural network, the feature embedding of all examples of the same class tend to collapse to a single representation, and the features of different classes tend to separate as much as possible. Neural collapse is often studied through a simplified model, called the unconstrained feature representation, in which the model is assumed to have "infinite expressivity" and can map each data point to any arbitrary representation. In this work, we propose a more realistic variant of the unconstrained feature representation that takes the limited expressivity of the network into account. Empirical evidence suggests that the memorization of noisy data points leads to a degradation (dilation) of the neural collapse. Using a model of the memorization-dilation (M-D) phenomenon, we show one mechanism by which different losses lead to different performances of the trained network on noisy data. Our proofs reveal why label smoothing, a modification of cross-entropy empirically observed to produce a regularization effect, leads to improved generalization in classification tasks.

artificial intelligence, deep learning, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2206.0553

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
(13 more...)

Genre: Research Report > New Finding (0.92)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning > Rote Learning (0.84)

Add feedback

Memorization Capacity of Neural Networks with Conditional Computation

Koyuncu, Erdem

arXiv.org Artificial IntelligenceMar-20-2023

Many empirical studies have demonstrated the performance benefits of conditional computation in neural networks, including reduced inference time and power consumption. We study the fundamental limits of neural conditional computation from the perspective of memorization capacity. For Rectified Linear Unit (ReLU) networks without conditional computation, it is known that memorizing a collection of $n$ input-output relationships can be accomplished via a neural network with $O(\sqrt{n})$ neurons. Calculating the output of this neural network can be accomplished using $O(\sqrt{n})$ elementary arithmetic operations of additions, multiplications and comparisons for each input. Using a conditional ReLU network, we show that the same task can be accomplished using only $O(\log n)$ operations per input. This represents an almost exponential improvement as compared to networks without conditional computation. We also show that the $\Theta(\log n)$ rate is the best possible. Our achievability result utilizes a general methodology to synthesize a conditional network out of an unconditional network in a computationally-efficient manner, bridging the gap between unconditional and conditional architectures.

artificial intelligence, machine learning, neuron, (19 more...)

arXiv.org Artificial Intelligence

2303.11247

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.74)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning > Rote Learning (0.62)

Add feedback

Learning to Select Prototypical Parts for Interpretable Sequential Data Modeling

Zhang, Yifei, Gao, Neng, Ma, Cunqing

arXiv.org Artificial IntelligenceMar-16-2023

Prototype-based interpretability methods provide intuitive explanations of model prediction by comparing samples to a reference set of memorized exemplars or typical representatives in terms of similarity. In the field of sequential data modeling, similarity calculations of prototypes are usually based on encoded representation vectors. However, due to highly recursive functions, there is usually a non-negligible disparity between the prototype-based explanations and the original input. In this work, we propose a Self-Explaining Selective Model (SESM) that uses a linear combination of prototypical concepts to explain its own predictions. The model employs the idea of case-based reasoning by selecting sub-sequences of the input that mostly activate different concepts as prototypical parts, which users can compare to sub-sequences selected from different example inputs to understand model decisions. For better interpretability, we design multiple constraints including diversity, stability, and locality as training objectives. Extensive experiments in different domains demonstrate that our method exhibits promising interpretability and competitive accuracy.

artificial intelligence, information management, machine learning, (2 more...)

arXiv.org Artificial Intelligence

2212.03396

Genre: Research Report (0.40)

Technology:

Information Technology > Information Management (0.60)
Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (0.53)
Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning (0.53)

Add feedback