ResMem: Learn what you can and memorize the rest

Zitong Yang, Michal Lukasik, Vaishnavh Nagarajan, Zonglin Li, Ankit Singh Rawat, Manzil Zaheer, Aditya Krishna Menon, Sanjiv Kumar

Oct-20-2023, arXiv.org Machine Learning

The impressive generalization performance of modern neural networks is attributed in part to their ability to implicitly memorize complex training patterns. Inspired by this, we explore a novel mechanism to improve model generalization via explicit memorization. Specifically, we propose the residual-memorization (ResMem) algorithm, a new method that augments an existing prediction model (e.g., a neural network) by fitting the model's residuals with a k-nearest neighbor based regressor. The final prediction is then the sum of the original model and the fitted residual regressor. By construction, ResMem can explicitly memorize the training labels, even when the base model has low capacity. We start by formulating a stylized linear regression problem and rigorously show that ResMem results in a more favorable test risk over a base linear neural network. Then, we empirically show that ResMem consistently improves the test set generalization of the original prediction model across standard vision and natural language processing benchmarks.
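The procedure described in the abstract (fit a base model, fit its residuals with a k-nearest-neighbor regressor, then sum the two predictions) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the base model is an ordinary least-squares linear fit, neighbors are found by Euclidean distance in raw input space, the neighbor residuals are averaged without weighting, and the function names and the toy sine dataset are hypothetical.

```python
import numpy as np


def fit_linear(X, y):
    # Least-squares linear base model with a bias term.
    # Assumption: stands in for "an existing prediction model".
    A = np.hstack([X, np.ones((X.shape[0], 1))])
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    return w


def predict_linear(w, X):
    A = np.hstack([X, np.ones((X.shape[0], 1))])
    return A @ w


def knn_residual(X_train, residuals, X_query, k):
    # Unweighted mean of the residuals of the k nearest training points.
    # Assumption: Euclidean distance on raw inputs, uniform weights.
    d2 = ((X_query[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=-1)
    idx = np.argsort(d2, axis=1)[:, :k]
    return residuals[idx].mean(axis=1)


def resmem_predict(X_train, y_train, X_query, k=1):
    # Step 1: fit the base model.
    w = fit_linear(X_train, y_train)
    # Step 2: compute the base model's residuals on the training set.
    residuals = y_train - predict_linear(w, X_train)
    # Step 3: final prediction = base prediction + k-NN estimate of the residual.
    return predict_linear(w, X_query) + knn_residual(X_train, residuals, X_query, k)


# Hypothetical toy data: a nonlinear target that a linear model cannot fit.
rng = np.random.default_rng(0)
X_train = rng.uniform(-3, 3, size=(200, 1))
y_train = np.sin(2 * X_train[:, 0])

train_pred = resmem_predict(X_train, y_train, X_train, k=1)
```

With `k=1` and distinct training inputs, each training point is its own nearest neighbor, so the residual term restores exactly what the base model missed and `train_pred` matches `y_train` up to floating-point error. This is one way to see the "by construction" memorization claim, even though the linear base model here has low capacity.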

artificial intelligence, machine learning, resmem

Country:
- North America
  - United States
    - New York > New York County
      - New York City (0.05)
    - Louisiana > Orleans Parish
      - New Orleans (0.04)
    - California
      - Santa Clara County > Stanford (0.04)
      - Los Angeles County > Long Beach (0.04)
  - Puerto Rico > San Juan
    - San Juan (0.04)
  - Canada > British Columbia
    - Metro Vancouver Regional District > Vancouver (0.04)
- Europe
  - Spain (0.04)
  - France (0.04)
  - United Kingdom > England
    - Cambridgeshire > Cambridge (0.04)
- Asia > Middle East
  - Jordan (0.04)

Genre:
- Research Report
  - New Finding (0.67)
  - Promising Solution (0.48)

Industry:
- Education (0.67)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning
  - Neural Networks (1.00)
  - Performance Analysis > Accuracy (0.46)
  - Statistical Learning
    - Nearest Neighbor Methods (0.69)
    - Regression (0.49)
