LMD3: Language Model Data Density Dependence

Kirchenbauer, John; Honke, Garrett; Somepalli, Gowthami; Geiping, Jonas; Ippolito, Daphne; Lee, Katherine; Goldstein, Tom; Andre, David

May 10, 2024, arXiv.org Artificial Intelligence

We develop a methodology for analyzing language model task performance at the individual example level based on training data density estimation. Experiments with paraphrasing as a controlled intervention on finetuning data demonstrate that increasing the support in the training distribution for specific test queries results in a measurable increase in density, which is also a significant predictor of the performance increase caused by the intervention. Experiments with pretraining data demonstrate that we can explain a significant fraction of the variance in model perplexity via density measurements. We conclude that our framework can provide statistical evidence of the dependence of a target model's predictions on subsets of its training data, and can more generally be used to characterize the support (or lack thereof) in the training data for a given test task.
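
The abstract does not say which density estimator or regression is used, so the following is only a minimal sketch of the general idea under stated assumptions: training examples and test queries are already embedded as fixed-length vectors, density is estimated with a Gaussian kernel, and the relationship between density and a per-example score (such as perplexity) is summarized by the R^2 of a linear fit. The names log_kde, r_squared, and bandwidth are hypothetical and chosen for illustration.

```python
import numpy as np


def log_kde(queries, train, bandwidth=1.0):
    """Log Gaussian kernel density of each query point under the training points.

    Assumption: `queries` (m, d) and `train` (n, d) are embedding vectors;
    the choice of a Gaussian kernel is illustrative, not taken from the paper.
    """
    queries = np.asarray(queries, dtype=float)
    train = np.asarray(train, dtype=float)
    n, d = train.shape
    # Pairwise squared Euclidean distances, shape (m, n).
    diff = queries[:, None, :] - train[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    log_kernel = -sq_dist / (2.0 * bandwidth ** 2)
    # Numerically stable log-sum-exp over the training points.
    row_max = log_kernel.max(axis=1, keepdims=True)
    lse = row_max[:, 0] + np.log(np.exp(log_kernel - row_max).sum(axis=1))
    # Normalizing constant of the averaged d-dimensional Gaussian kernel.
    log_norm = np.log(n) + 0.5 * d * np.log(2.0 * np.pi * bandwidth ** 2)
    return lse - log_norm


def r_squared(x, y):
    """Fraction of the variance in y explained by a least-squares line on x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)
    residual = y - (slope * x + intercept)
    return 1.0 - residual.var() / y.var()
```

In this sketch, a test query that lies close to many training embeddings receives a higher log-density than one far from all of them, and r_squared(log_density, per_example_score) gives a rough measure of how much of the variation in the score the density accounts for.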

large language model, machine learning, natural language

Country:
- Europe > Austria > Vienna (0.14)

Genre:
- Research Report
  - Experimental Study (0.69)
  - New Finding (0.93)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning > Neural Networks
    - Deep Learning (1.00)
  - Natural Language > Large Language Model (0.95)
