DoGE: Domain Reweighting with Generalization Estimation
Simin Fan, Matteo Pagliardini, Martin Jaggi
The coverage and composition of the pretraining data significantly impact the generalization ability of Large Language Models (LLMs). Despite this importance, recent LLMs still rely on heuristics and trial and error to increase or reduce the influence of data domains. We propose DOmain reweighting with Generalization Estimation (DoGE), which optimizes the probability of sampling from each domain (the domain weights) in a principled way. Our approach is a two-stage process consisting of (i) training a proxy model to obtain domain weights using a bi-level optimization algorithm; (ii) training a larger base model by sampling training domains according to the learned domain weights. In our experiments, we extensively show how DoGE improves the generalization of the base model to any target data mixture. On the SlimPajama dataset, our base model achieves better perplexity and few-shot reasoning accuracy across $6$ tasks than baseline methods. Moreover, when the aim is to generalize to an out-of-domain target task that is unseen in the pretraining corpus (OOD domain), DoGE effectively identifies inter-domain dependencies and consistently achieves better test perplexity on the target domain.
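To make stage (i) concrete, here is a minimal PyTorch sketch of what one domain-reweighting step can look like. It is an illustration under assumptions, not the authors' implementation: the gradient-alignment criterion as the generalization estimate, the step size `mu`, and all names (`proxy_model`, `loss_fn`, `domain_batches`, `target_batch`) are hypothetical.

```python
# Hypothetical sketch of one DoGE-style reweighting step on a proxy model.
# Assumption: a domain is upweighted when its loss gradient aligns with the
# gradient on the target data (a simple generalization estimate).
import torch

def update_domain_weights(proxy_model, domain_batches, target_batch,
                          log_weights, loss_fn, mu=0.1):
    """One reweighting step; log_weights holds one log-probability per domain."""
    # Gradient of the proxy model's loss on the target data.
    target_loss = loss_fn(proxy_model, target_batch)
    target_grad = torch.autograd.grad(target_loss, list(proxy_model.parameters()))
    target_grad = torch.cat([g.flatten() for g in target_grad])

    scores = torch.zeros_like(log_weights)
    for k, batch in enumerate(domain_batches):
        domain_loss = loss_fn(proxy_model, batch)
        grad_k = torch.autograd.grad(domain_loss, list(proxy_model.parameters()))
        grad_k = torch.cat([g.flatten() for g in grad_k])
        # Domains whose gradients align with the target gradient are
        # estimated to help generalization and get upweighted.
        scores[k] = torch.dot(grad_k, target_grad)

    # Multiplicative-weights (mirror-descent) update on the simplex.
    log_weights = log_weights + mu * scores
    return log_weights - torch.logsumexp(log_weights, dim=0)
```

Stage (ii) would then sample each training batch's domain from `log_weights.exp()` while training the larger base model.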
First is Better Than Last for Language Data Influence
Chih-Kuan Yeh, Ankur Taly, Mukund Sundararajan, Frederick Liu, Pradeep Ravikumar
The ability to identify influential training examples enables us to debug training data and explain model behavior. Existing techniques for doing so are based on the flow of training-data influence through the model parameters. For large models in NLP applications, it is often computationally infeasible to study this flow through all model parameters, so techniques usually restrict themselves to the last layer of weights. However, we observe that since the activations connected to the last layer of weights contain "shared logic", the data influence calculated via the last-layer weights is prone to a "cancellation effect", where the data influences of different examples have large magnitudes that contradict each other. The cancellation effect lowers the discriminative power of the influence score, and deleting influential examples according to this measure often does not change the model's behavior much. To mitigate this, we propose TracIn-WE, a technique that modifies the TracIn method to operate on the word-embedding layer instead of the last layer, where the cancellation effect is less severe. One potential concern is that influence based on the word-embedding layer may not encode sufficient high-level information. However, we find that gradients (unlike the embeddings themselves) do not suffer from this, possibly because they chain through higher layers. We show that TracIn-WE significantly outperforms other data-influence methods applied on the last layer in the case-deletion evaluation on three language classification tasks across different models. In addition, TracIn-WE can produce influence scores not just at the level of the overall training input, but also at the level of individual words within the training input, a further aid in debugging.
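Since TracIn-WE is a TracIn variant, its score for a (train, test) pair is, per checkpoint, a dot product of loss gradients, here taken only with respect to the word-embedding matrix. The sketch below illustrates that restriction for a single checkpoint; `model.embedding` and the helper names are assumptions, and the full method additionally sums such scores over several checkpoints, scaled by the learning rate.

```python
# Hypothetical single-checkpoint sketch of TracIn restricted to the
# word-embedding layer (the TracIn-WE idea).
import torch

def embedding_grad(model, example, loss_fn):
    """Loss gradient w.r.t. the word-embedding matrix only (vocab x dim)."""
    loss = loss_fn(model, example)
    (grad,) = torch.autograd.grad(loss, [model.embedding.weight])
    return grad

def tracin_we_score(model, train_example, test_example, loss_fn):
    g_train = embedding_grad(model, train_example, loss_fn)
    g_test = embedding_grad(model, test_example, loss_fn)
    # Row-wise dot products: one contribution per vocabulary word.
    # A row's gradient is nonzero only when that word occurs in the
    # example, so contributions come from words the two examples share.
    per_word = (g_train * g_test).sum(dim=1)
    # Total influence plus the word-level breakdown noted in the abstract.
    return per_word.sum(), per_word
```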
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.89)
- Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.67)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)