AITopics | reversal

Collaborating Authors

reversal

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Regime-Conditioned Evaluation in Multi-Context Bayesian Optimization

Thomas, Noel

arXiv.org Machine LearningMay-7-2026

Published transfer-BO comparisons often estimate an average treatment effect of acquisition choice over hidden regime variables, while practitioners need the conditional effect for their specific prior quality, budget ratio, and metric. An audit of 40 transfer-BO papers from NeurIPS, ICML, ICLR, AISTATS, UAI, TMLR, JMLR, and AutoML-Conf (2022-2025) finds that 98% never vary B/|A| as a controlled axis. On the same GDSC2 benchmark, changing only the budget reverses the ranking: at B=50, Greedy outperforms UCB by 0.050 Hit@1, while at B=100, UCB outperforms Greedy by 0.035. We capture this transition with the Portable Regime Score PRS=(B/|A|)(1-rho), where rho is the prior rank correlation and can be estimated from pilot contexts before the main comparison. Across 79 conditions spanning chemistry, drug-response biology, and HPO, a hierarchical model gives beta=0.50 (p=1.1e-9), and 19% of conditions fall in an equivalence zone where |advantage|<0.01 Hit@1. In five published reversal cases, PRS predicts the winner from pre-comparison observables. A No-Free-Leaderboard proposition explains why unconditional rankings are unstable: when CATE changes sign across regimes, the reported ATE becomes a function of benchmark mixture. RegimePlanner, which estimates rho online and switches acquisition accordingly, wins all 16 HPO-B search spaces at B=100 and exceeds the matched {Greedy,UCB} per-context oracle on GDSC2 by 18%. Pre-registered predictions achieve 27/40=67.5% overall accuracy and above 90% within EMA prior families. The practical protocol is simple: report B/|A|, rho, K, and metric alongside any claimed acquisition advantage.

benchmark, machine learning, natural language, (19 more...)

arXiv.org Machine Learning

2605.04895

Country: Asia > Middle East > UAE (0.27)

Genre: Research Report > New Finding (0.92)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.92)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.88)

Add feedback

the Hamiltonian bound

Neural Information Processing SystemsApr-24-2026, 11:51:17 GMT

Algorithm 6 Generating the (non-differentiable) Hamiltonian AIS variational bound. Figure 1 shows the results. The first row shows the results obtained by tuning the pair (,η) and each other parameter individually for different values of K, and the second row shows the results obtained by tuning increasingly more parameters. It can be observed that tuning β and q(z) lead to the largest gains in performance. Figure 4: Tuning more parameters leads to significantly better results.

artificial intelligence, machine learning, simulation, (17 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.70)

Add feedback

Post-Hoc Reversal: Are We Selecting Models Prematurely?

Neural Information Processing SystemsMar-21-2026, 22:53:52 GMT

Trained models are often composed with post-hoc transforms such as temperature scaling (TS), ensembling and stochastic weight averaging (SWA) to improve performance, robustness, uncertainty estimation, etc. However, such transforms are typically applied only after the base models have already been finalized by standard means. In this paper, we challenge this practice with an extensive empirical study. In particular, we demonstrate a phenomenon that we call post-hoc reversal, where performance trends are reversed after applying post-hoc transforms. This phenomenon is especially prominent in high-noise settings.

artificial intelligence, name change, proceedings, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.36)

Add feedback

From deep learning to mechanistic understanding in neuroscience: the structure of retinal prediction

Hidenori Tanaka, Aran Nayebi, Niru Maheswaranathan, Lane McIntosh, Stephen Baccus, Surya Ganguli

Neural Information Processing SystemsFeb-15-2026, 00:23:50 GMT

For eachstimulus,ourextractedcomputational mechanisms areconsistent withprior scientific literature, and in one case yields a new mechanistic hypothesis.

artificial intelligence, deep learning, machine learning, (20 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Santa Clara County > Stanford (0.05)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

Score-BasedDiffusionmeets AnnealedImportanceSampling

Neural Information Processing SystemsFeb-10-2026, 12:01:36 GMT

More than twenty years after its introduction, Annealed Importance Sampling (AIS) remains oneofthemost effectivemethods formarginal likelihood estimation.

artificial intelligence, machine learning, tk 1, (15 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.46)

Add feedback

Permutation-based Causal Inference Algorithms with Interventions

Yuhao Wang, Liam Solus, Karren Yang, Caroline Uhler

Neural Information Processing SystemsNov-21-2025, 06:01:52 GMT

In this paper, we present two algorithms of this type and prove that both are consistent under the faithfulness assumption.

artificial intelligence, bayesian inference, machine learning, (19 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > California > Los Angeles County > Long Beach (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
(2 more...)

Genre: Research Report (0.47)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)

Add feedback

05f971b5ec196b8c65b75d2ef8267331-Supplemental.pdf

Neural Information Processing SystemsNov-13-2025, 06:51:33 GMT

artificial intelligence, machine learning, simulation, (16 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.70)

Add feedback

On the generalization of language models from in-context learning and finetuning: a controlled study

Lampinen, Andrew K., Chaudhry, Arslan, Chan, Stephanie C. Y., Wild, Cody, Wan, Diane, Ku, Alex, Bornschein, Jörg, Pascanu, Razvan, Shanahan, Murray, McClelland, James L.

arXiv.org Artificial IntelligenceNov-12-2025

Large language models exhibit exciting capabilities, yet can show surprisingly narrow generalization from finetuning. E.g. they can fail to generalize to simple reversals of relations they are trained on, or fail to make simple logical deductions based on trained information. These failures to generalize factual information from fine-tuning can significantly hinder the reasoning capabilities of these models. On the other hand, language models' in-context learning (ICL) shows different inductive biases and deductive reasoning capabilities. Here, we explore these differences in generalization and deductive reasoning between in-context- and fine-tuning-based learning. To do so, we constructed several novel datasets to evaluate and improve models' abilities to make generalizations over factual information from novel data. These datasets are designed to create clean tests of generalization, by isolating the knowledge in the dataset from that in pretraining. We expose pretrained large models to controlled subsets of the information in these datasets -- either through ICL or fine-tuning -- and evaluate their performance on test sets that require various types of generalization. We find overall that in data-matched settings, ICL can generalize several types of inferences more flexibly than fine-tuning (though we also find some qualifications of prior findings, such as cases when fine-tuning can generalize to reversals embedded in a larger structure of knowledge). We build on these findings to propose a method to enable improved generalization from fine-tuning: adding in-context reasoning traces to finetuning data. We show that this method improves generalization across various splits of our datasets and other benchmarks. Our results have implications for understanding the generalization afforded by different modes of learning in language models, and practically improving their performance.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2505.00661

Country: North America > United States > California (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)

Add feedback

Sensitivity of Small Language Models to Fine-tuning Data Contamination

Scaria, Nicy, Kennedy, Silvester John Joseph, Subramani, Deepak

arXiv.org Artificial IntelligenceNov-11-2025

Small Language Models (SLMs) are increasingly being deployed in resource-constrained environments, yet their behavioral robustness to data contamination during instruction tuning remains poorly understood. We systematically investigate the contamination sensitivity of 23 SLMs (270M to 4B parameters) across multiple model families by measuring susceptibility to syntactic and semantic transformation types during instruction tuning: syntactic transformations (character and word reversal) and semantic transformations (irrelevant and counterfactual responses), each applied at contamination levels of 25\%, 50\%, 75\%, and 100\%. Our results reveal fundamental asymmetries in vulnerability patterns: syntactic transformations cause catastrophic performance degradation, with character reversal producing near-complete failure across all models regardless of size or family, while semantic transformations demonstrate distinct threshold behaviors and greater resilience in core linguistic capabilities. Critically, we discover a ``\textit{capability curse}" where larger, more capable models become more susceptible to learning semantic corruptions, effectively following harmful instructions more readily, while our analysis of base versus instruction-tuned variants reveals that alignment provides inconsistent robustness benefits, sometimes even reducing resilience. Our work establishes three core contributions: (1) empirical evidence of SLMs' disproportionate vulnerability to syntactic pattern contamination, (2) identification of asymmetric sensitivity patterns between syntactic and semantic transformations, and (3) systematic evaluation protocols for contamination robustness assessment. These findings have immediate deployment implications, suggesting that current robustness assumptions may not hold for smaller models and highlighting the need for contamination-aware training protocols.

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2511.06763

Country:

Europe (0.68)
Asia (0.67)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Add feedback

Filters

Collaborating Authors

reversal

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

Regime-Conditioned Evaluation in Multi-Context Bayesian Optimization

the Hamiltonian bound

Post-Hoc Reversal: Are We Selecting Models Prematurely?

a9b50b051cd5f17455458d922e52768b-Paper-Conference.pdf

From deep learning to mechanistic understanding in neuroscience: the structure of retinal prediction

Score-BasedDiffusionmeets AnnealedImportanceSampling

Permutation-based Causal Inference Algorithms with Interventions

05f971b5ec196b8c65b75d2ef8267331-Supplemental.pdf

On the generalization of language models from in-context learning and finetuning: a controlled study

Sensitivity of Small Language Models to Fine-tuning Data Contamination