AITopics | penalization

Large language models can learn and generalize steganographic chain-of-thought under process supervision

Skaf, Joey, Ibanez-Lissen, Luis, McCarthy, Robert, Watts, Connor, Georgiv, Vasil, Whittingham, Hannes, Gonzalez-Manzano, Lorena, Lindner, David, Tice, Cameron, Young, Edward James, Radmard, Puria

arXiv.org Artificial Intelligence | Dec-5-2025

Chain-of-thought (CoT) reasoning not only enhances large language model performance but also provides critical insights into decision-making processes, marking it as a useful tool for monitoring model intent and planning. However, recent works have shown that banning the mention of a specific example of reward hacking causes obfuscation of the undesired reasoning traces but the persistence of the undesired behavior, threatening the reliability of CoT monitoring. We provide an extension to these results with regard to the ability of models to learn a specific type of obfuscated reasoning: steganography. First, we show that penalizing the use of specific strings within load-bearing reasoning traces causes models to substitute alternative strings. Crucially, this does not alter the underlying method by which the model performs the task, demonstrating that the model can learn to steganographically encode its reasoning. We further demonstrate that models can generalize an encoding scheme. When the penalized strings belong to an overarching class, the model learns not only to substitute strings seen in training, but also develops a general encoding scheme for all members of the class which it can apply to held-out testing strings.
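
As a rough illustration of the string penalty described in this abstract, the sketch below subtracts a fixed amount from a task reward for every occurrence of a banned string in the reasoning trace. The function name `penalized_reward`, the per-occurrence penalty, and the example strings are assumptions made for illustration; the abstract does not specify the authors' actual reward.

```python
import re


def penalized_reward(task_reward, reasoning, banned, penalty=1.0):
    """Hypothetical reward: task reward minus a penalty per banned-string hit.

    `banned` is an iterable of literal strings; matching is case-sensitive
    and counts non-overlapping occurrences.
    """
    hits = sum(len(re.findall(re.escape(s), reasoning)) for s in banned)
    return task_reward - penalty * hits


# Illustrative traces (assumed, not taken from the paper).
banned_strings = ["Heads", "Tails"]
plain_trace = "Heads then Tails then Heads"
encoded_trace = "H then T then H"

plain_score = penalized_reward(1.0, plain_trace, banned_strings, penalty=0.5)
encoded_score = penalized_reward(1.0, encoded_trace, banned_strings, penalty=0.5)
```

Under a penalty of this shape, a trace that swaps each banned string for a stand-in symbol keeps the full task reward, which is the substitution incentive the abstract reports.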

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2506.01926

Country:

Europe > Spain > Galicia > Madrid (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Africa > Cameroon > Gulf of Guinea (0.04)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Optimization over Continuous and Multi-dimensional Decisions with Observational Data

Dimitris Bertsimas, Christopher McCord

Neural Information Processing Systems | Nov-20-2025, 21:11:52 GMT

We consider the optimization of an uncertain objective over continuous and multidimensional decision spaces in problems in which we are only provided with observational data.

artificial intelligence, data mining, machine learning, (18 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > Canada > Quebec > Montreal (0.04)

Genre: Research Report (0.69)

Industry:

Health & Medicine > Therapeutic Area (0.30)
Health & Medicine > Pharmaceuticals & Biotechnology (0.30)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.48)

We thank the reviewers for their thoughtful comments and suggestions and we respond below to some concrete questions/comments that were raised

Neural Information Processing Systems | Nov-15-2025, 19:53:23 GMT

The setup there is also a special case of our setup, where the reward is linear in the treatment vector, i.e. Hence, we omitted such an analysis. We will add some more elaborate discussion on these rates expanding on Remark 4. See response also to Reviewer #4 regarding efficiency. These extra moments can be used to construct more efficient estimators. However, these extra moment conditions have no bite in the case of homoskedastic noise.

artificial intelligence, machine learning, penalization, (14 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.33)

Environment-aware Motion Matching

Ponton, Jose Luis, Andrews, Sheldon, Andujar, Carlos, Pelechano, Nuria

arXiv.org Artificial Intelligence | Oct-28-2025

Interactive applications demand believable characters that respond naturally to dynamic environments. Traditional character animation techniques often struggle to handle arbitrary situations, leading to a growing trend of dynamically selecting motion-captured animations based on predefined features. While Motion Matching has proven effective for locomotion by aligning to target trajectories, animating environment interactions and crowd behaviors remains challenging due to the need to consider surrounding elements. Existing approaches often involve manual setup or lack the naturalism of motion capture. Furthermore, in crowd animation, body animation is frequently treated as a separate process from trajectory planning, leading to inconsistencies between body pose and root motion. To address these limitations, we present Environment-aware Motion Matching, a novel real-time system for full-body character animation that dynamically adapts to obstacles and other agents, emphasizing the bidirectional relationship between pose and trajectory. In a preprocessing step, we extract shape, pose, and trajectory features from a motion capture database. At runtime, we perform an efficient search that matches user input and current pose while penalizing collisions with a dynamic environment. Our method allows characters to naturally adjust their pose and trajectory to navigate crowded scenes.
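
The runtime search described in this abstract can be pictured as a nearest-neighbour lookup whose cost adds a collision term. The sketch below is a minimal toy under stated assumptions: the function name `match`, the squared-distance feature cost, the circular obstacles with a shared `radius`, and the `weight` on the collision term are all hypothetical and are not the authors' system.

```python
import math


def match(query, features, trajectories, obstacles, radius=0.5, weight=10.0):
    """Return the index of the database entry with the lowest total cost.

    features:     list of feature vectors, one per candidate pose
    trajectories: list of future 2D point lists, one per candidate pose
    obstacles:    list of 2D obstacle centres (assumed circular)
    """
    best_index, best_cost = -1, math.inf
    for i, (feat, traj) in enumerate(zip(features, trajectories)):
        # How well the candidate matches the user input and current pose.
        feature_cost = sum((f - q) ** 2 for f, q in zip(feat, query))
        # Penalise every trajectory point that penetrates an obstacle.
        collision_cost = sum(
            max(0.0, radius - math.dist(p, o)) for p in traj for o in obstacles
        )
        cost = feature_cost + weight * collision_cost
        if cost < best_cost:
            best_index, best_cost = i, cost
    return best_index


# Two candidates: index 0 matches the query exactly but walks into (1, 0).
features = [[0.0, 0.0], [1.0, 1.0]]
trajectories = [[(0.5, 0.0), (1.0, 0.0)], [(0.0, 0.5), (0.0, 1.0)]]
free_choice = match([0.0, 0.0], features, trajectories, obstacles=[])
blocked_choice = match([0.0, 0.0], features, trajectories, obstacles=[(1.0, 0.0)])
```

With no obstacles the closest feature match wins; once an obstacle sits on its path, the collision term makes the search prefer the other candidate.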

machine learning, obstacle, reinforcement learning, (18 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3763334

2510.22632

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > New York > New York County > New York City (0.05)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
(13 more...)

Genre: Research Report (0.63)

Industry: Leisure & Entertainment > Games > Computer Games (0.67)

Technology:

Information Technology > Graphics > Animation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)

Complexity Dependent Error Rates for Physics-informed Statistical Learning via the Small-ball Method

Marcondes, Diego

arXiv.org Machine Learning | Oct-28-2025

Physics-informed statistical learning (PISL) integrates empirical data with physical knowledge to enhance the statistical performance of estimators. While PISL methods are widely used in practice, a comprehensive theoretical understanding of how informed regularization affects statistical properties is still missing. Specifically, two fundamental questions have yet to be fully addressed: (1) what is the trade-off between considering soft penalties versus hard constraints, and (2) what is the statistical gain of incorporating physical knowledge compared to purely data-driven empirical error minimisation. In this paper, we address these questions for PISL in convex classes of functions under physical knowledge expressed as linear equations by developing appropriate complexity dependent error rates based on the small-ball method. We show that, under suitable assumptions, (1) the error rates of physics-informed estimators are comparable to those of hard constrained empirical error minimisers, differing only by constant terms, and that (2) informed penalization can effectively reduce model complexity, akin to dimensionality reduction, thereby improving learning performance. This work establishes a theoretical framework for evaluating the statistical properties of physics-informed estimators in convex classes of functions, contributing to closing the gap between statistical theory and practical PISL, with potential applications to cases not yet explored in the literature.

artificial intelligence, deep learning, machine learning, (16 more...)

arXiv.org Machine Learning

2510.23149

Country:

South America > Brazil > São Paulo (0.04)
Oceania > Australia (0.04)
North America > United States > Texas (0.04)
(4 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Online Policy Learning via a Self-Normalized Maximal Inequality

Girard, Samuel, Bibaut, Aurélien, Zenati, Houssam

arXiv.org Machine Learning | Oct-20-2025

Adaptive experiments produce dependent data that break i.i.d. assumptions that underlie classical concentration bounds and invalidate standard learning guarantees. In this paper, we develop a self-normalized maximal inequality for martingale empirical processes. Building on this, we first propose an adaptive sample-variance penalization procedure which balances empirical loss and sample variance, valid for general dependent data. Next, this allows us to derive a new variance-regularized pessimistic off-policy learning objective, for which we establish excess-risk guarantees. Subsequently, we show that, when combined with sequential updates and under standard complexity and margin conditions, the resulting estimator achieves fast convergence rates in both parametric and nonparametric regimes, improving over the usual $1/\sqrt{n}$ baseline. We complement our theoretical findings with numerical simulations that illustrate the practical gains of our approach.
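
The idea of balancing empirical loss against sample variance can be sketched with the classical i.i.d. form of sample-variance penalization: mean loss plus a multiple of sqrt(sample variance / n). This is an assumption made for illustration; the adaptive, dependent-data procedure developed in the paper is not specified in the abstract, and the names `svp_objective`, `select`, and `lam` are hypothetical.

```python
import math
import statistics


def svp_objective(losses, lam=1.0):
    """Empirical mean loss plus lam * sqrt(sample variance / n)."""
    n = len(losses)
    return statistics.fmean(losses) + lam * math.sqrt(statistics.variance(losses) / n)


def select(candidates, lam=1.0):
    """Pick the candidate whose observed losses minimise the penalised objective."""
    return min(candidates, key=lambda name: svp_objective(candidates[name], lam))


# Per-sample losses for two hypothetical policies.
candidates = {
    "steady": [0.5, 0.5, 0.5, 0.5],
    "volatile": [0.0, 0.0, 0.0, 1.9],
}
unpenalized_choice = select(candidates, lam=0.0)
penalized_choice = select(candidates, lam=1.0)
```

Without the penalty the lower-mean but high-variance candidate is chosen; with it, the selection becomes pessimistic and favours the candidate whose loss estimate is more reliable.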

artificial intelligence, data mining, machine learning, (18 more...)

arXiv.org Machine Learning

2510.15483

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Italy > Sardinia (0.04)

Genre:

Research Report > New Finding (0.46)
Instructional Material > Online (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Data Science > Data Mining > Big Data (0.46)

08b7dc6e8b36bcaac15847827b7951a9-AuthorFeedback.pdf

Neural Information Processing Systems | Oct-2-2025, 00:43:12 GMT

artificial intelligence, machine learning, penalization, (15 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.33)

THINNs: Thermodynamically Informed Neural Networks

Castro, Javier, Gess, Benjamin

arXiv.org Artificial Intelligence | Sep-25-2025

Physics-Informed Neural Networks (PINNs) are a class of deep learning models aiming to approximate solutions of PDEs by training neural networks to minimize the residual of the equation. Focusing on non-equilibrium fluctuating systems, we propose a physically informed choice of penalization that is consistent with the underlying fluctuation structure, as characterized by a large deviations principle. This approach yields a novel formulation of PINNs in which the penalty term is chosen to penalize improbable deviations, rather than being selected heuristically. The resulting thermodynamically consistent extension of PINNs, termed THINNs, is subsequently analyzed by establishing analytical a posteriori estimates, and providing empirical comparisons to established penalization strategies.

artificial intelligence, deep learning, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2509.19467

Country:

Europe > Germany > Berlin (0.04)
North America > United States > New Jersey > Hudson County > Hoboken (0.04)
Europe > Italy > Sardinia (0.04)
(2 more...)

Genre: Research Report (0.81)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Robust Data Fusion via Subsampling

Wang, Jing, Wang, HaiYing, Chen, Kun

arXiv.org Machine Learning | Aug-19-2025

Data fusion and transfer learning are rapidly growing fields that enhance model performance for a target population by leveraging other related data sources or tasks. The challenges lie in the various potential heterogeneities between the target and external data, as well as various practical concerns that prevent a naïve data integration. We consider a realistic scenario where the target data is limited in size while the external data is large but contaminated with outliers; such data contamination, along with other computational and operational constraints, necessitates proper selection or subsampling of the external data for transfer learning. To our knowledge, transfer learning and subsampling under data contamination have not been thoroughly investigated. We address this gap by studying various transfer learning methods with subsamples of the external data, accounting for outliers deviating from the underlying true model due to arbitrary mean shifts. Two subsampling strategies are investigated: one aimed at reducing biases and the other at minimizing variances. Approaches to combine these strategies are also introduced to enhance the performance of the estimators. We provide non-asymptotic error bounds for the transfer learning estimators, clarifying the roles of sample sizes, signal strength, sampling rates, magnitude of outliers, and tail behaviors of model error distributions, among other factors. Extensive simulations show the superior performance of the proposed methods. Additionally, we apply our methods to analyze the risk of hard landings in A380 airplanes by utilizing data from other airplane types, demonstrating that robust transfer learning can improve estimation efficiency for relatively rare airplane types with the help of data from other types of airplanes.

artificial intelligence, information fusion, machine learning, (19 more...)

arXiv.org Machine Learning

2508.12048

Country: North America > United States > Connecticut (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Transportation > Air (1.00)
Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
