regularization scheme
Beyond Tikhonov: Faster Learning with Self-Concordant Losses via Iterative Regularization
The theory of spectral filtering is a remarkable tool to understand the statistical properties of learning with kernels. For least squares, it allows one to derive various regularization schemes that yield faster convergence rates for the excess risk than Tikhonov regularization. This is typically achieved by leveraging classical assumptions called source and capacity conditions, which characterize the difficulty of the learning task. In order to understand estimators derived from other loss functions, Marteau-Ferey et al. have extended the theory of Tikhonov regularization to generalized self-concordant (GSC) loss functions, which include, e.g., the logistic loss. In this paper, we go a step further and show that fast and optimal rates can be achieved for GSC losses by using the iterated Tikhonov regularization scheme, which is intrinsically related to the proximal point method in optimization and overcomes the limitations of classical Tikhonov regularization.
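As a minimal sketch of that connection (notation here is illustrative, not the paper's): plain Tikhonov computes $\hat w_\lambda = \arg\min_w \hat L(w) + \lambda \|w\|^2$ for an empirical risk $\hat L$, whereas iterated Tikhonov starts from $w_0 = 0$ and repeats $w_{t+1} = \arg\min_w \hat L(w) + \lambda \|w - w_t\|^2$, which is exactly the proximal point iteration $w_{t+1} = \mathrm{prox}_{\hat L/\lambda}(w_t)$. For least squares, $t$ such steps act as a higher-order spectral filter, which is how the scheme escapes the saturation that caps the rates of a single Tikhonov step.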
Decreasing Entropic Regularization Averaged Gradient for Semi-Discrete Optimal Transport
Genans, Ferdinand, Godichon-Baggioni, Antoine, Vialard, François-Xavier, Wintenberger, Olivier
Adding entropic regularization to Optimal Transport (OT) problems has become a standard approach for designing efficient and scalable solvers. However, regularization biases the solver away from the true solution. To mitigate this bias while still benefiting from the acceleration provided by regularization, a natural solver would adaptively decrease the regularization as it approaches the solution. Although some algorithms heuristically implement this idea, their theoretical guarantees and the extent of their acceleration compared to using a fixed regularization remain largely open. In the setting of semi-discrete OT, where the source measure is continuous and the target is discrete, we prove that decreasing the regularization can indeed accelerate convergence. To this end, we introduce DRAG: Decreasing (entropic) Regularization Averaged Gradient, a stochastic gradient descent algorithm where the regularization decreases with the number of optimization steps. We provide a theoretical analysis showing that DRAG benefits from decreasing regularization compared to a fixed scheme, achieving an unbiased $\mathcal{O}(1/t)$ sample and iteration complexity for both the OT cost and the potential estimation, and a $\mathcal{O}(1/\sqrt{t})$ rate for the OT map. Our theoretical findings are supported by numerical experiments that validate the effectiveness of DRAG and highlight its practical advantages.
- North America > United States (0.04)
- Europe > France > Grand Est > Meurthe-et-Moselle > Nancy (0.04)
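Since the abstract does not spell out the update rule, here is a minimal sketch of a DRAG-style solver as an averaged stochastic gradient method on the standard smoothed semi-dual of semi-discrete OT; the schedules eps_t and lr_t below are illustrative assumptions, not the schedules analyzed in the paper.

import numpy as np

def drag(sample_source, y, nu, n_steps, eps0=1.0, lr0=1.0):
    # sample_source: callable returning one sample x ~ mu (continuous source)
    # y: (n, d) support of the discrete target; nu: (n,) positive weights summing to 1
    n = len(nu)
    v = np.zeros(n)        # dual potential on the discrete support
    v_avg = np.zeros(n)    # Polyak-Ruppert average, returned as the estimate
    for t in range(1, n_steps + 1):
        eps_t = eps0 / t             # decreasing entropic regularization
        lr_t = lr0 / np.sqrt(t)      # step size
        x = sample_source()
        cost = 0.5 * np.sum((y - x) ** 2, axis=1)   # c(x, y_j), squared-Euclidean
        logits = (v - cost) / eps_t + np.log(nu)    # Gibbs weights chi(x)
        logits -= logits.max()                      # numerical stability
        chi = np.exp(logits)
        chi /= chi.sum()
        v += lr_t * (nu - chi)       # stochastic semi-dual gradient: nu - chi(x)
        v_avg += (v - v_avg) / t
    return v_avg

For instance, sample_source = lambda: rng.standard_normal(2) with a random (y, nu) target recovers the classical semi-discrete setting, with the entropic bias vanishing as eps_t goes to 0.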
Enhancing the Cross-Size Generalization for Solving Vehicle Routing Problems via Continual Learning
Li, Jingwen, Cao, Zhiguang, Wu, Yaoxin, Liu, Tang
Exploring machine learning techniques for addressing vehicle routing problems has attracted considerable research attention. To achieve decent and efficient solutions, existing deep models for vehicle routing problems are typically trained and evaluated using instances of a single size. This substantially limits their ability to generalize across different problem sizes and thus hampers their practical applicability. To address the issue, we propose a continual learning based framework that sequentially trains a deep model with instances of ascending problem sizes. Specifically, on the one hand, we design an inter-task regularization scheme to retain the knowledge acquired from smaller problem sizes when training the model on a larger size. On the other hand, we introduce an intra-task regularization scheme to consolidate the model by imitating the latest desirable behaviors during training on each size. Additionally, we exploit experience replay, revisiting instances of formerly trained sizes to mitigate catastrophic forgetting. Experimental results show that our approach achieves predominantly superior performance across various problem sizes (either seen or unseen in training), compared to state-of-the-art deep models, including ones specialized for generalizability enhancement. Meanwhile, ablation studies on the key designs confirm their synergistic effect in the proposed framework.
- Europe > Netherlands > North Brabant > Eindhoven (0.04)
- Asia > Singapore (0.04)
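A structural sketch of the continual-learning loop described above, assuming a policy-gradient objective rl_loss(model, batch) and an instance generator make_batch(size) as placeholders; the intra-task regularizer, which imitates the latest desirable behaviors within a size, is omitted for brevity.

import copy
import random
import torch
import torch.nn.functional as F

def train_across_sizes(model, optimizer, sizes, make_batch, rl_loss,
                       steps_per_size=1000, beta=1.0, replay_prob=0.2):
    teacher = None     # frozen snapshot of the model trained on the previous size
    replay = []        # experience replay over formerly trained sizes
    for size in sorted(sizes):
        for _ in range(steps_per_size):
            batch = make_batch(size)
            loss = rl_loss(model, batch)
            if teacher is not None:
                # Inter-task regularization: keep the policy close to the one
                # learned on smaller sizes (here a KL distillation term).
                with torch.no_grad():
                    old_logits = teacher(batch)
                loss = loss + beta * F.kl_div(model(batch).log_softmax(-1),
                                              old_logits.softmax(-1),
                                              reduction="batchmean")
            if replay and random.random() < replay_prob:
                # Revisit earlier sizes to mitigate catastrophic forgetting.
                loss = loss + rl_loss(model, random.choice(replay))
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
        replay.append(make_batch(size))
        teacher = copy.deepcopy(model).eval()   # snapshot for the next size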
Appendix A: Source codes
Specifically, we average the scores over 100 episodes evaluated on confounded environments for each random seed. We use the Adam optimizer with a learning rate of 3e-4. Note that the other regularization baselines are based on BC. In particular, OREO achieves a mean HNS of 114.9%.
[Figure 9: Comparison of OREO to CCIL, with environment interaction, on 6 confounded Atari environments.]
We also investigate the possibility of applying OREO to other IL methods.
The Minority Matters: A Diversity-Promoting Collaborative Metric Learning Algorithm
Collaborative Metric Learning (CML) has recently emerged as a popular method in recommendation systems (RS), closing the gap between metric learning and Collaborative Filtering. Following the convention of RS, existing methods exploit unique user representation in their model design. This paper focuses on a challenging scenario where a user has multiple categories of interests. Under this setting, we argue that the unique user representation might induce preference bias, especially when the item category distribution is imbalanced. To address this issue, we propose a novel method called Diversity-Promoting Collaborative Metric Learning (DPCML), with the hope of considering the commonly ignored minority interest of the user.
- Asia > Middle East > Jordan (0.04)
- Asia > China > Guangdong Province > Shenzhen (0.04)
- Leisure & Entertainment (0.92)
- Information Technology (0.67)
- Media > Film (0.45)
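The abstract leaves the mechanism implicit; the core design is to give each user several embedding vectors and score an item by its distance to the nearest one, so a minority interest can claim its own vector. A minimal sketch, where the number of interest vectors, the margin, and the hinge form are assumptions:

import torch
import torch.nn as nn

class DPCMLSketch(nn.Module):
    # Each user owns C embedding vectors instead of a single one.
    def __init__(self, n_users, n_items, dim=64, n_interests=4, margin=1.0):
        super().__init__()
        self.user = nn.Embedding(n_users * n_interests, dim)
        self.item = nn.Embedding(n_items, dim)
        self.C, self.margin = n_interests, margin

    def dist(self, u, i):
        # Distance from item i to the closest of user u's C vectors.
        idx = u.unsqueeze(1) * self.C + torch.arange(self.C, device=u.device)
        d = torch.cdist(self.user(idx), self.item(i).unsqueeze(1))  # (B, C, 1)
        return d.squeeze(-1).min(dim=1).values

    def forward(self, u, pos, neg):
        # Standard CML hinge loss, computed with the multi-vector distance,
        # so gradients flow to whichever interest vector is closest.
        return torch.relu(self.dist(u, pos) - self.dist(u, neg) + self.margin).mean()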
Stabilizing PINNs: A regularization scheme for PINN training to avoid unstable fixed points of dynamical systems
Babic, Milos, Rohrhofer, Franz M., Geiger, Bernhard C.
Abstract: It was recently shown that the loss function used for training physics-informed neural networks (PINNs) exhibits local minima at solutions corresponding to fixed points of dynamical systems. In the forward setting, where the PINN is trained to solve initial value problems, these local minima can interfere with training and potentially lead to physically incorrect solutions. Building on stability theory, this paper proposes a regularization scheme that penalizes solutions corresponding to unstable fixed points. Experimental results on four dynamical systems, including the Lotka-Volterra model and the van der Pol oscillator, show that our scheme helps avoid physically incorrect solutions and substantially improves the training success rate of PINNs.
Index Terms: PINNs, regularization, stability
1. Introduction. Physics-informed neural networks (PINNs, [1]) are among the most prominent instantiations of physics-informed machine learning.
- Europe > Austria > Styria > Graz (0.05)
- North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
- Europe > France > Auvergne-Rhône-Alpes > Isère > Grenoble (0.05)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > France > Île-de-France > Paris > Paris (0.04)
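The abstract says the scheme penalizes solutions corresponding to unstable fixed points but not its functional form. A plausible sketch for an autonomous system u' = f(u): precompute the fixed points whose Jacobian has an eigenvalue with positive real part, then add a repulsion term at the collocation points. The Gaussian bump and its width are assumptions, not the paper's choice.

import torch

def stability_penalty(u_pred, unstable_fps, sigma=0.1):
    # u_pred:       (T, d) PINN output at the collocation times
    # unstable_fps: (K, d) fixed points of f with Re(lambda_max) > 0
    d2 = torch.cdist(u_pred, unstable_fps) ** 2   # (T, K) squared distances
    return torch.exp(-d2 / (2 * sigma ** 2)).mean()

# total = physics_residual + data_loss + weight * stability_penalty(u_pred, fps)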
Who Does What in Deep Learning? Multidimensional Game-Theoretic Attribution of Function of Neural Units
Dixit, Shrey, Fakhar, Kayson, Hadaeghi, Fatemeh, Mineault, Patrick, Kording, Konrad P., Hilgetag, Claus C.
Neural networks now generate text, images, and speech with billions of parameters, creating a need to know how each neural unit contributes to these high-dimensional outputs. Existing explainable-AI methods, such as SHAP, attribute importance to inputs but cannot quantify the contributions of neural units across thousands of output pixels, tokens, or logits. Here we close that gap with Multiperturbation Shapley-value Analysis (MSA), a model-agnostic game-theoretic framework. By systematically lesioning combinations of units, MSA yields Shapley Modes, unit-wise contribution maps that share the exact dimensionality of the model's output. We apply MSA across scales, from multi-layer perceptrons to the 56-billion-parameter Mixtral-8x7B and Generative Adversarial Networks (GANs). The approach demonstrates how regularisation concentrates computation in a few hubs, exposes language-specific experts inside the LLM, and reveals an inverted pixel-generation hierarchy in GANs. Together, these results showcase MSA as a powerful approach for interpreting, editing, and compressing deep neural networks.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
- North America > United States > California > Alameda County > Oakland (0.04)
- (7 more...)
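As a concrete reading of the method, a Monte-Carlo permutation sketch of MSA: lesion units in random orders, record each unit's marginal effect on the full output, and average. The estimator below is the textbook permutation-sampling Shapley approximation; the paper's exact sampling scheme may differ.

import numpy as np

def shapley_modes(forward, n_units, n_perms=200, seed=0):
    # forward(mask): model output (any shape) with units where mask == 0 lesioned.
    rng = np.random.default_rng(seed)
    base = forward(np.zeros(n_units))              # everything lesioned
    modes = np.zeros((n_units,) + np.shape(base))  # one Shapley Mode per unit
    for _ in range(n_perms):
        order = rng.permutation(n_units)
        mask = np.zeros(n_units)
        prev = base
        for u in order:
            mask[u] = 1.0               # restore unit u to the coalition
            out = forward(mask)
            modes[u] += out - prev      # marginal contribution of unit u
            prev = out
    return modes / n_perms              # each mode shares the output's shape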