AITopics | Optimization

Collaborating Authors

Optimization

News Overviews Instructional Materials AI-Alerts Classics

Inverse Rational Control with Partially Observable Continuous Nonlinear Dynamics

Neural Information Processing SystemsOct-3-2025, 00:03:50 GMT

However, animals often appear to behave suboptimally.

agent, internal model, model parameter, (13 more...)

Neural Information Processing Systems

Country:

North America > United States > Minnesota (0.04)
North America > United States > Texas > Harris County > Houston (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
(3 more...)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.94)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
(3 more...)

Add feedback

Off-Policy Interval Estimation with Lipschitz Value Iteration

Neural Information Processing SystemsOct-3-2025, 00:03:19 GMT

Reinforcement learning (RL) (e.g., Sutton & Barto, 1998) has become widely used in tasks like Li, 2016; Liu et al., 2018a), estimating the expected reward of a target policy using observational data gathered from previous behavior policies, therefore holds tremendous promise for designing Our method is efficient and provably convergent. Our work is closely related to the off-policy point estimation.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Neural Information Processing Systems

Country: North America > United States > Texas (0.14)

Industry: Health & Medicine (0.75)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.94)

Add feedback

Export Reviews, Discussions, Author Feedback and Meta-Reviews

Neural Information Processing SystemsOct-3-2025, 00:02:05 GMT

First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. Summary: The authors re-explain regularization in optimization problems as a constraint of the type the parameters ${\bf w}$ must belong to the convex set $O$ where the convex set O is obtained as the convex hull of all the points of the form $g.v$ where $v$ is some fix vector, $g$ an element from a group and $.$ is a (linear) group action of element $g$ on vector $v$. More concretely, their main contributions are as follows. For example, the ball associated to the L1 norm can be explained as the convex hull of the points obtained by flipping the sign and permuting the components of the vector $(1,0,0,..,0)$; (B) they show that given a seed $v$ and a group action associated to a group $G$, the notion of $w$ is a member of the convex set $O_G(v)$ can be seen as $v$ is smaller than $w$ under a pre-order; (C) they show that if $-v$ belongs to convex set $O$ then $O$ can be seen as the ball of an atomic norm (as defined in Chandra et al.); (D) they show that the L1-sorted norm equals the dual of the norm associated to the signed-pertumation orbitope; (E) they show how to reinterpret the main steps of conditional and projected gradient algorithms in the language of orbitopes and give a procedure to compute projections onto orbitopes. Quality: There are no technical mistakes in the paper.

algorithm, continuation algorithm, regularization, (12 more...)

Neural Information Processing Systems

Country: North America > Canada > Quebec > Montreal (0.04)

Genre: Personal (0.55)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.67)

Add feedback

Distributional Policy Optimization: An Alternative Approach for Continuous Control

Chen Tessler, Guy Tennenholtz, Shie Mannor

Neural Information Processing SystemsOct-3-2025, 00:01:46 GMT

Neural Information Processing Systems http://nips.cc/

Add feedback

Drop-Muon: Update Less, Converge Faster

Gruntkowska, Kaja, Maziane, Yassine, Qu, Zheng, Richtárik, Peter

arXiv.org Machine LearningOct-3-2025

Conventional wisdom in deep learning optimization dictates updating all layers at every step-a principle followed by all recent state-of-the-art optimizers such as Muon. In this work, we challenge this assumption, showing that full-network updates can be fundamentally suboptimal, both in theory and in practice. We introduce a non-Euclidean Randomized Progressive Training method-Drop-Muon-a simple yet powerful framework that updates only a subset of layers per step according to a randomized schedule, combining the efficiency of progressive training with layer-specific non-Euclidean updates for top-tier performance. We provide rigorous convergence guarantees under both layer-wise smoothness and layer-wise $(L^0, L^1)$-smoothness, covering deterministic and stochastic gradient settings, marking the first such results for progressive training in the stochastic and non-smooth regime. Our cost analysis further reveals that full-network updates are not optimal unless a very specific relationship between layer smoothness constants holds. Through controlled CNN experiments, we empirically demonstrate that Drop-Muon consistently outperforms full-network Muon, achieving the same accuracy up to $1.4\times$ faster in wall-clock time. Together, our results suggest a shift in how large-scale models can be efficiently trained, challenging the status quo and offering a highly efficient, theoretically grounded alternative to full-network updates.

assumption 4, drop-muon, iteration, (14 more...)

arXiv.org Machine Learning

2510.02239

Country:

Asia > Middle East > Jordan (0.04)
Asia > Middle East > Saudi Arabia > Mecca Province > Thuwal (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)

Genre: Research Report > New Finding (0.85)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.87)

Add feedback

Flatness-Aware Stochastic Gradient Langevin Dynamics

Bruno, Stefano, Hwang, Youngsik, An, Jaehyeon, Sabanis, Sotirios, Lim, Dong-Young

arXiv.org Machine LearningOct-3-2025

Generalization in deep learning is closely tied to the pursuit of flat minima in the loss landscape, yet classical Stochastic Gradient Langevin Dynamics (SGLD) offers no mechanism to bias its dynamics toward such low-curvature solutions. This work introduces Flatness-Aware Stochastic Gradient Langevin Dynamics (fSGLD), designed to efficiently and provably seek flat minima in high-dimensional nonconvex optimization problems. At each iteration, fSGLD uses the stochastic gradient evaluated at parameters perturbed by isotropic Gaussian noise, commonly referred to as Random Weight Perturbation (RWP), thereby optimizing a randomized-smoothing objective that implicitly captures curvature information. Leveraging these properties, we prove that the invariant measure of fSGLD stays close to a stationary measure concentrated on the global minimizers of a loss function regularized by the Hessian trace whenever the inverse temperature and the scale of random weight perturbation are properly coupled. This result provides a rigorous theoretical explanation for the benefits of random weight perturbation. In particular, we establish non-asymptotic convergence guarantees in Wasserstein distance with the best known rate and derive an excess-risk bound for the Hessian-trace regularized objective. Extensive experiments on noisy-label and large-scale vision tasks, in both training-from-scratch and fine-tuning settings, demonstrate that fSGLD achieves superior or comparable generalization and robustness to baseline algorithms while maintaining the computational cost of SGD, about half that of SAM. Hessian-spectrum analysis further confirms that fSGLD converges to significantly flatter minima.

assumption 1, fsgld, international conference, (13 more...)

arXiv.org Machine Learning

2510.02174

Country:

Europe > United Kingdom > Scotland > City of Edinburgh > Edinburgh (0.04)
Europe > Greece > Attica > Athens (0.04)
Asia > South Korea > Ulsan > Ulsan (0.04)

Genre: Research Report (0.81)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.88)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.87)

Add feedback

Lower Bounds on Adversarial Robustness for Multiclass Classification with General Loss Functions

Trillos, Camilo Andrés García, Trillos, Nicolás García

arXiv.org Machine LearningOct-3-2025

We consider adversarially robust classification in a multiclass setting under arbitrary loss functions and derive dual and barycentric reformulations of the corresponding learner-agnostic robust risk minimization problem. We provide explicit characterizations for important cases such as the cross-entropy loss, loss functions with a power form, and the quadratic loss, extending in this way available results for the 0-1 loss. These reformulations enable efficient computation of sharp lower bounds for adversarial risks and facilitate the design of robust classifiers beyond the 0-1 loss setting. Our paper uncovers interesting connections between adversarial robustness, $α$-fair packing problems, and generalized barycenter problems for arbitrary positive measures where Kullback-Leibler and Tsallis entropies are used as penalties. Our theoretical results are accompanied with illustrative numerical experiments where we obtain tighter lower bounds for adversarial risks with the cross-entropy loss function.

classifier, cross-entropy loss, loss function, (16 more...)

arXiv.org Machine Learning

2510.01969

Country:

North America > United States > Wisconsin > Dane County > Madison (0.04)
North America > United States > Rhode Island > Providence County > Providence (0.04)

Genre: Research Report (0.81)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)

Add feedback

Deep Hedging Under Non-Convexity: Limitations and a Case for AlphaZero

Maggiolo, Matteo, Nuti, Giuseppe, Štrupl, Miroslav, Szehr, Oleg

arXiv.org Machine LearningOct-3-2025

This paper examines replication portfolio construction in incomplete markets - a key problem in financial engineering with applications in pricing, hedging, balance sheet management, and energy storage planning. We model this as a two-player game between an investor and the market, where the investor makes strategic bets on future states while the market reveals outcomes. Inspired by the success of Monte Carlo Tree Search in stochastic games, we introduce an AlphaZero-based system and compare its performance to deep hedging - a widely used industry method based on gradient descent. Through theoretical analysis and experiments, we show that deep hedging struggles in environments where the $Q$-function is not subject to convexity constraints - such as those involving non-convex transaction costs, capital constraints, or regulatory limitations - converging to local optima. We construct specific market environments to highlight these limitations and demonstrate that AlphaZero consistently finds near-optimal replication strategies. On the theoretical side, we establish a connection between deep hedging and convex optimization, suggesting that its effectiveness is contingent on convexity assumptions. Our experiments further suggest that AlphaZero is more sample-efficient - an important advantage in data-scarce, overfitting-prone derivative markets.

alphazero, deep hedging, non-convexity, (14 more...)

arXiv.org Machine Learning

2510.01874

Country:

Europe > Switzerland (0.04)
North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Genre:

Research Report (1.00)
Overview (0.87)

Industry:

Leisure & Entertainment > Games (1.00)
Energy (1.00)
Banking & Finance > Trading (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)

Add feedback

A reproducible comparative study of categorical kernels for Gaussian process regression, with new clustering-based nested kernels

Perez, Raphaël Carpintero, Da Veiga, Sébastien, Garnier, Josselin

arXiv.org Machine LearningOct-3-2025

Designing categorical kernels is a major challenge for Gaussian process regression with continuous and categorical inputs. Despite previous studies, it is difficult to identify a preferred method, either because the evaluation metrics, the optimization procedure, or the datasets change depending on the study. In particular, reproducible code is rarely available. The aim of this paper is to provide a reproducible comparative study of all existing categorical kernels on many of the test cases investigated so far. We also propose new evaluation metrics inspired by the optimization community, which provide quantitative rankings of the methods across several tasks. From our results on datasets which exhibit a group structure on the levels of categorical inputs, it appears that nested kernels methods clearly outperform all competitors. When the group structure is unknown or when there is no prior knowledge of such a structure, we propose a new clustering-based strategy using target encodings of categorical variables. We show that on a large panel of datasets, which do not necessarily have a known group structure, this estimation strategy still outperforms other approaches while maintaining low computational cost.

dataset, experiment, kernel, (14 more...)

arXiv.org Machine Learning

2510.0184

Country:

Europe > United Kingdom > Scotland > City of Glasgow > Glasgow (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > France > Brittany > Ille-et-Vilaine > Rennes (0.04)

Genre: Research Report > New Finding (0.48)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Data Science (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.68)

Add feedback

Enhancing Electricity-System Resilience with Adaptive Robust Optimization and Conformal Uncertainty Characterization

Chen, Shuyi, Zhu, Shixiang, Sioshansi, Ramteen

arXiv.org Artificial IntelligenceOct-3-2025

Extreme weather is straining electricity systems, exposing the limitations of reactive responses, and prompting the need for proactive resilience planning. Most existing approaches to enhance electricity system resilience employ simplified uncertainty models and decouple proactive and reactive decisions. This paper proposes a novel tri-level optimization model that integrates proactive actions, adversarial disruptions, and reactive responses. Conformal prediction is used to construct distribution-free system-disruption uncertainty sets with coverage guarantees. The tri-level problem is solved by using duality theory to derive a bi-level reformulation and employing Bender's decomposition. Numerical experiments demonstrate that our approach outperforms conventional robust and two-stage methods.

artificial intelligence, optimization problem, tri-level model, (16 more...)

arXiv.org Artificial Intelligence

2505.11627

Country: North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)

Genre: Research Report (0.64)

Industry:

Transportation > Ground > Road (0.68)
Energy > Power Industry > Utilities (0.46)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.67)

Add feedback