stability constraint

Stability and Accuracy Trade-offs in Statistical Estimation

Chakraborty, Abhinav, Luo, Yuetian, Barber, Rina Foygel

arXiv.org Machine Learning

Algorithmic stability is a central concept in statistics and learning theory that measures how sensitive an algorithm's output is to small changes in the training data. Stability plays a crucial role in understanding generalization, robustness, and replicability, and a variety of stability notions have been proposed in different learning settings. However, while stability entails desirable properties, it is typically not sufficient on its own for statistical learning -- and indeed, it may be at odds with accuracy, since an algorithm that always outputs a constant function is perfectly stable but statistically meaningless. Thus, it is essential to understand the potential statistical cost of stability. In this work, we address this question by adopting a statistical decision-theoretic perspective, treating stability as a constraint in estimation. Focusing on two representative notions, worst-case stability and average-case stability, we first establish general lower bounds on the achievable estimation accuracy under each type of stability constraint. We then develop optimal stable estimators for four canonical estimation problems, including several mean estimation and regression settings. Together, these results characterize the optimal trade-offs between stability and accuracy across these tasks. Our findings formalize the intuition that average-case stability imposes a qualitatively weaker restriction than worst-case stability, and they further reveal that the gap between these two can vary substantially across different estimation problems.
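The worst-case vs. average-case distinction can be made concrete with leave-one-out sensitivity: how far an estimator's output moves when a single training point is deleted. The sketch below is only an illustration of the two notions on the sample mean and median, not the paper's estimators or its formal stability definitions.

```python
import numpy as np

def loo_sensitivity(estimator, data):
    """Leave-one-out sensitivity of an estimator: how much its output
    moves when a single point is removed from the sample."""
    full = estimator(data)
    deltas = [abs(estimator(np.delete(data, i)) - full) for i in range(len(data))]
    # max delta ~ worst-case stability; mean delta ~ average-case stability
    return max(deltas), float(np.mean(deltas))

rng = np.random.default_rng(0)
x = rng.normal(size=100)
x[0] = 50.0  # a single outlier

worst_mean, avg_mean = loo_sensitivity(np.mean, x)
worst_med, avg_med = loo_sensitivity(np.median, x)
# The sample mean's worst-case sensitivity is driven entirely by the outlier,
# while the median barely moves no matter which point is deleted.
```

This already shows the gap the paper studies: an estimator can look stable on average while having much larger worst-case sensitivity.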


A Novel Multi-Timescale Stability-Preserving Hierarchical Reinforcement Learning Controller Framework for Adaptive Control in High-Dimensional Dynamical Systems

Khaniki, Mohammad Ali Labbaf, Taroodi, Fateme, Safizadeh, Benyamin

arXiv.org Artificial Intelligence

Controlling high-dimensional stochastic systems, critical in robotics, autonomous vehicles, and hyperchaotic systems, faces the curse of dimensionality, lacks temporal abstraction, and often fails to ensure stochastic stability. To overcome these limitations, this study introduces the Multi-Timescale Lyapunov-Constrained Hierarchical Reinforcement Learning (MTLHRL) framework. MTLHRL integrates a hierarchical policy within a semi-Markov Decision Process (SMDP), featuring a high-level policy for strategic planning and a low-level policy for reactive control, which effectively manages complex, multi-timescale decision-making and reduces dimensionality overhead. Stability is rigorously enforced using a neural Lyapunov function optimized via Lagrangian relaxation and multi-timescale actor-critic updates, ensuring mean-square boundedness or asymptotic stability in the face of stochastic dynamics. The framework promotes efficient and reliable learning through trust-region constraints and decoupled optimization. Extensive simulations on an 8D hyperchaotic system and a 5-DOF robotic manipulator demonstrate MTLHRL's empirical superiority. It significantly outperforms baseline methods in both stability and performance, recording the lowest error indices (e.g., Integral Absolute Error (IAE): 3.912 in hyperchaotic control and IAE: 1.623 in robotics), achieving faster convergence, and exhibiting superior disturbance rejection. MTLHRL offers a theoretically grounded and practically viable solution for robust control of complex stochastic systems.
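The core mechanism, enforcing a Lyapunov decrease condition through Lagrangian relaxation, can be sketched in a few lines. The constraint form below (expected decrease of a candidate Lyapunov function `V` by at least `alpha`) and all names are illustrative assumptions, not the MTLHRL implementation or its exact mean-square boundedness condition.

```python
import numpy as np

def lagrangian_step(policy_loss, V, s, s_next, lam, alpha=0.01, lam_lr=0.1):
    """One Lagrangian-relaxation update for a hypothetical Lyapunov
    constraint E[V(s') - V(s)] <= -alpha.  Primal: penalized policy loss;
    dual: gradient ascent on the multiplier lambda."""
    violation = float(np.mean(V(s_next) - V(s)) + alpha)   # constraint residual
    total_loss = policy_loss + lam * max(violation, 0.0)   # primal objective
    lam_new = max(lam + lam_lr * violation, 0.0)           # dual ascent, lambda >= 0
    return total_loss, lam_new

# Toy quadratic Lyapunov candidate V(s) = ||s||^2 on contracting dynamics.
V = lambda s: np.sum(s**2, axis=-1)
s = np.array([[1.0, 0.0], [0.0, 2.0]])
s_next = 0.9 * s  # V decreases, so the constraint is satisfied
loss, lam = lagrangian_step(policy_loss=1.0, V=V, s=s, s_next=s_next, lam=0.5)
```

When the constraint is satisfied the penalty vanishes and the multiplier decays; when violated, the penalty grows until the policy restores the decrease condition.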


One4Many-StablePacker: An Efficient Deep Reinforcement Learning Framework for the 3D Bin Packing Problem

Gao, Lei, Huang, Shihong, Wang, Shengjie, Ma, Hong, Zhang, Feng, Bao, Hengda, Chen, Qichang, Zhou, Weihua

arXiv.org Artificial Intelligence

The three-dimensional bin packing problem (3D-BPP) is widely applied in logistics and warehousing. Existing learning-based approaches often neglect practical stability-related constraints and exhibit limitations in generalizing across diverse bin dimensions. To address these limitations, we propose a novel deep reinforcement learning framework, One4Many-StablePacker (O4M-SP). The primary advantage of O4M-SP is its ability to handle various bin dimensions in a single training process while incorporating support and weight constraints common in practice. Our training method introduces two innovative mechanisms. First, it employs a weighted reward function that integrates loading rate and a new height difference metric for packing layouts, promoting improved bin utilization through flatter packing configurations. Second, it combines clipped policy gradient optimization with a tailored policy drifting method to mitigate policy entropy collapse, encouraging exploration at critical decision nodes during packing to avoid suboptimal solutions. Extensive experiments demonstrate that O4M-SP generalizes successfully across diverse bin dimensions and significantly outperforms baseline methods. Furthermore, O4M-SP exhibits strong practical applicability by effectively addressing packing scenarios with stability constraints.
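The weighted reward idea, loading rate traded off against a flatness term, can be sketched as follows. The weights and the particular height-difference metric here are hypothetical stand-ins; the paper defines its own metric over packing layouts.

```python
import numpy as np

def packing_reward(heightmap, bin_volume, packed_volume, w_load=1.0, w_height=0.5):
    """Hypothetical weighted reward in the spirit of O4M-SP: reward bin
    utilization, penalize uneven (non-flat) packing surfaces."""
    loading_rate = packed_volume / bin_volume             # fraction of bin filled
    height_diff = heightmap.max() - heightmap.min()       # roughness of the layout
    return w_load * loading_rate - w_height * height_diff / heightmap.max()

# Two layouts with identical loaded volume but different flatness.
flat = np.array([[2.0, 2.0], [2.0, 2.0]])
jagged = np.array([[4.0, 0.0], [0.0, 4.0]])
r_flat = packing_reward(flat, bin_volume=32, packed_volume=8)
r_jag = packing_reward(jagged, bin_volume=32, packed_volume=8)
# Equal loading rate, but the flat layout earns the higher reward,
# which is exactly the bias toward flatter configurations described above.
```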



…the one most critical of the paper, felt that the fundamental method we introduce here (that of constructing explicitly…

Neural Information Processing Systems

We also appreciate that the reviewers, especially Reviewer 2, had some concerns with aspects of the paper. To each of these points, we'd like to make the following comments. We're happy to include the traditional (e.g., Tanh or LSTM) RNN for comparison, and will add this to the … We didn't include Embed2Control-style comparisons … We can include a discussion and illustration of this in the revision. The "simple" model always refers to a simple feedforward network (with the same structure as …). We'll fully describe the video texture setup in the text (e.g., the source videos are actual videos of physical fire from …). Thanks for pointing out the confusion here; we'll clarify all of these. However, we'll certainly discuss this point more. We'll include all these details for the experiments (lack of space to describe them all here).


LAPSO: A Unified Optimization View for Learning-Augmented Power System Operations

Xu, Wangkun, Chu, Zhongda, Teng, Fei

arXiv.org Artificial Intelligence

With the high penetration of renewables, traditional model-based power system operation is challenged to deliver economic, stable, and robust decisions. Machine learning has emerged as a powerful modeling tool for capturing complex dynamics to address these challenges. However, its separate design often lacks systematic integration with existing methods. To fill the gap, this paper proposes a holistic framework of Learning-Augmented Power System Operations (LAPSO, pronounced as Lap-So). Adopting a native optimization perspective, LAPSO is centered on the operation stage and aims to break the boundary between temporally siloed power system tasks, such as forecasting, operation, and control, while unifying the objectives of machine learning and model-based optimization at both training and inference stages. Systematic analysis and simulations demonstrate the effectiveness of applying LAPSO in designing new integrated algorithms, such as stability-constrained optimization (SCO) and objective-based forecasting (OBF), while enabling end-to-end tracing of different sources of uncertainty. In addition, a dedicated Python package, lapso, is introduced to automatically augment existing power system optimization models with learnable components. All code and data are available at https://github.com/xuwkk/lapso_exp.

Index Terms--Power system operation, machine learning, objective-based forecasting, stability-constrained optimization.

A. Background and Motivation

Power system decision-making consists of sequentially connected tasks, including modeling/forecasting, operation, and control (see Figure 1(a)). With the decarbonization need, traditional model-based approaches face significant challenges. For example, the increasing uncertainty associated with renewable generation undermines the reliability of deterministic forecasting and power system operation (PSO) [2]. Meanwhile, the declining share of inertia from synchronous generators (SGs) can cause grid instability [3].
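The objective-based forecasting (OBF) idea, choosing a forecast to minimize downstream operation cost rather than forecast error, can be illustrated with a toy dispatch problem. The asymmetric cost function, prices, and grid search below are all hypothetical; they are not LAPSO's formulation.

```python
import numpy as np

def operation_cost(dispatch, demand, over_price=1.0, under_price=10.0):
    """Hypothetical asymmetric dispatch cost: under-supplying the grid is
    ten times more expensive than over-supplying (illustrative stand-in)."""
    gap = dispatch - demand
    return float(np.where(gap >= 0, over_price * gap, -under_price * gap).mean())

rng = np.random.default_rng(0)
demand = rng.normal(100.0, 10.0, size=1000)

# Accuracy-driven forecast: minimize squared forecast error (the mean).
mse_forecast = demand.mean()

# Objective-based forecast: pick the value that minimizes downstream cost.
grid = np.linspace(80, 130, 501)
obf_forecast = grid[np.argmin([operation_cost(g, demand) for g in grid])]
# With asymmetric costs the OBF solution is biased above the mean,
# because under-supply is far more costly than over-supply.
```

This is the sense in which the abstract speaks of "unifying the objectives of machine learning and model-based optimizations": the forecaster's training signal becomes the operation cost itself.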


Reviews: Learning to Exploit Stability for 3D Scene Parsing

Neural Information Processing Systems

The goal of this paper is to output a set of 3D bounding boxes and a set of dominant planes for a scene depicted in a single image. The key insight is to incorporate stability constraints in the 3D layout, i.e., the reconstructed 3D boxes should not move too far under simulation (in Bullet) with physical forces (gravity, friction). Parameters for 3D boxes are regressed using a modified R-CNN training loss, and dominant planes for the walls and floors are regressed via an RNN. A stability criterion is used to update the output 3D scene (via REINFORCE): the predicted 3D layout is run through the Bullet simulator and 3D displacements are checked. Results are shown on synthetic (SUNCG, SceneNet RGB-D) and real (SUN RGB-D) datasets, outperforming the factored 3D approach of [Tulsiani18].
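The displacement-based stability criterion used as the REINFORCE signal can be sketched simply. In the paper the post-simulation poses come from Bullet; here `boxes_after` is just an array of simulated box centers, and the threshold and reward values are illustrative assumptions.

```python
import numpy as np

def stability_reward(boxes_before, boxes_after, threshold=0.05):
    """Binary stability reward: a predicted 3D layout counts as stable if
    no box center moves more than `threshold` under physics simulation.
    `boxes_after` would come from a simulator such as Bullet; here it is
    supplied directly (stand-in, not the paper's code)."""
    displacement = np.linalg.norm(boxes_after - boxes_before, axis=1)
    return 1.0 if displacement.max() <= threshold else -1.0

before = np.array([[0.0, 0.0, 0.5], [0.0, 0.0, 1.5]])  # two stacked boxes
settled = before + 0.01                                 # barely moves: stable
toppled = before + np.array([[0.0, 0.0, 0.0],
                             [0.8, 0.0, -1.0]])         # top box slides off
```

A layout whose boxes stay put earns positive reward; one whose boxes topple is penalized, pushing the box regressor toward physically plausible scenes.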


Imitation Learning from Observations: An Autoregressive Mixture of Experts Approach

Wang, Renzi, Acerbo, Flavia Sofia, Son, Tong Duy, Patrinos, Panagiotis

arXiv.org Artificial Intelligence

This paper presents a novel approach to imitation learning from observations, where an autoregressive mixture of experts model is deployed to fit the underlying policy. The parameters of the model are learned via a two-stage framework. By leveraging the existing dynamics knowledge, the first stage of the framework estimates the control input sequences and hence reduces the problem complexity. At the second stage, the policy is learned by solving a regularized maximum-likelihood estimation problem using the estimated control input sequences. We further extend the learning procedure by incorporating a Lyapunov stability constraint to ensure asymptotic stability of the identified model, for accurate multi-step predictions. The effectiveness of the proposed framework is validated using two autonomous driving datasets collected from human demonstrations, demonstrating its practical applicability in modelling complex nonlinear dynamics.
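The second-stage objective, regularized maximum likelihood for a mixture-of-experts policy, can be sketched for linear experts with Gaussian noise. The gating/expert parameterization, the L2 regularizer, and the omission of the autoregressive structure and Lyapunov constraint are all simplifications of this sketch, not the paper's exact model.

```python
import numpy as np

def moe_nll(X, U, gate_W, expert_Ws, sigma=1.0, reg=1e-3):
    """Regularized negative log-likelihood (up to an additive constant) of a
    mixture-of-linear-experts policy:
        p(u | x) = sum_k softmax(x @ gate_W)_k * N(u; x @ W_k, sigma^2 I)."""
    logits = X @ gate_W                                    # (n, K) gating scores
    gates = np.exp(logits - logits.max(axis=1, keepdims=True))
    gates /= gates.sum(axis=1, keepdims=True)              # softmax mixing weights
    lik = 0.0
    for k, W in enumerate(expert_Ws):
        resid = U - X @ W                                  # expert-k residuals
        log_comp = -0.5 * np.sum(resid**2, axis=1) / sigma**2
        lik = lik + gates[:, k] * np.exp(log_comp)         # mixture likelihood
    nll = -np.log(lik + 1e-12).mean()
    penalty = reg * sum(np.sum(W**2) for W in expert_Ws)   # L2 regularizer
    return nll + penalty

# Demonstrations generated by a single linear "expert" policy.
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 2))
W_true = np.array([[1.0], [0.5]])
U = X @ W_true
nll_good = moe_nll(X, U, np.zeros((2, 2)), [W_true, W_true])
nll_bad = moe_nll(X, U, np.zeros((2, 2)), [np.zeros((2, 1)), np.zeros((2, 1))])
# The objective is lower when the experts match the demonstrating policy.
```

In the paper this objective is minimized over the estimated control input sequences from stage one, with the Lyapunov stability constraint added on top.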