AITopics | control problem

Collaborating Authors

control problem

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Probabilistic inverse optimal control for non-linear partially observable systems disentangles perceptual uncertainty and behavioral costs

Neural Information Processing SystemsFeb-8-2026, 06:45:06 GMT

Inverse optimal control can be used to characterize behavior in sequential decision-making tasks.

artificial intelligence, machine learning, reinforcement learning, (20 more...)

Neural Information Processing Systems

Country:

Europe > Germany > Hesse > Darmstadt Region > Darmstadt (0.05)
North America > United States > California > San Diego County > San Diego (0.04)
Asia > Middle East > Jordan (0.04)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.90)
(4 more...)

Add feedback

Tight Rates for Bandit Control Beyond Quadratics

Neural Information Processing SystemsDec-25-2025, 18:07:31 GMT

Unlike classical control theory, such as Linear Quadratic Control (LQC), real-world control problems are highly complex. These problems often involve adversarial perturbations, bandit feedback models, and non-quadratic, adversarially chosen cost functions. A fundamental yet unresolved question is whether optimal regret can be achieved for these general control problems. The standard approach to addressing this problem involves a reduction to bandit convex optimization with memory. In the bandit setting, constructing a gradient estimator with low variance is challenging due to the memory structure and non-quadratic loss functions.In this paper, we provide an affirmative answer to this question. Our main contribution is an algorithm that achieves an $\tilde{O}(\sqrt{T})$ optimal regret for bandit non-stochastic control with strongly-convex and smooth cost functions in the presence of adversarial perturbations, improving the previously known $\tilde{O}(T^{2/3})$ regret bound from \citep{cassel2020bandit}. Our algorithm overcomes the memory issue by reducing the problem to Bandit Convex Optimization (BCO) without memory and addresses general strongly-convex costs using recent advancements in BCO from \citep{suggala2024second}. Along the way, we develop an improved algorithm for BCO with memory, which may be of independent interest.

artificial intelligence, machine learning, neural information processing system 37, (11 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.76)

Add feedback

Neural Lyapunov Control

Neural Information Processing SystemsDec-25-2025, 03:51:36 GMT

We propose new methods for learning control policies and neural network Lyapunov functions for nonlinear control problems, with provable guarantee of stability. The framework consists of a learner that attempts to find the control and Lyapunov functions, and a falsifier that finds counterexamples to quickly guide the learner towards solutions. The procedure terminates when no counterexample is found by the falsifier, in which case the controlled nonlinear system is provably stable. The approach significantly simplifies the process of Lyapunov control design, provides end-to-end correctness guarantee, and can obtain much larger regions of attraction than existing methods such as LQR and SOS/SDP. We show experiments on how the new methods obtain high-quality solutions for challenging robot control problems such as path tracking for wheeled vehicles and humanoid robot balancing.

artificial intelligence, name change, proceedings, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Robots (1.00)

Add feedback

Unifying Entropy Regularization in Optimal Control: From and Back to Classical Objectives via Iterated Soft Policies and Path Integral Solutions

Bhole, Ajinkya, Filabadi, Mohammad Mahmoudi, Crevecoeur, Guillaume, Lefebvre, Tom

arXiv.org Artificial IntelligenceDec-10-2025

This paper develops a unified perspective on several stochastic optimal control formulations through the lens of Kullback-Leibler regularization. We propose a central problem that separates the KL penalties on policies and transitions, assigning them independent weights, thereby generalizing the standard trajectory-level KL-regularization commonly used in probabilistic and KL-regularized control. This generalized formulation acts as a generative structure allowing to recover various control problems. These include the classical Stochastic Optimal Control (SOC), Risk-Sensitive Optimal Control (RSOC), and their policy-based KL-regularized counterparts. The latter we refer to as soft-policy SOC and RSOC, facilitating alternative problems with tractable solutions. Beyond serving as regularized variants, we show that these soft-policy formulations majorize the original SOC and RSOC problem. This means that the regularized solution can be iterated to retrieve the original solution. Furthermore, we identify a structurally synchronized case of the risk-seeking soft-policy RSOC formulation, wherein the policy and transition KL-regularization weights coincide. Remarkably, this specific setting gives rise to several powerful properties such as a linear Bellman equation, path integral solution, and, compositionality, thereby extending these computationally favourable properties to a broad class of control problems.

artificial intelligence, formulation, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2512.06109

Country: Europe > Belgium > Flanders > East Flanders > Ghent (0.04)

Genre: Research Report (0.84)

Technology:

Information Technology > Control Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.94)

Add feedback

Fault Tolerant Control of Mecanum Wheeled Mobile Robots

Ma, Xuehui, Zhang, Shiliang, Sun, Zhiyong

arXiv.org Artificial IntelligenceDec-9-2025

Mecanum wheeled mobile robots (MWMRs) are highly susceptible to actuator faults that degrade performance and risk mission failure. Current fault tolerant control (FTC) schemes for MWMRs target complete actuator failures like motor stall, ignoring partial faults e.g., in torque degradation. We propose an FTC strategy handling both fault types, where we adopt posterior probability to learn real-time fault parameters. We derive the FTC law by aggregating probability-weighed control laws corresponding to predefined faults. This ensures the robustness and safety of MWMR control despite varying levels of fault occurrence. Simulation results demonstrate the effectiveness of our FTC under diverse scenarios.

artificial intelligence, dyna, mobile robot, (15 more...)

arXiv.org Artificial Intelligence

2512.06444

Country:

Europe > Norway > Eastern Norway > Oslo (0.04)
Asia > China > Shaanxi Province > Xi'an (0.04)

Genre: Research Report > New Finding (0.48)

Technology: Information Technology > Artificial Intelligence > Robots > Locomotion (0.64)

Add feedback

Global Convergence of Policy Gradient for Entropy Regularized Linear-Quadratic Control with Multiplicative Noise

Diaz, Gabriel, Li, Lucky, Zhang, Wenhao

arXiv.org Artificial IntelligenceDec-2-2025

Reinforcement Learning (RL) has emerged as a powerful framework for sequential decision-making in dynamic environments, particularly when system parameters are unknown. This paper investigates RL-based control for entropy-regularized linear-quadratic (LQ) control problems with multiplicative noise over an infinite time horizon. First, we adapt the regularized policy gradient (RPG) algorithm to stochastic optimal control settings, proving that despite the non-convexity of the problem, RPG converges globally under conditions of gradient domination and almost-smoothness. Second, based on zero-order optimization approach, we introduce a novel model free RL algorithm: Sample-based regularized policy gradient (SB-RPG). SB-RPG operates without knowledge of system parameters yet still retains strong theoretical guarantees of global convergence. Our model leverages entropy regularization to address the exploration versus exploitation trade-off inherent in RL. Numerical simulations validate the theoretical results and demonstrate the efficiency of SB-RPG in unknown-parameters environments.

lemma 4, machine learning, reinforcement learning, (17 more...)

arXiv.org Artificial Intelligence

2510.02896

Country:

North America > United States > California > Alameda County > Berkeley (0.14)
Asia > China > Hong Kong (0.04)

Genre:

Research Report (1.00)
Overview (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Multistage Campaigning in Social Networks

Mehrdad Farajtabar, Xiaojing Ye, Sahar Harati, Le Song, Hongyuan Zha

Neural Information Processing SystemsNov-21-2025, 09:03:18 GMT

We consider the problem of how to optimize multi-stage campaigning over social networks. The dynamic programming framework is employed to balance the high present reward and large penalty on low future outcome in the presence of extensive uncertainties. In particular, we establish theoretical foundations of optimal campaigning over social networks where the user activities are modeled as a multivariate Hawkes process, and we derive a time dependent linear relation between the intensity of exogenous events and several commonly used objective functions of campaigning. We further develop a convex dynamic programming framework for determining the optimal intervention policy that prescribes the required level of external drive at each stage for the desired campaigning result. Experiments on both synthetic data and the real-world MemeTracker dataset show that our algorithm can steer the user activities for optimal campaigning much more accurately than baselines.

artificial intelligence, exposure, social media, (18 more...)

Neural Information Processing Systems

Country:

North America > United States (0.14)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)

Industry:

Government (0.93)
Information Technology > Services (0.82)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.55)

Add feedback

Policy-Conditioned Uncertainty Sets for Robust Markov Decision Processes Andrea Tirinzoni Politecnico di Milano andrea.tirinzoni@polimi.it Xiangli Chen Amazon Robotics

Neural Information Processing SystemsNov-20-2025, 17:42:58 GMT

What policy should be employed in a Markov decision process with uncertain parameters? Robust optimization's answer to this question is to use rectangular

constraint, machine learning, reinforcement learning, (19 more...)

Neural Information Processing Systems

Country:

North America > United States > New Hampshire (0.04)
North America > United States > Massachusetts (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
(4 more...)

Industry: Information Technology > Robotics & Automation (0.40)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.71)
(2 more...)

Add feedback

836cf992a71f7a0bda218c180f942902-Paper-Conference.pdf

Neural Information Processing SystemsNov-19-2025, 19:18:54 GMT

artificial intelligence, exp, machine learning, (18 more...)

Neural Information Processing Systems

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.04)
Europe > Germany > Saxony > Leipzig (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry: Information Technology (0.92)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)

Add feedback

Policy-Conditioned Uncertainty Sets for Robust Markov Decision Processes Andrea Tirinzoni Politecnico di Milano andrea.tirinzoni@polimi.it Xiangli Chen Amazon Robotics

Neural Information Processing SystemsNov-18-2025, 01:53:57 GMT

What policy should be employed in a Markov decision process with uncertain parameters? Robust optimization's answer to this question is to use rectangular

constraint, machine learning, reinforcement learning, (19 more...)

Neural Information Processing Systems

Country:

North America > United States > New Hampshire (0.04)
North America > United States > Massachusetts (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
(4 more...)

Industry: Information Technology > Robotics & Automation (0.40)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.71)
(2 more...)

Add feedback