control frequency
We assume the environment is modeled as a discrete-time factored-action MDP (FA-MDP) $M = \langle S, A, P, R, \gamma \rangle$, where $S$ is the set of states $s$, $A$ is the set of vector-represented actions $a = (a_1, \ldots, a_m)$, $P(s'|s,a) = \Pr(s_{t+1} = s' \mid s_t = s, a_t = a)$ is the transition probability, $R(s,a) \in \mathbb{R}$ is the immediate reward for taking action $a$ in state $s$, and $\gamma \in [0,1)$ is the discount factor.
Reinforcement Learning for Control with Multiple Frequencies
Many real-world sequential decision problems involve multiple action variables whose control frequencies are different, such that actions take effect at different periods. While these problems can be formulated with the notion of multiple action persistences in a factored-action MDP (FA-MDP), it is non-trivial to solve them efficiently, since an action-persistent policy constructed from a stationary policy can be arbitrarily suboptimal, rendering solution methods for standard FA-MDPs hardly applicable. In this paper, we formalize the problem of multiple control frequencies in RL and provide an efficient solution method. Our proposed method, Action-Persistent Policy Iteration (AP-PI), provides a theoretical guarantee of convergence to an optimal solution while incurring only a factor of $|A|$ increase in time complexity during the policy improvement step, compared to standard policy iteration for FA-MDPs.
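The per-dimension action persistence described in this abstract can be sketched in a few lines: each action dimension is re-decided only at its own control period and otherwise carries over its previous value. This is a minimal illustration, not the AP-PI algorithm itself; the names `c` (per-dimension periods) and `base_policy` are illustrative.

```python
# Sketch: executing a factored action under per-dimension persistence.
# Dimension i is refreshed only every c[i] steps; in between, the
# previously chosen value "persists".

def persistent_action(t, last_action, c, base_policy, state):
    """Return the action actually applied at step t.

    c[i] is the control period of dimension i; dimension i is re-decided
    only when t % c[i] == 0, otherwise its last value is carried over.
    """
    proposed = base_policy(state)  # fresh decision for every dimension
    return tuple(
        proposed[i] if t % c[i] == 0 else last_action[i]
        for i in range(len(c))
    )

# Toy rollout: dimension 0 updates every step, dimension 1 every 3 steps.
policy = lambda s: (s % 2, s % 2)  # dummy stationary policy
a = (0, 0)
trace = []
for t in range(6):
    a = persistent_action(t, a, c=(1, 3), base_policy=policy, state=t)
    trace.append(a)
```

Note how dimension 1 holds its value across three steps even though the underlying stationary policy proposes a new value every step — this is exactly the gap that makes naively persisted stationary policies suboptimal.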
Variable-Impedance Muscle Coordination under Slow-Rate Control Frequencies and Limited Observation Conditions Evaluated through Legged Locomotion
Asai, Hidaka, Noda, Tomoyuki, Morimoto, Jun
Human motor control remains agile and robust despite limited sensory information for feedback, a property attributed to the body's ability to perform morphological computation through muscle coordination with variable impedance. However, it remains unclear how such low-level mechanical computation reduces the control requirements of the high-level controller. In this study, we implement a hierarchical controller consisting of a high-level neural network trained by reinforcement learning and a low-level variable-impedance muscle coordination model with mono- and biarticular muscles in a monoped locomotion task. We systematically restrict the high-level controller by varying the control frequency and by introducing biologically inspired observation conditions: delayed, partial, and substituted observation. Under these conditions, we evaluate how the low-level variable-impedance muscle coordination contributes to the learning process of the high-level neural network. The results show that variable-impedance muscle coordination enables stable locomotion even under slow-rate control frequencies and limited observation conditions. These findings demonstrate that the morphological computation of muscle coordination effectively offloads high-frequency feedback from the high-level controller and provide a design principle for controllers in motor control.
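The delayed and partial observation restrictions described above can be expressed as simple transforms on the observation vector fed to the high-level controller. This is an illustrative sketch, not the paper's implementation; `DelayedObservation`, `partial_observation`, and all parameters are assumed names.

```python
# Sketch of biologically inspired observation restrictions:
# a fixed sensory delay and a partial (masked) observation.

from collections import deque

class DelayedObservation:
    """Return the observation from `delay` steps ago (zero-filled at start)."""
    def __init__(self, delay, obs_dim):
        self.buffer = deque([[0.0] * obs_dim] * delay, maxlen=delay + 1)

    def __call__(self, obs):
        self.buffer.append(list(obs))
        return self.buffer[0]   # oldest entry = delayed observation

def partial_observation(obs, visible_idx):
    """Keep only the sensor channels listed in `visible_idx`."""
    return [obs[i] for i in visible_idx]

# With delay=2, the controller only sees what happened two steps earlier.
delayed = DelayedObservation(delay=2, obs_dim=3)
outs = [delayed([t, t, t]) for t in range(4)]
```

Wrapping the environment's observation this way lets the same RL training loop be reused across all observation conditions, which matches the systematic-restriction protocol the abstract describes.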
TARC: Time-Adaptive Robotic Control
Sukhija, Arnav, Treven, Lenart, Cheng, Jin, Dörfler, Florian, Coros, Stelian, Krause, Andreas
Fixed-frequency control in robotics imposes a trade-off between the efficiency of low-frequency control and the robustness of high-frequency control, a limitation not seen in adaptable biological systems. We address this with a reinforcement learning approach in which policies jointly select control actions and their application durations, enabling robots to autonomously modulate their control frequency in response to situational demands. We validate our method with zero-shot sim-to-real experiments on two distinct hardware platforms: a high-speed RC car and a quadrupedal robot. Our method matches or outperforms fixed-frequency baselines in terms of rewards while significantly reducing the control frequency and exhibiting adaptive frequency control under real-world conditions.
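A policy that jointly selects an action and its application duration, as described above, changes the shape of the control loop: time advances by a variable number of steps per decision. The sketch below shows that loop with a hand-written stand-in policy; `hold_steps` and the "hold longer when calm" rule are illustrative assumptions, not the paper's learned policy.

```python
# Sketch: a control loop where each decision carries its own duration.

def rollout_variable_rate(policy, env_step, state, horizon):
    """Apply (action, duration) pairs; each action is held for its duration."""
    t, decisions = 0, 0
    while t < horizon:
        action, hold_steps = policy(state)
        decisions += 1
        for _ in range(min(hold_steps, horizon - t)):
            state = env_step(state, action)
            t += 1
    return state, decisions

# Toy example: hold longer when |state| is small (a "calm" situation).
policy = lambda s: (-0.5 * s, 4 if abs(s) < 1.0 else 1)
env_step = lambda s, a: s + 0.1 * a          # s <- 0.95 * s per step
final, n = rollout_variable_rate(policy, env_step, state=1.2, horizon=10)
```

Here the controller makes fewer decisions than a fixed per-step policy over the same horizon, which is the efficiency side of the trade-off the abstract describes.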
Heterogeneous object manipulation on nonlinear soft surface through linear controller
Ingle, Pratik, Støy, Kasper, Faiña, Andres
Manipulation surfaces indirectly control and reposition objects by actively modifying their shape or properties rather than directly gripping objects. These surfaces, equipped with dense actuator arrays, generate dynamic deformations. However, a high-density actuator array introduces considerable complexity due to increased degrees of freedom (DOF), complicating control tasks. High DOF restrict the implementation and utilization of manipulation surfaces in real-world applications, as the maintenance and control of such systems increase exponentially with array/surface size. Learning-based control approaches may ease the control complexity, but they require extensive training samples and struggle to generalize to heterogeneous objects. In this study, we introduce a simple, precise and robust PID-based linear closed-loop feedback control strategy for heterogeneous object manipulation on MANTA-RAY (Manipulation with Adaptive Non-rigid Textile Actuation with Reduced Actuation density). Our approach employs a geometric transformation-driven PID controller, directly mapping tilt angle control outputs (1D/2D) to actuator commands to eliminate the need for extensive black-box training. We validate the proposed method through simulations and experiments on a physical system, successfully manipulating objects with diverse geometries, weights and textures, including fragile objects like eggs and apples. The outcomes demonstrate that our approach is highly generalizable and offers a practical and reliable solution for object manipulation on soft robotic manipulation surfaces, facilitating real-world implementation without prohibitive training demands.
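The core idea above — a PID loop producing tilt-angle commands, plus a fixed geometric map from tilt angles to actuator commands — can be sketched compactly. All gains, the small-angle approximation, and the four-corner actuator layout below are illustrative assumptions, not the paper's calibration.

```python
# Sketch: PID on object position error -> tilt angles -> actuator heights.

class PID:
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral, self.prev_err = 0.0, 0.0

    def step(self, err):
        self.integral += err * self.dt
        deriv = (err - self.prev_err) / self.dt
        self.prev_err = err
        return self.kp * err + self.ki * self.integral + self.kd * deriv

def tilt_to_actuators(tilt_x, tilt_y, half_span=0.5):
    """Geometric map: two tilt angles (small-angle approximation) to
    height offsets for 4 corner actuators at (+-half_span, +-half_span)."""
    return [
        sx * half_span * tilt_x + sy * half_span * tilt_y
        for sx, sy in [(-1, -1), (1, -1), (-1, 1), (1, 1)]
    ]

pid_x = PID(kp=1.0, ki=0.0, kd=0.0, dt=0.1)
tilt = pid_x.step(err=0.2)            # 0.2 m position error -> tilt command
heights = tilt_to_actuators(tilt, 0.0)
```

Because the only learned-free components are one scalar PID per axis and a linear geometric map, the controller side stays independent of the actuator count — which is the point of avoiding black-box training over a high-DOF array.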
CHEQ-ing the Box: Safe Variable Impedance Learning for Robotic Polishing
Cramer, Emma, Jäschke, Lukas, Trimpe, Sebastian
Robotic systems are increasingly employed for industrial automation, with contact-rich tasks like polishing requiring dexterity and compliant behaviour. These tasks are difficult to model, making classical control challenging. Deep reinforcement learning (RL) offers a promising solution by enabling the learning of models and control policies directly from data. However, its application to real-world problems is limited by data inefficiency and unsafe exploration. Adaptive hybrid RL methods blend classical control and RL adaptively, combining the strengths of both: structure from control and learning from RL. This has led to improvements in data efficiency and exploration safety. However, their potential for hardware applications remains underexplored, with no evaluations on physical systems to date. Such evaluations are critical to fully assess the practicality and effectiveness of these methods in real-world settings. This work presents an experimental demonstration of the hybrid RL algorithm CHEQ for robotic polishing with variable impedance, a task requiring precise force and velocity tracking. In simulation, we show that variable impedance enhances polishing performance. We compare standalone RL with adaptive hybrid RL, demonstrating that CHEQ achieves effective learning while adhering to safety constraints. On hardware, CHEQ achieves effective polishing behaviour, requiring only eight hours of training and incurring just five failures. These results highlight the potential of adaptive hybrid RL for real-world, contact-rich tasks trained directly on hardware.
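The adaptive hybrid idea above — blending a classical controller with an RL policy, weighting the RL action more as its estimated uncertainty shrinks — reduces to a small weighting rule. The uncertainty proxy and linear schedule below are illustrative stand-ins, not CHEQ's actual estimator.

```python
# Sketch: uncertainty-weighted blend of a classical controller and RL policy.

def blended_action(a_ctrl, a_rl, uncertainty, u_max=1.0):
    """Applied action = lam * a_rl + (1 - lam) * a_ctrl.

    lam grows toward 1 as uncertainty falls toward 0; clamped to [0, 1],
    so early in training (high uncertainty) the safe controller dominates.
    """
    lam = max(0.0, min(1.0, 1.0 - uncertainty / u_max))
    return lam * a_rl + (1.0 - lam) * a_ctrl, lam

# Moderate uncertainty: the blend sits between controller and RL action.
a, lam = blended_action(a_ctrl=0.0, a_rl=1.0, uncertainty=0.25)
```

The safety property in this scheme comes from the limit behavior: at maximal uncertainty the applied action is exactly the classical controller's, which is what makes on-hardware exploration tolerable.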
Coordinated Power Smoothing Control for Wind Storage Integrated System with Physics-informed Deep Reinforcement Learning
Wang, Shuyi, Zhao, Huan, Cao, Yuji, Pan, Zibin, Liu, Guolong, Liang, Gaoqi, Zhao, Junhua
However, the intermittent nature of wind power introduces inherent variability and uncertainty when integrated into power systems. As the wind power penetration level increases, the secure and reliable operation of power systems becomes a significant challenge [1]. In practice, the grid usually requires the active power fluctuation from wind farms to be confined to a specific value within a one-minute time window [2]. Therefore, wind power smoothing control (PSC) has emerged as a potential solution. Previous research has established two major categories of PSC for wind farms: regulation control of wind turbines, and indirect power control by a Battery Energy Storage System (BESS). The former approach typically involves pitch angle control [3], rotor inertia control [4], and Direct Current (DC)-link voltage control [5], which require operation away from maximum power point tracking, causing inefficiency and potential damage [6]. In contrast, with a stronger power smoothing capability, BESS-based PSC coordinates the active power from the BESS and the wind turbine [7], providing rapid response to power fluctuations with high operability and little power loss. Recognizing the benefits of such Wind Storage Integrated Systems (WSIS) [8], incentive policies have been introduced to mandate the installation of BESSs at 10% to 30% of wind farms' installed capacity. WSIS facilitates wind power storage, allocation, and smoothing, enhancing delivery stability and energy management flexibility for both the grid and the wind farm.
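The BESS-based smoothing mechanism described above can be illustrated with a toy rule: the combined output tracks a moving-average reference of recent wind power, and the battery injects or absorbs the gap. The window length and power values below are illustrative; the grid codes cited above specify fluctuation limits over a one-minute window.

```python
# Sketch: BESS covers the gap between raw wind power and a smoothed
# moving-average reference, so the combined output varies slowly.

def smooth_with_bess(wind, window):
    """Return (combined output, BESS power) per step.

    BESS power > 0 means discharging (adding power to the grid side),
    BESS power < 0 means charging (absorbing surplus wind power).
    """
    combined, bess, history = [], [], []
    for p in wind:
        history.append(p)
        ref = sum(history[-window:]) / min(len(history), window)
        combined.append(ref)
        bess.append(ref - p)
    return combined, bess

wind = [10.0, 14.0, 6.0, 12.0]          # fluctuating wind power (MW)
combined, bess = smooth_with_bess(wind, window=2)
```

A practical PSC policy must also respect the battery's state-of-charge and power-rating limits, which is precisely the coordination problem the reinforcement learning controller in this work is meant to solve.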