control frequency
We assume the environment is modeled as a discrete-time factored-action MDP (FA-MDP) $M = \langle S, A, P, R, \gamma \rangle$, where $S$ is the set of states $s$, $A$ is the set of vector-represented actions $a = (a_1, \ldots, a_m)$, $P(s'|s,a) = \Pr(s_{t+1} = s' \mid s_t = s, a_t = a)$ is the transition probability, $R(s,a) \in \mathbb{R}$ is the immediate reward for taking action $a$ in state $s$, and $\gamma \in [0,1)$ is the discount factor.
Reinforcement Learning for Control with Multiple Frequencies
Many real-world sequential decision problems involve multiple action variables whose control frequencies are different, such that actions take effect at different periods. While these problems can be formulated with the notion of multiple action persistences in a factored-action MDP (FA-MDP), it is non-trivial to solve them efficiently, since an action-persistent policy constructed from a stationary policy can be arbitrarily suboptimal, rendering solution methods for standard FA-MDPs hardly applicable. In this paper, we formalize the problem of multiple control frequencies in RL and provide an efficient solution method. Our proposed method, Action-Persistent Policy Iteration (AP-PI), provides a theoretical guarantee of convergence to an optimal solution while incurring only a factor of $|A|$ increase in time complexity during the policy improvement step, compared to standard policy iteration for FA-MDPs.
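The per-dimension action persistence described in this abstract can be sketched in a few lines: each action dimension is re-decided only at its own control period and otherwise carries over its previous value. This is a minimal illustration, not the AP-PI algorithm itself; the names `c` (per-dimension periods) and `base_policy` are illustrative.

```python
# Sketch: executing a factored action under per-dimension persistence.
# Dimension i is refreshed only every c[i] steps; in between, the
# previously chosen value "persists".

def persistent_action(t, last_action, c, base_policy, state):
    """Return the action actually applied at step t.

    c[i] is the control period of dimension i; dimension i is re-decided
    only when t % c[i] == 0, otherwise its last value is carried over.
    """
    proposed = base_policy(state)  # fresh decision for every dimension
    return tuple(
        proposed[i] if t % c[i] == 0 else last_action[i]
        for i in range(len(c))
    )

# Toy rollout: dimension 0 updates every step, dimension 1 every 3 steps.
policy = lambda s: (s % 2, s % 2)  # dummy stationary policy
a = (0, 0)
trace = []
for t in range(6):
    a = persistent_action(t, a, c=(1, 3), base_policy=policy, state=t)
    trace.append(a)
```

Note how dimension 1 holds its value across three steps even though the underlying stationary policy proposes a new value every step — this is exactly the gap that makes naively persisted stationary policies suboptimal.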
Variable-Impedance Muscle Coordination under Slow-Rate Control Frequencies and Limited Observation Conditions Evaluated through Legged Locomotion
Asai, Hidaka, Noda, Tomoyuki, Morimoto, Jun
Human motor control remains agile and robust despite limited sensory information for feedback, a property attributed to the body's ability to perform morphological computation through muscle coordination with variable impedance. However, it remains unclear how such low-level mechanical computation reduces the control requirements of the high-level controller. In this study, we implement a hierarchical controller consisting of a high-level neural network trained by reinforcement learning and a low-level variable-impedance muscle coordination model with mono- and biarticular muscles in a monoped locomotion task. We systematically restrict the high-level controller by varying the control frequency and by introducing biologically inspired observation conditions: delayed, partial, and substituted observation. Under these conditions, we evaluate how the low-level variable-impedance muscle coordination contributes to the learning process of the high-level neural network. The results show that variable-impedance muscle coordination enables stable locomotion even under slow-rate control frequencies and limited observation conditions. These findings demonstrate that the morphological computation of muscle coordination effectively offloads high-frequency feedback from the high-level controller and provide a design principle for controllers in motor control.
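The delayed and partial observation restrictions described above can be expressed as simple transforms on the observation vector fed to the high-level controller. This is an illustrative sketch, not the paper's implementation; `DelayedObservation`, `partial_observation`, and all parameters are assumed names.

```python
# Sketch of biologically inspired observation restrictions:
# a fixed sensory delay and a partial (masked) observation.

from collections import deque

class DelayedObservation:
    """Return the observation from `delay` steps ago (zero-filled at start)."""
    def __init__(self, delay, obs_dim):
        self.buffer = deque([[0.0] * obs_dim] * delay, maxlen=delay + 1)

    def __call__(self, obs):
        self.buffer.append(list(obs))
        return self.buffer[0]   # oldest entry = delayed observation

def partial_observation(obs, visible_idx):
    """Keep only the sensor channels listed in `visible_idx`."""
    return [obs[i] for i in visible_idx]

# With delay=2, the controller only sees what happened two steps earlier.
delayed = DelayedObservation(delay=2, obs_dim=3)
outs = [delayed([t, t, t]) for t in range(4)]
```

Wrapping the environment's observation this way lets the same RL training loop be reused across all observation conditions, which matches the systematic-restriction protocol the abstract describes.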
TARC: Time-Adaptive Robotic Control
Sukhija, Arnav, Treven, Lenart, Cheng, Jin, Dörfler, Florian, Coros, Stelian, Krause, Andreas
Fixed-frequency control in robotics imposes a trade-off between the efficiency of low-frequency control and the robustness of high-frequency control, a limitation not seen in adaptable biological systems. We address this with a reinforcement learning approach in which policies jointly select control actions and their application durations, enabling robots to autonomously modulate their control frequency in response to situational demands. We validate our method with zero-shot sim-to-real experiments on two distinct hardware platforms: a high-speed RC car and a quadrupedal robot. Our method matches or outperforms fixed-frequency baselines in terms of rewards while significantly reducing the control frequency and exhibiting adaptive frequency control under real-world conditions.
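A policy that jointly selects an action and its application duration, as described above, changes the shape of the control loop: time advances by a variable number of steps per decision. The sketch below shows that loop with a hand-written stand-in policy; `hold_steps` and the "hold longer when calm" rule are illustrative assumptions, not the paper's learned policy.

```python
# Sketch: a control loop where each decision carries its own duration.

def rollout_variable_rate(policy, env_step, state, horizon):
    """Apply (action, duration) pairs; each action is held for its duration."""
    t, decisions = 0, 0
    while t < horizon:
        action, hold_steps = policy(state)
        decisions += 1
        for _ in range(min(hold_steps, horizon - t)):
            state = env_step(state, action)
            t += 1
    return state, decisions

# Toy example: hold longer when |state| is small (a "calm" situation).
policy = lambda s: (-0.5 * s, 4 if abs(s) < 1.0 else 1)
env_step = lambda s, a: s + 0.1 * a          # s <- 0.95 * s per step
final, n = rollout_variable_rate(policy, env_step, state=1.2, horizon=10)
```

Here the controller makes fewer decisions than a fixed per-step policy over the same horizon, which is the efficiency side of the trade-off the abstract describes.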
Heterogeneous object manipulation on nonlinear soft surface through linear controller
Ingle, Pratik, Støy, Kasper, Faiña, Andres
Manipulation surfaces indirectly control and reposition objects by actively modifying their shape or properties rather than directly gripping objects. These surfaces, equipped with dense actuator arrays, generate dynamic deformations. However, a high-density actuator array introduces considerable complexity due to increased degrees of freedom (DOF), complicating control tasks. High DOF restrict the implementation and utilization of manipulation surfaces in real-world applications, as the maintenance and control of such systems increase exponentially with array/surface size. Learning-based control approaches may ease the control complexity, but they require extensive training samples and struggle to generalize to heterogeneous objects. In this study, we introduce a simple, precise and robust PID-based linear closed-loop feedback control strategy for heterogeneous object manipulation on MANTA-RAY (Manipulation with Adaptive Non-rigid Textile Actuation with Reduced Actuation density). Our approach employs a geometric transformation-driven PID controller, directly mapping tilt angle control outputs (1D/2D) to actuator commands to eliminate the need for extensive black-box training. We validate the proposed method through simulations and experiments on a physical system, successfully manipulating objects with diverse geometries, weights and textures, including fragile objects like eggs and apples. The outcomes demonstrate that our approach is highly generalizable and offers a practical and reliable solution for object manipulation on soft robotic manipulation surfaces, facilitating real-world implementation without prohibitive training demands.
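The core idea above — a PID loop producing tilt-angle commands, plus a fixed geometric map from tilt angles to actuator commands — can be sketched compactly. All gains, the small-angle approximation, and the four-corner actuator layout below are illustrative assumptions, not the paper's calibration.

```python
# Sketch: PID on object position error -> tilt angles -> actuator heights.

class PID:
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral, self.prev_err = 0.0, 0.0

    def step(self, err):
        self.integral += err * self.dt
        deriv = (err - self.prev_err) / self.dt
        self.prev_err = err
        return self.kp * err + self.ki * self.integral + self.kd * deriv

def tilt_to_actuators(tilt_x, tilt_y, half_span=0.5):
    """Geometric map: two tilt angles (small-angle approximation) to
    height offsets for 4 corner actuators at (+-half_span, +-half_span)."""
    return [
        sx * half_span * tilt_x + sy * half_span * tilt_y
        for sx, sy in [(-1, -1), (1, -1), (-1, 1), (1, 1)]
    ]

pid_x = PID(kp=1.0, ki=0.0, kd=0.0, dt=0.1)
tilt = pid_x.step(err=0.2)            # 0.2 m position error -> tilt command
heights = tilt_to_actuators(tilt, 0.0)
```

Because the only learned-free components are one scalar PID per axis and a linear geometric map, the controller side stays independent of the actuator count — which is the point of avoiding black-box training over a high-DOF array.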
CHEQ-ing the Box: Safe Variable Impedance Learning for Robotic Polishing
Cramer, Emma, Jäschke, Lukas, Trimpe, Sebastian
Robotic systems are increasingly employed for industrial automation, with contact-rich tasks like polishing requiring dexterity and compliant behaviour. These tasks are difficult to model, making classical control challenging. Deep reinforcement learning (RL) offers a promising solution by enabling the learning of models and control policies directly from data. However, its application to real-world problems is limited by data inefficiency and unsafe exploration. Adaptive hybrid RL methods blend classical control and RL adaptively, combining the strengths of both: structure from control and learning from RL. This has led to improvements in data efficiency and exploration safety. However, their potential for hardware applications remains underexplored, with no evaluations on physical systems to date. Such evaluations are critical to fully assess the practicality and effectiveness of these methods in real-world settings. This work presents an experimental demonstration of the hybrid RL algorithm CHEQ for robotic polishing with variable impedance, a task requiring precise force and velocity tracking. In simulation, we show that variable impedance enhances polishing performance. We compare standalone RL with adaptive hybrid RL, demonstrating that CHEQ achieves effective learning while adhering to safety constraints. On hardware, CHEQ achieves effective polishing behaviour, requiring only eight hours of training and incurring just five failures. These results highlight the potential of adaptive hybrid RL for real-world, contact-rich tasks trained directly on hardware.
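The adaptive hybrid idea above — blending a classical controller with an RL policy, weighting the RL action more as its estimated uncertainty shrinks — reduces to a small weighting rule. The uncertainty proxy and linear schedule below are illustrative stand-ins, not CHEQ's actual estimator.

```python
# Sketch: uncertainty-weighted blend of a classical controller and RL policy.

def blended_action(a_ctrl, a_rl, uncertainty, u_max=1.0):
    """Applied action = lam * a_rl + (1 - lam) * a_ctrl.

    lam grows toward 1 as uncertainty falls toward 0; clamped to [0, 1],
    so early in training (high uncertainty) the safe controller dominates.
    """
    lam = max(0.0, min(1.0, 1.0 - uncertainty / u_max))
    return lam * a_rl + (1.0 - lam) * a_ctrl, lam

# Moderate uncertainty: the blend sits between controller and RL action.
a, lam = blended_action(a_ctrl=0.0, a_rl=1.0, uncertainty=0.25)
```

The safety property in this scheme comes from the limit behavior: at maximal uncertainty the applied action is exactly the classical controller's, which is what makes on-hardware exploration tolerable.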
Coordinated Power Smoothing Control for Wind Storage Integrated System with Physics-informed Deep Reinforcement Learning
Wang, Shuyi, Zhao, Huan, Cao, Yuji, Pan, Zibin, Liu, Guolong, Liang, Gaoqi, Zhao, Junhua
However, the intermittent nature of wind power introduces inherent variability and uncertainty when integrated into power systems. As the wind power penetration level increases, the secure and reliable operation of power systems becomes a significant challenge [1]. In practice, the grid usually requires the active power fluctuation from wind farms to be confined to a specific value within a one-minute time window [2]. Therefore, wind power smoothing control (PSC) has emerged as a potential solution. Previous research has established two major categories of PSC for wind farms: regulation control of wind turbines, and indirect power control by a Battery Energy Storage System (BESS). The former approach typically involves pitch angle control [3], rotor inertia control [4], and Direct Current (DC)-link voltage control [5], which require operation away from maximum power point tracking, causing inefficiency and potential damage [6]. In contrast, with a stronger power smoothing capability, BESS-based PSC coordinates the active power from the BESS and the wind turbine [7], providing rapid response to power fluctuations with high operability and little power loss. Recognizing the benefits of such Wind Storage Integrated Systems (WSIS) [8], incentive policies have been introduced to mandate the installation of BESSs at 10% to 30% of wind farms' installed capacity. WSIS facilitates wind power storage, allocation, and smoothing, enhancing delivery stability and energy management flexibility for both the grid and the wind farm.
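The BESS-based smoothing mechanism described above can be illustrated with a toy rule: the combined output tracks a moving-average reference of recent wind power, and the battery injects or absorbs the gap. The window length and power values below are illustrative; the grid codes cited above specify fluctuation limits over a one-minute window.

```python
# Sketch: BESS covers the gap between raw wind power and a smoothed
# moving-average reference, so the combined output varies slowly.

def smooth_with_bess(wind, window):
    """Return (combined output, BESS power) per step.

    BESS power > 0 means discharging (adding power to the grid side),
    BESS power < 0 means charging (absorbing surplus wind power).
    """
    combined, bess, history = [], [], []
    for p in wind:
        history.append(p)
        ref = sum(history[-window:]) / min(len(history), window)
        combined.append(ref)
        bess.append(ref - p)
    return combined, bess

wind = [10.0, 14.0, 6.0, 12.0]          # fluctuating wind power (MW)
combined, bess = smooth_with_bess(wind, window=2)
```

A practical PSC policy must also respect the battery's state-of-charge and power-rating limits, which is precisely the coordination problem the reinforcement learning controller in this work is meant to solve.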