Equilibrium Propagation for Learning in Lagrangian Dynamical Systems
We propose a method for training dynamical systems governed by Lagrangian mechanics using Equilibrium Propagation. Our approach extends Equilibrium Propagation, initially developed for energy-based models, to dynamical trajectories by leveraging the principle of action extremization. Training is achieved by gently nudging trajectories toward desired targets and measuring the response of the variables conjugate to the parameters being trained. This method is particularly suited to systems with periodic boundary conditions or fixed initial and final states, enabling efficient parameter updates without explicit backpropagation through time. In the case of periodic boundary conditions, the approach yields the semiclassical limit of Quantum Equilibrium Propagation. Applications to systems with dissipation are also discussed.
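As a toy illustration of the nudging idea, the sketch below applies standard (non-Lagrangian) Equilibrium Propagation to a scalar quadratic energy: the parameter gradient is read off from how the energy's parameter-derivative changes between the free equilibrium and a weakly nudged one. The energy, cost, and all names here are our illustrative assumptions, not the paper's construction.

```python
def free_state(theta, x, beta=0.0, y=0.0, steps=2000, lr=0.1):
    """Relax the state s to a minimum of the total energy
    F(s) = E(theta, s) + beta * C(s),
    with energy E = 0.5*(s - theta*x)**2 and cost C = 0.5*(s - y)**2."""
    s = 0.0
    for _ in range(steps):
        grad = (s - theta * x) + beta * (s - y)
        s -= lr * grad
    return s

def ep_gradient(theta, x, y, beta=1e-3):
    """Equilibrium Propagation estimate of the loss gradient w.r.t. theta:
    compare dE/dtheta at the nudged and free equilibria, divide by beta."""
    s_free = free_state(theta, x)
    s_nudge = free_state(theta, x, beta=beta, y=y)
    dE_dtheta = lambda s: -x * (s - theta * x)  # partial of E w.r.t. theta
    return (dE_dtheta(s_nudge) - dE_dtheta(s_free)) / beta

# Compare with the analytic gradient of the loss 0.5*(theta*x - y)**2
theta, x, y = 0.5, 2.0, 3.0
g_ep = ep_gradient(theta, x, y)
g_true = x * (theta * x - y)
```

As beta shrinks, the two-phase estimate approaches the true gradient, which is the core property the paper transfers to trajectory space.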
Universal Trajectory Optimization Framework for Differential-Driven Robot Class
Zhang, Mengke, Han, Zhichao, Xu, Chao, Gao, Fei, Cao, Yanjun
Differential-driven robots are widely used in various scenarios thanks to their straightforward principle, from household service robots to disaster-response field robots. Several types of driving mechanisms exist in real-world applications, including two-wheeled, four-wheeled skid-steering, and tracked robots. These differences in driving mechanism usually require specific kinematic modeling when precise control is desired. Furthermore, the nonholonomic dynamics and possible lateral slip lead to varying degrees of difficulty in obtaining feasible, high-quality trajectories. A comprehensive trajectory optimization framework that efficiently computes trajectories for various kinds of differential-driven robots is therefore highly desirable. In this paper, we propose a universal trajectory optimization framework applicable to the differential-driven robot class, enabling the generation of high-quality trajectories within a restricted computational timeframe. We introduce a novel trajectory representation based on polynomial parameterization of motion states or their integrals, such as angular and linear velocities, that inherently matches robots' motion to the control principle of the differential-driven robot class. The trajectory optimization problem is formulated to minimize complexity while prioritizing safety and operational efficiency. We then build a full-stack autonomous planning and control system to demonstrate feasibility and robustness. We conduct extensive simulations and real-world tests in crowded environments with three kinds of differential-driven robots to validate the effectiveness of our approach. We will release our method as an open-source package.
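To illustrate why parameterizing velocities is natural for this robot class, the sketch below forward-integrates the standard unicycle model from polynomial linear- and angular-velocity profiles. It is a simplified stand-in under our own assumptions, not the paper's actual representation or optimizer.

```python
import math

def eval_poly(coeffs, t):
    """Evaluate a polynomial with coefficients [c0, c1, ...] at time t."""
    return sum(c * t**i for i, c in enumerate(coeffs))

def integrate_diff_drive(v_coeffs, w_coeffs, T, dt=1e-3):
    """Forward-integrate the unicycle kinematics
        x' = v cos(theta),  y' = v sin(theta),  theta' = w,
    where linear velocity v(t) and angular velocity w(t) are the
    polynomially parameterized motion states."""
    x = y = theta = 0.0
    t = 0.0
    while t < T:
        v = eval_poly(v_coeffs, t)
        w = eval_poly(w_coeffs, t)
        x += v * math.cos(theta) * dt
        y += v * math.sin(theta) * dt
        theta += w * dt
        t += dt
    return x, y, theta

# Constant v = 1, w = 0 should drive straight: final pose near (T, 0, 0)
x, y, th = integrate_diff_drive([1.0], [0.0], T=2.0)
```

Because the decision variables are velocity polynomials, every candidate trajectory is automatically consistent with the nonholonomic constraint, which is the appeal of this style of parameterization.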
A Motion Planning Algorithm in a Figure Eight Track
Jardon, Cristian, Sheppard, Brian, Zaveri, Veet
We design a motion planning algorithm to coordinate the movements of two robots along a figure eight track, in such a way that no collisions occur. We use a topological approach to robot motion planning that relates instabilities in motion planning algorithms to topological features of configuration spaces. The topological complexity of a configuration space is an invariant that measures the complexity of motion planning algorithms. We show that the topological complexity of our problem is 3 and construct an explicit algorithm with three continuous instructions.
Rafting Towards Consensus: Formation Control of Distributed Dynamical Systems
Tariverdi, Abbas, Torresen, Jim
In this paper, we introduce a novel adaptation of the Raft consensus algorithm for achieving emergent formation control in multi-agent systems with single-integrator dynamics. This strategy, dubbed "Rafting," enables robust cooperation between distributed nodes, thereby facilitating the achievement of desired geometric configurations. Our framework takes advantage of the Raft algorithm's inherent fault tolerance and strong consistency guarantees to extend its applicability to distributed formation control tasks. Following the introduction of a decentralized mechanism for aggregating agent states, a synchronization protocol for information exchange and consensus formation is proposed. The Raft consensus algorithm combines leader election, log replication, and state machine application to steer agents toward a common, collaborative goal. A series of detailed simulations validate the efficacy and robustness of our method under various conditions, including partial network failures and disturbances. The outcomes demonstrate the algorithm's potential and open up new possibilities in swarm robotics, autonomous transportation, and distributed computation. The implementation of the algorithms presented in this paper is available at https://github.com/abbas-tari/raft.git.
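The formation-control objective for single-integrator agents can be sketched with a plain displacement-based consensus update; the Raft coordination layer itself (leader election, log replication) is omitted, and all names and gains here are illustrative assumptions.

```python
def formation_step(positions, offsets, dt=0.05, gain=1.0):
    """One Euler step of displacement-based formation control for
    single-integrator agents: each agent moves to reduce the mismatch
    between its relative position to every other agent and the
    desired relative offset."""
    n = len(positions)
    new = []
    for i in range(n):
        ux = uy = 0.0
        for j in range(n):
            if i == j:
                continue
            # desired relative position of agent i with respect to agent j
            dx = offsets[i][0] - offsets[j][0]
            dy = offsets[i][1] - offsets[j][1]
            ux += (positions[j][0] + dx) - positions[i][0]
            uy += (positions[j][1] + dy) - positions[i][1]
        new.append((positions[i][0] + gain * dt * ux,
                    positions[i][1] + gain * dt * uy))
    return new

# Three agents converging to a right-triangle formation
pos = [(0.0, 0.0), (3.0, 1.0), (-1.0, 2.0)]
offsets = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
for _ in range(500):
    pos = formation_step(pos, offsets)
```

The update is the standard Laplacian consensus law on position errors, so the formation is reached up to a common translation; a consensus service such as Raft would supply the shared state each agent needs to compute it.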
A Robust Open-source Tendon-driven Robot Arm for Learning Control of Dynamic Motions
Guist, Simon, Schneider, Jan, Ma, Hao, Berenz, Vincent, Martus, Julian, Grüninger, Felix, Mühlebach, Michael, Fiene, Jonathan, Schölkopf, Bernhard, Büchler, Dieter
A long-standing goal of robotics research is to operate robots safely while achieving high performance, which often involves fast motions. Traditional motor-driven systems frequently struggle to balance these competing demands. Addressing this trade-off is crucial for advancing fields such as manufacturing and healthcare, where seamless collaboration between robots and humans is essential. We introduce a four degree-of-freedom (DoF) tendon-driven robot arm, powered by pneumatic artificial muscles (PAMs), to tackle this challenge. Our new design features low friction, passive compliance, and inherent impact resilience, enabling rapid, precise, high-force, and safe interactions during dynamic tasks. In addition to fostering safer human-robot collaboration, the inherent safety properties are particularly beneficial for reinforcement learning, where the robot's ability to explore dynamic motions without causing self-damage is crucial. We validate our robotic arm through various experiments, including long-term dynamic motions, impact resilience tests, and assessments of its ease of control. On a challenging dynamic table tennis task, we further demonstrate our robot's capabilities in rapid and precise movements. By showcasing our new design's potential, we aim to inspire further research on robotic systems that balance high performance and safety in diverse tasks. Our open-source hardware design, software, and a large dataset of diverse robot motions can be found at https://webdav.tuebingen.mpg.de/pamy2/.
Multi-Robot Motion Planning for Unit Discs with Revolving Areas
Agarwal, Pankaj K., Geft, Tzvika, Halperin, Dan, Taylor, Erin
We study the problem of motion planning for a collection of $n$ labeled unit disc robots in a polygonal environment. We assume that the robots have revolving areas around their start and final positions: each start and each final position is contained in a radius $2$ disc lying in the free space, not necessarily concentric with the start or final position, which is free from other start or final positions. This assumption allows a weakly-monotone motion plan, in which robots move according to an ordering as follows: during the turn of a robot $R$ in the ordering, it moves fully from its start to final position, while other robots do not leave their revolving areas. As $R$ passes through a revolving area, a robot $R'$ that is inside this area may move within the revolving area to avoid a collision. Notwithstanding the existence of a motion plan, we show that minimizing the total traveled distance in this setting, even when the motion plan is restricted to be weakly-monotone, is APX-hard, ruling out any polynomial-time $(1+\epsilon)$-approximation algorithm. On the positive side, we present the first constant-factor approximation algorithm for computing a feasible weakly-monotone motion plan. The total distance traveled by the robots is within an $O(1)$ factor of that of the optimal motion plan, which need not be weakly monotone. Our algorithm extends to an online setting in which the polygonal environment is fixed but the initial and final positions of robots are specified in an online manner. Finally, we observe that the overhead in the overall cost added while editing the paths to avoid robot-robot collisions can vary significantly depending on the ordering we choose. Finding the best ordering in this respect is known to be NP-hard, and we provide a polynomial-time $O(\log n \log \log n)$-approximation algorithm for this problem.
Intelligent Trajectory Design for RIS-NOMA aided Multi-robot Communications
Gao, Xinyu, Mu, Xidong, Yi, Wenqiang, Liu, Yuanwei
A novel reconfigurable intelligent surface (RIS)-aided multi-robot network is proposed, where multiple mobile robots are served by an access point (AP) through non-orthogonal multiple access (NOMA). The goal is to maximize the sum-rate of whole trajectories for the multi-robot system by jointly optimizing trajectories and NOMA decoding orders of robots, phase-shift coefficients of the RIS, and the power allocation of the AP, subject to predicted initial and final positions of robots and the quality of service (QoS) of each robot. To tackle this problem, an integrated machine learning (ML) scheme is proposed, which combines a long short-term memory (LSTM)-autoregressive integrated moving average (ARIMA) model and a dueling double deep Q-network (D$^{3}$QN) algorithm. For predicting robots' initial and final positions, LSTM-ARIMA overcomes the gradient-vanishing problem on non-stationary and non-linear data sequences. For jointly determining the phase-shift matrix and robots' trajectories, D$^{3}$QN is invoked to address action-value overestimation. Based on the proposed scheme, each robot holds an optimal trajectory based on the maximum sum-rate of a whole trajectory, which reveals that robots pursue long-term benefits in whole-trajectory design. Numerical results demonstrate that: 1) the LSTM-ARIMA model yields highly accurate predictions; 2) the proposed D$^{3}$QN algorithm achieves fast average convergence; and 3) RIS-NOMA networks achieve superior network performance compared to RIS-aided orthogonal counterparts.
Show me what you want: Inverse reinforcement learning to automatically design robot swarms by demonstration
Gharbi, Ilyes, Kuckling, Jonas, Ramos, David Garzón, Birattari, Mauro
Automatic design is a promising approach to generating control software for robot swarms. So far, automatic design has relied on mission-specific objective functions to specify the desired collective behavior. In this paper, we explore the possibility of specifying the desired collective behavior via demonstrations. We develop Demo-Cho, an automatic design method that combines inverse reinforcement learning with automatic modular design of control software for robot swarms. We show that, only on the basis of demonstrations and without being provided an explicit objective function, Demo-Cho successfully generated control software to perform four missions. We present results obtained in simulation and with physical robots.
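The core inverse-reinforcement-learning idea, recovering a reward from demonstrations rather than being given one, can be sketched with a perceptron-style feature-expectation matching step. This is a minimal sketch under our own assumptions; the paper's Demo-Cho pipeline and its feature definitions are considerably richer.

```python
def irl_weights(mu_expert, mu_candidates, lr=0.1, iters=200):
    """Adjust reward weights w until the expert's feature expectation
    scores at least as high as every candidate behavior's: at each step,
    move w toward the expert's features and away from the candidate
    that currently scores best under w."""
    d = len(mu_expert)
    w = [0.0] * d
    for _ in range(iters):
        best = max(mu_candidates,
                   key=lambda mu: sum(wi * mi for wi, mi in zip(w, mu)))
        for k in range(d):
            w[k] += lr * (mu_expert[k] - best[k])
    return w

# Hypothetical 2-feature behaviors: the expert gathers more of feature 0
mu_expert = [0.9, 0.1]
mu_candidates = [[0.2, 0.8], [0.5, 0.5], [0.9, 0.1]]
w = irl_weights(mu_expert, mu_candidates)
score = lambda mu: sum(wi * mi for wi, mi in zip(w, mu))
```

Once such weights are found, any policy optimizer (here, the automatic modular design step) can be run against the recovered reward instead of a hand-written objective.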
Reinforcement Learning For Constraint Satisfaction Game Agents (15-Puzzle, Minesweeper, 2048, and Sudoku)
In recent years, reinforcement learning has seen renewed interest because of deep Q-Learning, where the model is a convolutional neural network. Deep Q-Learning has shown promising results in games such as Atari and AlphaGo. Instead of learning the entire Q-table, it learns an estimate of the Q function that determines a state's policy action. We use Q-Learning and deep Q-Learning to learn control policies for four constraint satisfaction games (15-Puzzle, Minesweeper, 2048, and Sudoku). 15-Puzzle is a sliding permutation puzzle and provides a challenge in addressing its large state space. Minesweeper and Sudoku involve partially observable states and guessing. 2048 is also a sliding puzzle but allows for easier state representation (compared to 15-Puzzle) and uses interesting reward shaping to solve the game. These games offer unique insights into the potential and limits of reinforcement learning. The Q agent is trained with no rules of the game, with only the reward corresponding to each state's action. Our unique contribution is in choosing the reward structure, state representation, and formulation of the deep neural network. For low shuffle, 15-Puzzle achieves a 100% win rate; medium and high shuffle achieve about 43% and 22% win rates, respectively. On a standard 16x16 Minesweeper board, both low- and high-density boards achieve close to a 45% win rate, whereas medium-density boards have a low win rate of 15%. For 2048, the 1024 tile was reached with significant ease (100%), with win rates for 2048, 4096, 8192, and 16384 of 40%, 0.05%, 0.01%, and 0.004%, respectively. The easy Sudoku games had a win rate of 7%, while medium and hard games had 2.1% and 1.2% win rates, respectively. This paper explores the environment complexity and behavior of a subset of constraint games using reward structures, which can get us closer to understanding how humans learn.
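The tabular Q-learning update underlying this approach can be sketched on a toy chain environment, a minimal stand-in for the puzzle state spaces; all names, rewards, and hyperparameters here are illustrative assumptions.

```python
import random

def train_q_learning(n_states=6, episodes=500, alpha=0.5, gamma=0.9, eps=0.2):
    """Tabular Q-learning on a toy chain: states 0..n-1, actions
    {0: left, 1: right}, reward 1 for reaching the last state.
    Uses epsilon-greedy exploration and the standard TD update
    Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    random.seed(0)
    Q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        s = 0
        while s != n_states - 1:
            if random.random() < eps:
                a = random.randrange(2)            # explore
            else:
                a = 0 if Q[s][0] > Q[s][1] else 1  # exploit
            s2 = max(0, s - 1) if a == 0 else min(n_states - 1, s + 1)
            r = 1.0 if s2 == n_states - 1 else 0.0
            Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
            s = s2
    return Q

Q = train_q_learning()
# Greedy policy: expect "move right" in every non-terminal state
policy = [0 if q[0] > q[1] else 1 for q in Q]
```

Deep Q-learning replaces the table `Q` with a neural network that maps a state representation to action values; the update rule and exploration scheme are otherwise the same.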
Model-Based Quality-Diversity Search for Efficient Robot Learning
Keller, Leon, Tanneberg, Daniel, Stark, Svenja, Peters, Jan
Despite recent progress in robot learning, it remains a challenge to program a robot to deal with open-ended object manipulation tasks. One approach recently used to autonomously generate a repertoire of diverse skills is a novelty-based Quality-Diversity (QD) algorithm. However, like most evolutionary algorithms, QD suffers from sample inefficiency, and it is therefore challenging to apply in real-world scenarios. This paper tackles this problem by integrating a neural network that predicts the behavior of the perturbed parameters into a novelty-based QD algorithm. In the proposed Model-based Quality-Diversity search (M-QD), the network is trained concurrently with the repertoire and is used to avoid executing unpromising actions in the novelty search process. Furthermore, it is used to adapt the skills of the final repertoire in order to generalize them to different scenarios. Our experiments show that enhancing a QD algorithm with such a forward model improves the sample efficiency and performance of the evolutionary process and the skill adaptation.
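The screening role of the forward model can be sketched as follows: candidate parameters are first checked with a cheap surrogate of the behavior, and only candidates the surrogate predicts to be novel are evaluated on the (expensive) true system. This is an illustrative simplification under our own assumptions; the paper's M-QD trains the model online alongside the repertoire.

```python
import math
import random

def novelty(b, archive, k=3):
    """Novelty of behavior b: mean distance to its k nearest
    neighbors in the archive (infinite for an empty archive)."""
    if not archive:
        return float("inf")
    d = sorted(abs(b - a) for a in archive)
    return sum(d[:k]) / min(k, len(d))

def model_based_qd(true_behavior, surrogate, iters=200, threshold=0.1):
    """Skeleton of model-based novelty search: skip candidates the
    surrogate predicts to be non-novel, and count how many expensive
    true evaluations were actually spent."""
    random.seed(1)
    archive, evaluations = [], 0
    for _ in range(iters):
        theta = random.uniform(-2.0, 2.0)
        if novelty(surrogate(theta), archive) < threshold:
            continue                      # predicted unpromising: skip
        b = true_behavior(theta)          # expensive real evaluation
        evaluations += 1
        if novelty(b, archive) >= threshold:
            archive.append(b)
    return archive, evaluations

behavior = lambda t: math.tanh(t)          # stand-in for a robot rollout
surrogate = lambda t: math.tanh(t) + 0.01  # slightly-off learned model
archive, evals = model_based_qd(behavior, surrogate)
```

The payoff is that the number of true evaluations stays well below the number of candidates considered, which is exactly the sample-efficiency gain the paper reports for the learned forward model.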