Vyas, Shubham
Reinforcement Learning for Robust Athletic Intelligence: Lessons from the 2nd 'AI Olympics with RealAIGym' Competition
Wiebe, Felix, Turcato, Niccolò, Libera, Alberto Dalla, Choe, Jean Seong Bjorn, Choi, Bumkyu, Faust, Tim Lukas, Maraqten, Habib, Aghadavoodi, Erfan, Cali, Marco, Sinigaglia, Alberto, Giacomuzzo, Giulio, Romeres, Diego, Kim, Jong-kook, Susto, Gian Antonio, Vyas, Shubham, Mronga, Dennis, Belousov, Boris, Peters, Jan, Kirchner, Frank, Kumar, Shivesh
In robotics, many different approaches, ranging from classical planning through optimal control to reinforcement learning (RL), are developed and borrowed from other fields to achieve reliable control in diverse tasks. To get a clear understanding of their individual strengths and weaknesses and their applicability in real-world robotic scenarios, it is important to benchmark and compare their performance not only in simulation but also on real hardware. The '2nd AI Olympics with RealAIGym' competition was held at the IROS 2024 conference to contribute to this cause and to evaluate different controllers according to their ability to solve a dynamic control problem on an underactuated double pendulum system with chaotic dynamics. This paper describes the four different RL methods submitted by the participating teams, presents their performance in the swing-up task on a real double pendulum, measured against various criteria, and discusses their transferability from simulation to real hardware and their robustness to external disturbances.
Benchmarking Different QP Formulations and Solvers for Dynamic Quadrupedal Walking
Stark, Franek, Middelberg, Jakob, Mronga, Dennis, Vyas, Shubham, Kirchner, Frank
Quadratic Programs (QPs) are widely used in the control of walking robots, especially in Model Predictive Control (MPC) and Whole-Body Control (WBC). In both cases, the controller design requires the formulation of a QP and the selection of a suitable QP solver, both of which require considerable time and expertise. While computational performance benchmarks exist for QP solvers, studies comparing optimal combinations of computational hardware (HW), QP formulation, and solver performance are lacking. In this work, we compare dense and sparse QP formulations and multiple solving methods on different HW architectures, focusing on their computational efficiency in dynamic walking of four-legged robots using MPC. We introduce the Solve Frequency per Watt (SFPW) as a performance measure to enable a cross-hardware comparison of the efficiency of QP solvers. We also benchmark different QP solvers for WBC, which we use for trajectory stabilization in quadrupedal walking. As a result, this paper provides recommendations for the selection of QP formulations and solvers for different HW architectures in walking robots and indicates which problems warrant greater technical effort in this domain in the future.
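The abstract names the SFPW measure but does not spell out its formula. The sketch below assumes the most literal reading: solves per second divided by mean power draw in watts. The function name, the argument names, and all numbers are hypothetical and are not taken from the paper.

```python
def solve_frequency_per_watt(solve_times_s, mean_power_w):
    """Assumed SFPW: (1 / mean solve time) / mean power, in Hz per watt.

    solve_times_s: measured wall-clock times of individual QP solves, in seconds.
    mean_power_w:  mean power draw of the hardware during the benchmark, in watts.
    """
    if not solve_times_s:
        raise ValueError("need at least one solve time")
    if mean_power_w <= 0:
        raise ValueError("mean power must be positive")
    mean_solve_time = sum(solve_times_s) / len(solve_times_s)
    if mean_solve_time <= 0:
        raise ValueError("solve times must be positive")
    solve_frequency_hz = 1.0 / mean_solve_time
    return solve_frequency_hz / mean_power_w


# Hypothetical measurements: a fast but power-hungry desktop CPU
# versus a slower, low-power embedded board.
desktop = solve_frequency_per_watt([0.002, 0.002, 0.002], mean_power_w=50.0)
embedded = solve_frequency_per_watt([0.010, 0.010], mean_power_w=5.0)
print(f"desktop:  {desktop:.1f} Hz/W")
print(f"embedded: {embedded:.1f} Hz/W")
```

Under this assumed definition, the slower board can still score higher because it draws far less power, which is the kind of cross-hardware trade-off such a measure is meant to expose.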
AUV trajectory optimization with hydrodynamic forces for Icy Moon Exploration
Rust, Lukas, Vyas, Shubham, Wehbe, Bilal
To explore oceans on ice-covered moons in the solar system, energy-efficient Autonomous Underwater Vehicles (AUVs) with long ranges must cover enough distance to record and collect sufficient data. These vehicles are usually underactuated and hard to control when performing tasks such as vertical docking or the inspection of vertical walls. This paper introduces a control strategy for the AUV DeepLeng to navigate in the ice-covered ocean of Jupiter's moon Europa and presents simulation results, followed by a discussion of what is further needed for robust control during the mission.
Linear Model Predictive Control for a planar free-floating platform: A comparison of binary input constraint formulations
Stark, Franek, Vyas, Shubham, Schildbach, Georg, Kirchner, Frank
This work develops a first Model Predictive Controller for the European Space Agency's 3-DoF free-floating platform. The main challenge of the platform is its on/off thrusters, which cannot be actuated continuously and which are subject to certain timing constraints. This work compares penalty-term, Linear Complementarity Constraint, and classical Mixed Integer formulations in order to develop a controller that natively handles binary inputs. Furthermore, linear constraints are proposed which enforce the timing constraints. Only the Mixed Integer formulation turns out to work sufficiently well. Hence, this work develops a new Mixed Integer MPC on the decoupled model of the platform. Feasibility analysis and simulation results show that, for a short enough prediction horizon, this controller can (sub)optimally stabilize and control the system in real time while respecting the constraints.
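The abstract does not state the timing constraints themselves. As an illustration only, the sketch below assumes a minimum on-time requirement and uses a linear formulation that is common for binary inputs: for each step k, u[j] >= u[k] - u[k-1] for all j in k..k+L-1, where L is the minimum number of steps a thruster must stay on. The function name, `min_on_steps`, and `u_prev` are hypothetical; the checker evaluates these inequalities on a given binary sequence rather than solving the MPC problem.

```python
def satisfies_min_on_time(u, min_on_steps, u_prev=0):
    """Check the assumed linear minimum on-time inequalities on a binary sequence.

    u:            list of 0/1 thruster commands over the prediction horizon.
    min_on_steps: assumed minimum number of consecutive steps a thruster stays on.
    u_prev:       command applied just before the horizon starts.
    """
    n = len(u)
    for k in range(n):
        before = u_prev if k == 0 else u[k - 1]
        switch_on = u[k] - before  # equals 1 only on a rising edge
        # Enforce u[j] >= switch_on for the steps inside the horizon.
        for j in range(k, min(k + min_on_steps, n)):
            if u[j] < switch_on:
                return False
    return True


# Stays on for three steps after switching on: feasible.
print(satisfies_min_on_time([0, 1, 1, 1, 0], min_on_steps=3))
# Switches off after a single step: violates the assumed constraint.
print(satisfies_min_on_time([0, 1, 0, 0, 0], min_on_steps=3))
```

Because each inequality is linear in the binary variables, constraints of this shape can be added to a mixed-integer program without introducing extra nonlinearity. An on-period that begins near the end of the horizon is only checked for the steps that fall inside it.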
AcroMonk: A Minimalist Underactuated Brachiating Robot
Javadi, Mahdi, Harnack, Daniel, Stocco, Paula, Kumar, Shivesh, Vyas, Shubham, Pizzutilo, Daniel, Kirchner, Frank
Brachiation is a dynamic, coordinated swinging maneuver of body and arms used by monkeys and apes to move between branches. As a unique underactuated mode of locomotion, it is interesting to study from a robotics perspective since it can broaden the deployment scenarios for humanoids and animaloids. While several brachiating robots of varying complexity have been proposed in the past, this paper presents the simplest possible prototype of a brachiation robot, using only a single actuator and unactuated grippers. The novel passive gripper design allows it to snap onto and release from monkey bars while guaranteeing well-defined start and end poses of the swing. The brachiation behavior is realized in three different ways: using trajectory optimization via direct collocation with stabilization by either a model-based time-varying linear quadratic regulator (TVLQR) or model-free proportional-derivative (PD) control, and using a reinforcement learning (RL) based control policy. The three control schemes are compared in terms of robustness to disturbances, mass uncertainty, and energy consumption. The system design and controllers have been open-sourced. Due to its minimal and open design, the system can serve as a canonical underactuated platform for education and research.
Finding and Following Optimal Trajectories for an Overactuated Floating Robotic Platform
Bredenbeck, Anton, Vyas, Shubham, Suter, Willem, Zwick, Martin, Borrmann, Dorit, Olivares-Mendez, Miguel, Nüchter, Andreas
The recent increase in yearly spacecraft launches and the high number of planned launches have raised questions about maintaining accessibility to space for all interested parties. A key to sustaining the future of spaceflight is the ability to service malfunctioning spacecraft and to actively remove dysfunctional ones from orbit. Robotic platforms that autonomously perform these tasks are a topic of ongoing research and must undergo thorough testing before launch. For representative system-level testing, the European Space Agency (ESA) uses, among other things, the Orbital Robotics and GNC Lab (ORGL), a flat-floor facility where air-bearing-based platforms exhibit free-floating behavior in three Degrees of Freedom (DoF). This work introduces a representative simulation of a free-floating platform in the testing environment and a software framework for controller development. Finally, this work proposes a controller within that framework for finding and following optimal trajectories between arbitrary states, which is evaluated in simulation and on the real platform.