 Planning & Scheduling


Interpreting the Learned Model in MuZero Planning

arXiv.org Artificial Intelligence

MuZero has achieved superhuman performance in various games by using a dynamics network to predict environment dynamics for planning, without relying on simulators. However, the latent states learned by the dynamics network make its planning process opaque. This paper aims to demystify MuZero's model by interpreting the learned latent states. We incorporate observation reconstruction and state consistency into MuZero training and conduct an in-depth analysis to evaluate latent states across two board games: 9x9 Go and Outer-Open Gomoku, and three Atari games: Breakout, Ms. Pacman, and Pong. Our findings reveal that while the dynamics network becomes less accurate over longer simulations, MuZero still performs effectively by using planning to correct errors. Our experiments also show that the dynamics network learns better latent states in board games than in Atari games. These insights contribute to a better understanding of MuZero and offer directions for future research to improve the playing performance, robustness, and interpretability of the MuZero algorithm.


Seeing Through Pixel Motion: Learning Obstacle Avoidance from Optical Flow with One Camera

arXiv.org Artificial Intelligence

Optical flow captures the motion of pixels in an image sequence over time, providing information about movement, depth, and environmental structure. Flying insects utilize this information to navigate and avoid obstacles, allowing them to execute highly agile maneuvers even in complex environments. Despite its potential, autonomous flying robots have yet to fully leverage this motion information to achieve comparable levels of agility and robustness. Challenges of control from optical flow include extracting accurate optical flow at high speeds, handling noisy estimation, and ensuring robust performance in complex environments. To address these challenges, we propose a novel end-to-end system for quadrotor obstacle avoidance using monocular optical flow. We develop an efficient differentiable simulator coupled with a simplified quadrotor model, allowing our policy to be trained directly through first-order gradient optimization. Additionally, we introduce a central flow attention mechanism and an action-guided active sensing strategy that enhance the policy's focus on task-relevant optical flow observations to enable more responsive decision-making during flight. Our system is validated both in simulation and the real world using an FPV racing drone. Despite being trained in a simple environment in simulation, our system demonstrates agile and robust flight in various unknown, cluttered environments in the real world at speeds of up to 6 m/s.


Planning for quasi-static manipulation tasks via an intrinsic haptic metric

arXiv.org Artificial Intelligence

Contact-rich manipulation often requires strategic interactions with objects, such as pushing to accomplish specific tasks. We propose a novel scenario where a robot inserts a book into a crowded shelf by pushing aside neighboring books to create space before slotting the new book into place. Classical planning algorithms fail in this context due to limited space and their tendency to avoid contact. Additionally, they do not handle indirectly manipulable objects or consider force interactions. Our key contributions are: i) re-framing quasi-static manipulation as a planning problem on an implicit manifold derived from equilibrium conditions; ii) utilizing an intrinsic haptic metric instead of ad-hoc cost functions; and iii) proposing an adaptive algorithm that simultaneously updates robot states, object positions, contact points, and haptic distances. We evaluate our method on this crowded bookshelf insertion task, but it is a general formulation for rigid-body manipulation tasks. We propose proxies to capture contact points and forces, with superellipses representing objects. This simplified model guarantees differentiability. Our framework autonomously discovers strategic wedging-in policies, while our simplified contact model achieves behavior similar to real-world scenarios. We also vary the stiffness and initial positions to analyze our framework comprehensively. The video can be found at https://youtu.be/eab8umZ3AQ0.


How to Drawjectory? -- Trajectory Planning using Programming by Demonstration

arXiv.org Artificial Intelligence

A flight trajectory defines exactly how a quadrocopter moves in three-dimensional space from one position to another. Automatic flight trajectory planning faces challenges such as high computational effort and a lack of precision. Hence, when low computational effort or precise control is required, programming the flight trajectory manually might be preferable. However, this requires in-depth knowledge of how to accurately plan flight trajectories in three-dimensional space. We propose planning quadrocopter flight trajectories manually using the Programming by Demonstration (PbD) approach -- simply drawing the trajectory in three-dimensional space by hand. This simplifies the planning process and reduces the level of in-depth knowledge required. We implemented the approach in the context of the Quadcopter Lab at Ulm University. To evaluate our approach, we compare the precision and accuracy of trajectories drawn by a user, as well as the time required, with those of trajectories manually programmed using a domain-specific language. The evaluation shows that the Drawjectory workflow is, on average, 78.7 seconds faster without a significant loss of precision, as shown by an average deviation of 6.67 cm.


Diversity Progress for Goal Selection in Discriminability-Motivated RL

arXiv.org Artificial Intelligence

Non-uniform goal selection has the potential to improve the reinforcement learning (RL) of skills over uniform-random selection. In this paper, we introduce a method for learning a goal-selection policy in intrinsically-motivated goal-conditioned RL: "Diversity Progress" (DP). The learner forms a curriculum based on observed improvement in discriminability over its set of goals. Our proposed method is applicable to the class of discriminability-motivated agents, where the intrinsic reward is computed as a function of the agent's certainty of following the true goal being pursued. This reward can motivate the agent to learn a set of diverse skills without extrinsic rewards. We demonstrate empirically that a DP-motivated agent can learn a set of distinguishable skills faster than previous approaches, and do so without suffering from a collapse of the goal distribution -- a known issue with some prior approaches. We end with plans to take this proof-of-concept forward.
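As a rough illustration of a discriminability-based intrinsic reward, the sketch below uses one common form from this family of methods (a DIAYN-style reward): the log-probability a discriminator assigns to the true goal, minus the log-prior of that goal. Whether the paper uses exactly this form is an assumption, and the function name and argument names are hypothetical. The Diversity Progress goal-selection rule itself is not sketched here, since the abstract does not specify it.

```python
import math


def discriminability_reward(q_goal_given_state, p_goal):
    """DIAYN-style reward (assumed form): log q(g | s) - log p(g).

    q_goal_given_state: discriminator's probability for the true goal.
    p_goal: prior probability with which that goal was selected.
    """
    if not (0.0 < q_goal_given_state <= 1.0 and 0.0 < p_goal <= 1.0):
        raise ValueError("probabilities must lie in (0, 1]")
    return math.log(q_goal_given_state) - math.log(p_goal)


# Four goals chosen uniformly at random, so p(g) = 0.25.
prior = 0.25
chance_level = discriminability_reward(0.25, prior)   # discriminator at chance
confident = discriminability_reward(0.90, prior)      # goal is easy to identify
confused = discriminability_reward(0.10, prior)       # goal looks like another one
print(chance_level, confident, confused)
```

Under this form the reward is zero when the discriminator does no better than the prior, positive when the agent's behaviour makes its goal easier to identify, and negative when it makes the goal harder to identify.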


Britain's green energy pledge 'credible' if planning fixed, says system operator

The Guardian > Energy

A plan to create a clean electricity system by 2030 promised by Labour before the election is "immensely challenging" but still "credible" if ministers take urgent action to fix Britain's sluggish planning system, the energy system operator has said. Britain could become a net exporter of green electricity by the end of the decade at no extra cost to the energy system under the plans, and bills may even fall if ministers make the right policy changes, according to the operator. The newly formed National Energy System Operator (Neso) put forward the conclusions as part of its official advice to new ministers on how to reach Labour's election pledge to decarbonise the power system by 2030. Fintan Slye, the chief executive of Neso, said: "There's no doubt that the challenges ahead on the journey to delivering clean power are great. However, if the scale of those challenges is matched with the bold, sustained actions that are outlined in this report, the benefits delivered could be even greater."


Data-Driven Sampling Based Stochastic MPC for Skid-Steer Mobile Robot Navigation

arXiv.org Artificial Intelligence

Traditional approaches to motion modeling for skid-steer robots struggle with capturing nonlinear tire-terrain dynamics, especially during high-speed maneuvers. In this paper, we tackle such nonlinearities by enhancing a dynamic unicycle model with Gaussian Process (GP) regression outputs. This enables us to develop an adaptive, uncertainty-informed navigation formulation. We solve the resultant stochastic optimal control problem using a chance-constrained Model Predictive Path Integral (MPPI) control method. This approach formulates both obstacle avoidance and path-following as chance constraints, accounting for residual uncertainties from the GP to ensure safety and reliability in control. Leveraging GPU acceleration, we efficiently manage the non-convex nature of the problem, ensuring real-time performance. Our approach unifies path-following and obstacle avoidance across different terrains, unlike prior works which typically focus on one or the other. We compare our GP-MPPI method against unicycle and data-driven kinematic models within the MPPI framework. In simulations, our approach shows superior tracking accuracy and obstacle avoidance. We further validate our approach through hardware experiments on a skid-steer robot platform, demonstrating its effectiveness in high-speed navigation. The GPU implementation of the proposed method and supplementary video footage are available at https://stochasticmppi.github.io.
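For readers unfamiliar with MPPI, the sketch below shows the generic sampling-based update at its core: sampled control perturbations are weighted by the exponentiated negative cost of their rollouts, and the nominal control sequence is shifted by the weighted mean. This is the plain MPPI update only, not the chance-constrained, GP-augmented variant described above. The function names, the scalar controls, and the toy costs are all illustrative assumptions.

```python
import math


def mppi_weights(costs, lam=1.0):
    """Importance weights proportional to exp(-cost / lam), normalized to sum to 1."""
    best = min(costs)  # subtracting the minimum improves numerical stability
    unnormalized = [math.exp(-(c - best) / lam) for c in costs]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def mppi_update(nominal, perturbations, costs, lam=1.0):
    """Shift each nominal control by the cost-weighted mean of sampled perturbations."""
    weights = mppi_weights(costs, lam)
    updated = []
    for t, u in enumerate(nominal):
        shift = sum(w * eps[t] for w, eps in zip(weights, perturbations))
        updated.append(u + shift)
    return updated


# Toy example: three sampled perturbation sequences over a 3-step horizon.
nominal = [0.0, 0.0, 0.0]
perturbations = [
    [0.5, 0.5, 0.5],     # lowest-cost rollout
    [-0.5, -0.5, -0.5],  # highest-cost rollout
    [0.1, 0.0, -0.1],
]
costs = [1.0, 4.0, 2.0]  # hypothetical rollout costs, one per sample
new_controls = mppi_update(nominal, perturbations, costs, lam=1.0)
print(new_controls)
```

The temperature `lam` controls how sharply the update favours the lowest-cost samples: small values approach picking the single best rollout, while large values approach a plain average.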


Digital Twin for Autonomous Surface Vessels: Enabler for Safe Maritime Navigation

arXiv.org Artificial Intelligence

Autonomous surface vessels (ASVs) are becoming increasingly significant in enhancing the safety and sustainability of maritime operations. To ensure the reliability of modern control algorithms utilized in these vessels, digital twins (DTs) provide a robust framework for conducting safe and effective simulations within a virtual environment. Digital twins are generally classified on a scale from 0 to 5, with each level representing a progression in complexity and functionality: Level 0 (Standalone) employs offline modeling techniques; Level 1 (Descriptive) integrates sensors and online modeling to enhance situational awareness; Level 2 (Diagnostic) focuses on condition monitoring and cybersecurity; Level 3 (Predictive) incorporates predictive analytics; Level 4 (Prescriptive) embeds decision-support systems; and Level 5 (Autonomous) enables advanced functionalities such as collision avoidance and path following. These digital representations not only provide insights into the vessel's current state and operational efficiency but also predict future scenarios and assess life endurance. By continuously updating with real-time sensor data, the digital twin effectively corrects modeling errors and enhances decision-making processes. Since DTs are key enablers for complex autonomous systems, this paper introduces a comprehensive methodology for establishing a digital twin framework specifically tailored for ASVs. Through a detailed literature survey, we explore existing state-of-the-art enablers across the defined levels, offering valuable recommendations for future research and development in this rapidly evolving field.


Real-Time Safe Bipedal Robot Navigation using Linear Discrete Control Barrier Functions

arXiv.org Artificial Intelligence

Safe navigation in real time is an essential task for humanoid robots in real-world deployment. Since humanoid robots are inherently underactuated due to unilateral ground contacts, a path is considered safe if it is obstacle-free and respects the robot's physical limitations and underlying dynamics. Existing approaches often decouple path planning from gait control due to the significant computational challenge caused by the full-order robot dynamics. In this work, we develop a unified, safe path and gait planning framework that can be evaluated online in real time, allowing the robot to navigate clustered environments while sustaining stable locomotion. Our approach uses the popular Linear Inverted Pendulum (LIP) model as a template model to represent walking dynamics. It incorporates heading angles in the model to properly evaluate the kinematic constraints essential for physically feasible gaits. In addition, we leverage discrete control barrier functions (DCBF) for obstacle avoidance, ensuring that the subsequent foot placement provides a safe navigation path within clustered environments. To guarantee real-time computation, we use a novel approximation of the DCBF to produce linear DCBF (LDCBF) constraints. We validate the proposed approach in simulation using a Digit robot in randomly generated environments. The results demonstrate that our approach can generate safe gaits for a non-trivial humanoid robot to navigate environments with randomly generated obstacles in real time.
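To make the barrier-function idea concrete, the sketch below checks the standard discrete-time control barrier function condition, h(x_{k+1}) - h(x_k) >= -gamma * h(x_k) with 0 < gamma <= 1, for a single circular obstacle. It uses a quadratic distance barrier and is not the paper's linearized LDCBF. The function names, the circular obstacle, and the numeric values are assumptions chosen for illustration.

```python
def barrier(point, center, radius):
    """h(x) = squared distance to the obstacle center minus radius squared.

    h(x) >= 0 means the point lies outside (or on) the obstacle boundary.
    """
    dx = point[0] - center[0]
    dy = point[1] - center[1]
    return dx * dx + dy * dy - radius * radius


def dcbf_satisfied(current, nxt, center, radius, gamma=0.5):
    """Check the discrete-time CBF condition for one step from current to nxt."""
    if not 0.0 < gamma <= 1.0:
        raise ValueError("gamma must lie in (0, 1]")
    h_now = barrier(current, center, radius)
    h_next = barrier(nxt, center, radius)
    # The barrier value may decay by at most a fraction gamma per step.
    return h_next - h_now >= -gamma * h_now


obstacle_center = (3.0, 0.0)
obstacle_radius = 1.0
cautious_step = dcbf_satisfied((0.0, 0.0), (0.5, 0.0), obstacle_center, obstacle_radius)
aggressive_step = dcbf_satisfied((0.0, 0.0), (1.5, 0.0), obstacle_center, obstacle_radius)
print(cautious_step, aggressive_step)
```

Here the short step toward the obstacle passes the check, while the longer step is rejected even though it does not yet enter the obstacle, because it shrinks the safety margin too quickly for the chosen `gamma`.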


Monocular Event-Based Vision for Obstacle Avoidance with a Quadrotor

arXiv.org Artificial Intelligence

We present the first static-obstacle avoidance method for quadrotors using just an onboard, monocular event camera. Quadrotors are capable of fast and agile flight in cluttered environments when piloted manually, but vision-based autonomous flight in unknown environments is difficult in part due to the sensor limitations of traditional onboard cameras. Event cameras, however, promise nearly zero motion blur and high dynamic range, but produce a very large volume of events under significant ego-motion and further lack a continuous-time sensor model in simulation, making direct sim-to-real transfer not possible. By leveraging depth prediction as a pretext task in our learning framework, we can pre-train a reactive obstacle avoidance events-to-control policy with approximated, simulated events and then fine-tune the perception component with limited events-and-depth real-world data to achieve obstacle avoidance in indoor and outdoor settings. We demonstrate this across two quadrotor-event camera platforms in multiple settings and find, contrary to traditional vision-based works, that low speeds (1m/s) make the task harder and more prone to collisions, while high speeds (5m/s) result in better event-based depth estimation and avoidance. We also find that success rates in outdoor scenes can be significantly higher than in certain indoor scenes.