Magnusson, Martin
Learning Extrinsic Dexterity with Parameterized Manipulation Primitives
Yang, Shih-Min, Magnusson, Martin, Stork, Johannes A., Stoyanov, Todor
Many practically relevant robot grasping problems feature a target object for which all grasps are occluded, e.g., by the environment. Single-shot grasp planning invariably fails in such scenarios. Instead, it is necessary to first manipulate the object into a configuration that affords a grasp. We solve this problem by learning a sequence of actions that utilize the environment to change the object's pose. Concretely, we employ hierarchical reinforcement learning to combine a sequence of learned parameterized manipulation primitives. By learning the low-level manipulation policies, our approach can control the object's state through exploiting interactions between the object, the gripper, and the environment. Designing such a complex behavior analytically would be infeasible under uncontrolled conditions, as an analytic approach requires accurate physical modeling of the interaction and contact dynamics. In contrast, we learn a hierarchical policy model that operates directly on depth perception data, without the need for object detection, pose estimation, or manual design of controllers. We evaluate our approach on picking box-shaped objects of various weight, shape, and friction properties from a constrained table-top workspace. Our method transfers to a real robot and is able to successfully complete the object picking task in 98\% of experimental trials.
Do we need scan-matching in radar odometry?
Kubelka, Vladimír, Fritz, Emil, Magnusson, Martin
There is a current increase in the development of "4D" Doppler-capable radar and lidar range sensors that produce 3D point clouds where all points also have information about the radial velocity relative to the sensor. 4D radars in particular are interesting for object perception and navigation in low-visibility conditions (dust, smoke) where lidars and cameras typically fail. With the advent of high-resolution Doppler-capable radars comes the possibility of estimating odometry from single point clouds, foregoing the need for scan registration which is error-prone in feature-sparse field environments. We compare several odometry estimation methods, from direct integration of Doppler/IMU data and Kalman filter sensor fusion to 3D scan-to-scan and scan-to-map registration, on three datasets with data from two recent 4D radars and two IMUs. Surprisingly, our results show that the odometry from Doppler and IMU data alone give similar or better results than 3D point cloud registration. In our experiments, the average position error can be as low as 0.3% over 1.8 and 4.5km trajectories. That allows accurate estimation of 6DOF ego-motion over long distances also in feature-sparse mine environments. These results are useful not least for applications of navigation with resource-constrained robot platforms in feature-sparse and low-visibility conditions such as mining, construction, and search & rescue operations.
Doppler-only Single-scan 3D Vehicle Odometry
Galeote-Luque, Andres, Kubelka, Vladimír, Magnusson, Martin, Ruiz-Sarmiento, Jose-Raul, Gonzalez-Jimenez, Javier
We present a novel 3D odometry method that recovers the full motion of a vehicle only from a Doppler-capable range sensor. It leverages the radial velocities measured from the scene, estimating the sensor's velocity from a single scan. The vehicle's 3D motion, defined by its linear and angular velocities, is calculated taking into consideration its kinematic model which provides a constraint between the velocity measured at the sensor frame and the vehicle frame. Experiments carried out prove the viability of our single-sensor method compared to mounting an additional IMU. Our method provides the translation of the sensor, which cannot be reliably determined from an IMU, as well as its rotation. Its short-term accuracy and fast operation (~5ms) make it a proper candidate to supply the initialization to more complex localization algorithms or mapping pipelines. Not only does it reduce the error of the mapper, but it does so at a comparable level of accuracy as an IMU would. All without the need to mount and calibrate an extra sensor on the vehicle.
RadaRays: Real-time Simulation of Rotating FMCW Radar for Mobile Robotics via Hardware-accelerated Ray Tracing
Mock, Alexander, Magnusson, Martin, Hertzberg, Joachim
RadaRays allows for the accurate modeling and simulation of rotating FMCW radar sensors in complex environments, including the simulation of reflection, refraction, and scattering of radar waves. Our software is able to handle large numbers of objects and materials, making it suitable for use in a variety of mobile robotics applications. We demonstrate the effectiveness of RadaRays through a series of experiments and show that it can more accurately reproduce the behavior of FMCW radar sensors in a variety of environments, compared to the ray casting-based lidar-like simulations that are commonly used in simulators for autonomous driving such as CARLA. Our experiments additionally serve as valuable reference point for researchers to evaluate their own radar simulations. By using RadaRays, developers can significantly reduce the time and cost associated with prototyping and testing FMCW radar-based algorithms. We also provide a Gazebo plugin that makes our work accessible to the mobile robotics community.
CLiFF-LHMP: Using Spatial Dynamics Patterns for Long-Term Human Motion Prediction
Zhu, Yufei, Rudenko, Andrey, Kucner, Tomasz P., Palmieri, Luigi, Arras, Kai O., Lilienthal, Achim J., Magnusson, Martin
Human motion prediction is important for mobile service robots and intelligent vehicles to operate safely and smoothly around people. The more accurate predictions are, particularly over extended periods of time, the better a system can, e.g., assess collision risks and plan ahead. In this paper, we propose to exploit maps of dynamics (MoDs, a class of general representations of place-dependent spatial motion patterns, learned from prior observations) for long-term human motion prediction (LHMP). We present a new MoD-informed human motion prediction approach, named CLiFF-LHMP, which is data efficient, explainable, and insensitive to errors from an upstream tracking system. Our approach uses CLiFF-map, a specific MoD trained with human motion data recorded in the same environment. We bias a constant velocity prediction with samples from the CLiFF-map to generate multi-modal trajectory predictions. In two public datasets we show that this algorithm outperforms the state of the art for predictions over very extended periods of time, achieving 45% more accurate prediction performance at 50s compared to the baseline.
Advantages of Multimodal versus Verbal-Only Robot-to-Human Communication with an Anthropomorphic Robotic Mock Driver
Schreiter, Tim, Morillo-Mendez, Lucas, Chadalavada, Ravi T., Rudenko, Andrey, Billing, Erik, Magnusson, Martin, Arras, Kai O., Lilienthal, Achim J.
Robots are increasingly used in shared environments with humans, making effective communication a necessity for successful human-robot interaction. In our work, we study a crucial component: active communication of robot intent. Here, we present an anthropomorphic solution where a humanoid robot communicates the intent of its host robot acting as an "Anthropomorphic Robotic Mock Driver" (ARMoD). We evaluate this approach in two experiments in which participants work alongside a mobile robot on various tasks, while the ARMoD communicates a need for human attention, when required, or gives instructions to collaborate on a joint task. The experiments feature two interaction styles of the ARMoD: a verbal-only mode using only speech and a multimodal mode, additionally including robotic gaze and pointing gestures to support communication and register intent in space. Our results show that the multimodal interaction style, including head movements and eye gaze as well as pointing gestures, leads to more natural fixation behavior. Participants naturally identified and fixated longer on the areas relevant for intent communication, and reacted faster to instructions in collaborative tasks. Our research further indicates that the ARMoD intent communication improves engagement and social interaction with mobile robots in workplace settings.
A Data-Efficient Approach for Long-Term Human Motion Prediction Using Maps of Dynamics
Zhu, Yufei, Rudenko, Andrey, Kucner, Tomasz P., Lilienthal, Achim J., Magnusson, Martin
Human motion prediction is essential for the safe and smooth operation of mobile service robots and intelligent vehicles around people. Commonly used neural network-based approaches often require large amounts of complete trajectories to represent motion dynamics in complex semantically-rich spaces. This requirement may complicate deployment of physical systems in new environments, especially when the data is being collected online from onboard sensors. In this paper we explore a data-efficient alternative using maps of dynamics (MoD) to represent place-dependent multi-modal spatial motion patterns, learned from prior observations. Our approach can perform efficient human motion prediction in the long-term perspective of up to 60 seconds. We quantitatively evaluate its accuracy with limited amount of training data in comparison to an LSTM-based baseline, and qualitatively show that the predicted trajectories reflect the natural semantic properties of the environment, e.g. the locations of short- and long-term goals, navigation in narrow passages, around obstacles, etc.
Lidar-level localization with radar? The CFEAR approach to accurate, fast and robust large-scale radar odometry in diverse environments
Adolfsson, Daniel, Magnusson, Martin, Alhashimi, Anas, Lilienthal, Achim J., Andreasson, Henrik
This paper presents an accurate, highly efficient, and learning-free method for large-scale odometry estimation using spinning radar, empirically found to generalize well across very diverse environments -- outdoors, from urban to woodland, and indoors in warehouses and mines - without changing parameters. Our method integrates motion compensation within a sweep with one-to-many scan registration that minimizes distances between nearby oriented surface points and mitigates outliers with a robust loss function. Extending our previous approach CFEAR, we present an in-depth investigation on a wider range of data sets, quantifying the importance of filtering, resolution, registration cost and loss functions, keyframe history, and motion compensation. We present a new solving strategy and configuration that overcomes previous issues with sparsity and bias, and improves our state-of-the-art by 38%, thus, surprisingly, outperforming radar SLAM and approaching lidar SLAM. The most accurate configuration achieves 1.09% error at 5Hz on the Oxford benchmark, and the fastest achieves 1.79% error at 160Hz.
TBV Radar SLAM -- trust but verify loop candidates
Adolfsson, Daniel, Karlsson, Mattias, Kubelka, Vladimír, Magnusson, Martin, Andreasson, Henrik
Robust SLAM in large-scale environments requires fault resilience and awareness at multiple stages, from sensing and odometry estimation to loop closure. In this work, we present TBV (Trust But Verify) Radar SLAM, a method for radar SLAM that introspectively verifies loop closure candidates. TBV Radar SLAM achieves a high correct-loop-retrieval rate by combining multiple place-recognition techniques: tightly coupled place similarity and odometry uncertainty search, creating loop descriptors from origin-shifted scans, and delaying loop selection until after verification. Robustness to false constraints is achieved by carefully verifying and selecting the most likely ones from multiple loop constraints. Importantly, the verification and selection are carried out after registration when additional sources of loop evidence can easily be computed. We integrate our loop retrieval and verification method with a fault-resilient odometry pipeline within a pose graph framework. By evaluating on public benchmarks we found that TBV Radar SLAM achieves 65% lower error than the previous state of the art. We also show that it's generalizing across environments without needing to change any parameters.