Dengler, Nils
EvidMTL: Evidential Multi-Task Learning for Uncertainty-Aware Semantic Surface Mapping from Monocular RGB Images
Menon, Rohit, Dengler, Nils, Pan, Sicong, Chenchani, Gokul Krishna, Bennewitz, Maren
For scene understanding in unstructured environments, an accurate and uncertainty-aware metric-semantic mapping is required to enable informed action selection by autonomous systems. Existing mapping methods often suffer from overconfident semantic predictions, and sparse and noisy depth sensing, leading to inconsistent map representations. In this paper, we therefore introduce EvidMTL, a multi-task learning framework that uses evidential heads for depth estimation and semantic segmentation, enabling uncertainty-aware inference from monocular RGB images. To enable uncertainty-calibrated evidential multi-task learning, we propose a novel evidential depth loss function that jointly optimizes the belief strength of the depth prediction in conjunction with evidential segmentation loss. Building on this, we present EvidKimera, an uncertainty-aware semantic surface mapping framework, which uses evidential depth and semantics prediction for improved 3D metric-semantic consistency. We train and evaluate EvidMTL on the NYUDepthV2 and assess its zero-shot performance on ScanNetV2, demonstrating superior uncertainty estimation compared to conventional approaches while maintaining comparable depth estimation and semantic segmentation. In zero-shot mapping tests on ScanNetV2, EvidKimera outperforms Kimera in semantic surface mapping accuracy and consistency, highlighting the benefits of uncertainty-aware mapping and underscoring its potential for real-world robotic applications.
Pedestrians and Robots: A Novel Dataset for Learning Distinct Social Navigation Forces
Agrawal, Subham, Ostermann-Myrau, Nico, Dengler, Nils, Bennewitz, Maren
The increasing use of robots in human-centric public spaces such as shopping malls, sidewalks, and hospitals, requires understanding of how pedestrians respond to their presence. However, existing research lacks comprehensive datasets that capture the full range of pedestrian behaviors, e.g., including avoidance, neutrality, and attraction in the presence of robots. Such datasets can be used to effectively learn models capable of accurately predicting diverse responses of pedestrians to robot presence, which are crucial for advancing robot navigation strategies and optimizing pedestrian-aware motion planning. In this paper, we address these challenges by collecting a novel dataset of pedestrian motion in two outdoor locations under three distinct conditions, i.e., no robot presence, a stationary robot, and a moving robot. Thus, unlike existing datasets, ours explicitly encapsulates variations in pedestrian behavior across the different robot conditions. Using our dataset, we propose a novel Neural Social Robot Force Model (NSRFM), an extension of the traditional Social Force Model that integrates neural networks and robot-induced forces to better predict pedestrian behavior in the presence of robots. We validate the NSRFM by comparing its generated trajectories on different real-world datasets. Furthermore, we implemented it in simulation to enable the learning and benchmarking of robot navigation strategies based on their impact on pedestrian movement. Our results demonstrate the model's effectiveness in replicating real-world pedestrian reactions and its its utility in developing, evaluating, and benchmarking social robot navigation algorithms.
Map Space Belief Prediction for Manipulation-Enhanced Mapping
Marques, Joao Marcos Correia, Dengler, Nils, Zaenker, Tobias, Mucke, Jesper, Wang, Shenlong, Bennewitz, Maren, Hauser, Kris
Searching for objects in cluttered environments requires selecting efficient viewpoints and manipulation actions to remove occlusions and reduce uncertainty in object locations, shapes, and categories. In this work, we address the problem of manipulation-enhanced semantic mapping, where a robot has to efficiently identify all objects in a cluttered shelf. Although Partially Observable Markov Decision Processes~(POMDPs) are standard for decision-making under uncertainty, representing unstructured interactive worlds remains challenging in this formalism. To tackle this, we define a POMDP whose belief is summarized by a metric-semantic grid map and propose a novel framework that uses neural networks to perform map-space belief updates to reason efficiently and simultaneously about object geometries, locations, categories, occlusions, and manipulation physics. Further, to enable accurate information gain analysis, the learned belief updates should maintain calibrated estimates of uncertainty. Therefore, we propose Calibrated Neural-Accelerated Belief Updates (CNABUs) to learn a belief propagation model that generalizes to novel scenarios and provides confidence-calibrated predictions for unknown areas. Our experiments show that our novel POMDP planner improves map completeness and accuracy over existing methods in challenging simulations and successfully transfers to real-world cluttered shelves in zero-shot fashion.
Evaluating Robot Influence on Pedestrian Behavior Models for Crowd Simulation and Benchmarking
Agrawal, Subham, Dengler, Nils, Bennewitz, Maren
The presence of robots amongst pedestrians affects them causing deviation to their trajectories. Existing methods suffer from the limitation of not being able to objectively measure this deviation in unseen cases. In order to solve this issue, we introduce a simulation framework that repetitively measures and benchmarks the deviation in trajectory of pedestrians due to robots driven by different navigation algorithms. We simulate the deviation behavior of the pedestrians using an enhanced Social Force Model (SFM) with a robot force component that accounts for the influence of robots on pedestrian behavior, resulting in the Social Robot Force Model (SRFM). Parameters for this model are learned using the pedestrian trajectories from the JRDB dataset [1]. Pedestrians are then simulated using the SRFM with and without the robot force component to objectively measure the deviation to their trajectory caused by the robot in 5 different scenarios. Our work in this paper is a proof of concept that shows objectively measuring the pedestrian reaction to robot is possible. We use our simulation to train two different RL policies and evaluate them against traditional navigation models.
RHINO-VR Experience: Teaching Mobile Robotics Concepts in an Interactive Museum Exhibit
Schlachhoff, Erik, Dengler, Nils, Van Holland, Leif, Stotko, Patrick, de Heuvel, Jorge, Klein, Reinhard, Bennewitz, Maren
In 1997, the very first tour guide robot RHINO was deployed in a museum in Germany. With the ability to navigate autonomously through the environment, the robot gave tours to over 2,000 visitors. Today, RHINO itself has become an exhibit and is no longer operational. In this paper, we present RHINO-VR, an interactive museum exhibit using virtual reality (VR) that allows museum visitors to experience the historical robot RHINO in operation in a virtual museum. RHINO-VR, unlike static exhibits, enables users to familiarize themselves with basic mobile robotics concepts without the fear of damaging the exhibit. In the virtual environment, the user is able to interact with RHINO in VR by pointing to a location to which the robot should navigate and observing the corresponding actions of the robot. To include other visitors who cannot use the VR, we provide an external observation view to make RHINO visible to them. We evaluated our system by measuring the frame rate of the VR simulation, comparing the generated virtual 3D models with the originals, and conducting a user study. The user-study showed that RHINO-VR improved the visitors' understanding of the robot's functionality and that they would recommend experiencing the VR exhibit to others.
Constrained Object Placement Using Reinforcement Learning
Kreis, Benedikt, Dengler, Nils, de Heuvel, Jorge, Menon, Rohit, Perur, Hamsa Datta, Bennewitz, Maren
Close and precise placement of irregularly shaped objects requires a skilled robotic system. Particularly challenging is the manipulation of objects that have sensitive top surfaces and a fixed set of neighbors. To avoid damaging the surface, they have to be grasped from the side, and during placement, their neighbor relations have to be maintained. In this work, we train a reinforcement learning agent that generates smooth end-effector motions to place objects as close as possible next to each other. During the placement, our agent considers neighbor constraints defined in a given layout of the objects while trying to avoid collisions. Our approach learns to place compact object assemblies without the need for predefined spacing between objects as required by traditional methods. We thoroughly evaluated our approach using a two-finger gripper mounted to a robotic arm with six degrees of freedom. The results show that our agent outperforms two baseline approaches in terms of object assembly compactness, thereby reducing the needed space to place the objects according to the given neighbor constraints. On average, our approach reduces the distances between all placed objects by at least 60%, with fewer collisions at the same compactness compared to both baselines.
Learning Goal-Directed Object Pushing in Cluttered Scenes with Location-Based Attention
Dengler, Nils, Ferrandis, Juan Del Aguila, Moura, Joรฃo, Vijayakumar, Sethu, Bennewitz, Maren
Non-prehensile planar pushing is a challenging task due to its underactuated nature with hybrid-dynamics, where a robot needs to reason about an object's long-term behaviour and contact-switching, while being robust to contact uncertainty. The presence of clutter in the environment further complicates this task, introducing the need to include more sophisticated spatial analysis to avoid collisions. Building upon prior work on reinforcement learning (RL) with multimodal categorical exploration for planar pushing, in this paper we incorporate location-based attention to enable robust navigation through clutter. Unlike previous RL literature addressing this obstacle avoidance pushing task, our framework requires no predefined global paths and considers the target orientation of the manipulated object. Our results demonstrate that the learned policies successfully navigate through a wide range of complex obstacle configurations, including dynamic obstacles, with smooth motions, achieving the desired target object pose. We also validate the transferability of the learned policies to robotic hardware using the KUKA iiwa robot arm.
Integrating One-Shot View Planning with a Single Next-Best View via Long-Tail Multiview Sampling
Pan, Sicong, Hu, Hao, Wei, Hui, Dengler, Nils, Zaenker, Tobias, Dawood, Murad, Bennewitz, Maren
Existing view planning systems either adopt an iterative paradigm using next-best views (NBV) or a one-shot pipeline relying on the set-covering view-planning (SCVP) network. However, neither of these methods can concurrently guarantee both high-quality and high-efficiency reconstruction of 3D unknown objects. To tackle this challenge, we introduce a crucial hypothesis: with the availability of more information about the unknown object, the prediction quality of the SCVP network improves. There are two ways to provide extra information: (1) leveraging perception data obtained from NBVs, and (2) training on an expanded dataset of multiview inputs. In this work, we introduce a novel combined pipeline that incorporates a single NBV before activating the proposed multiview-activated (MA-)SCVP network. The MA-SCVP is trained on a multiview dataset generated by our long-tail sampling method, which addresses the issue of unbalanced multiview inputs and enhances the network performance. Extensive simulated experiments substantiate that our system demonstrates a significant surface coverage increase and a substantial 45% reduction in movement cost compared to state-of-the-art systems. Real-world experiments justify the capability of our system for generalization and deployment.
Safe Multi-Agent Reinforcement Learning for Formation Control without Individual Reference Targets
Dawood, Murad, Pan, Sicong, Dengler, Nils, Zhou, Siqi, Schoellig, Angela P., Bennewitz, Maren
Abstract--In recent years, formation control of unmanned vehicles has received considerable interest, driven by the progress in autonomous systems and the imperative for multiple vehicles to carry out diverse missions. In this paper, we address the problem of behavior-based formation control of mobile robots, where we use safe multi-agent reinforcement learning (MARL) to ensure the safety of the robots by eliminating all collisions during training and execution. To ensure safety, we implemented distributed model predictive control safety filters to override unsafe actions. We focus on achieving behavior-based formation without having individual reference targets for the robots, and instead use targets for the centroid of the formation. This formulation facilitates the deployment of formation control on real robots and improves the scalability of our approach to Figure 1: Real-world example for behavior-based formation control of more robots. The task cannot be addressed through optimizationbased mobile robots based on centroid reference targets. The formation is controllers without specific individual reference targets for defined by the distances between the three robots. The robots start the robots and information about the relative locations of each from random locations and then navigate cooperatively to move the robot to the others. That is why, for our formulation we use target for the centroid of the formation while aiming to maintain the MARL to train the robots. Moreover, in order to account for the predefined distances with respect to each other. The centroid of the interactions between the agents, we use attention-based critics to formation reaches the first goal and is then moved to the second goal improve the training process.
DawnIK: Decentralized Collision-Aware Inverse Kinematics Solver for Heterogeneous Multi-Arm Systems
Marangoz, Salih, Menon, Rohit, Dengler, Nils, Bennewitz, Maren
Although inverse kinematics of serial manipulators is a well studied problem, challenges still exist in finding smooth feasible solutions that are also collision aware. Furthermore, with collaborative service robots gaining traction, different robotic systems have to work in close proximity. This means that the current inverse kinematics approaches do not have only to avoid collisions with themselves but also collisions with other robot arms. Therefore, we present a novel approach to compute inverse kinematics for serial manipulators that take into account different constraints while trying to reach a desired end-effector pose that avoids collisions with themselves and other arms. Unlike other constraint based approaches, we neither perform expensive inverse Jacobian computations nor do we require arms with redundant degrees of freedom. Instead, we formulate different constraints as weighted cost functions to be optimized by a non-linear optimization solver. Our approach is superior to the state-of-the-art CollisionIK in terms of collision avoidance in the presence of multiple arms in confined spaces with no collisions occurring in all the experimental scenarios. When the probability of collision is low, our approach shows better performance at trajectory tracking as well. Additionally, our approach is capable of simultaneous yet decentralized control of multiple arms for trajectory tracking in intersecting workspace without any collisions.