turtlebot


Real World Robotic Exploration using Deep Neural Networks Trained in Photorealistic Reconstructed Environments

Ward, Isaac Ronald

arXiv.org Artificial Intelligence

In this work, an existing deep neural network approach for determining a robot's pose from visual information (RGB images) is modified, improving its localization performance without impacting its ease of training. Explicitly, the network's loss function is extended in a manner which intuitively combines the positional and rotational error in order to increase robustness to perceptual aliasing. An improvement in the localization accuracy for indoor scenes is observed, with decreases of up to 9.64% and 2.99% in the median positional and rotational error respectively, when compared to the unmodified network. Additionally, photogrammetry data is used to produce a pose-labelled dataset which allows the above model to be trained on a local environment, resulting in localization accuracies of 0.11 m and 0.89 degrees. This trained model forms the basis of a navigation algorithm, which is tested in real-time on a TurtleBot (a wheeled robotic device). As such, this work introduces a full pipeline for creating a robust navigational algorithm for any given real world indoor scene; the only requirement being a collection of images from the scene, which can be captured in as little as 330 seconds of
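The combined positional-and-rotational loss described above can be illustrated with a minimal sketch. This is not the paper's exact formulation; the function name, the quaternion-based rotation term, and the weighting factor `beta` are all illustrative assumptions:

```python
import math

def pose_loss(p_pred, p_true, q_pred, q_true, beta=1.0):
    """Illustrative combined pose loss: Euclidean position error plus a
    quaternion-based rotation error, weighted by beta. A sketch only;
    the paper's actual loss formulation may differ."""
    # Positional error: Euclidean distance between predicted and true position.
    pos_err = math.sqrt(sum((a - b) ** 2 for a, b in zip(p_pred, p_true)))

    def normalize(q):
        n = math.sqrt(sum(x * x for x in q))
        return [x / n for x in q]

    # Rotational error: 0 when the (normalized) quaternions represent the
    # same orientation, growing as they diverge.
    qp, qt = normalize(q_pred), normalize(q_true)
    rot_err = 1.0 - abs(sum(a * b for a, b in zip(qp, qt)))
    return pos_err + beta * rot_err
```

Weighting the two terms with a single scalar keeps the loss easy to train, which matches the abstract's claim that the extension does not impact ease of training.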


Auditory Localization and Assessment of Consequential Robot Sounds: A Multi-Method Study in Virtual Reality

Wessels, Marlene, de Heuvel, Jorge, Müller, Leon, Maier, Anna Luisa, Bennewitz, Maren, Kraus, Johannes

arXiv.org Artificial Intelligence

Mobile robots increasingly operate alongside humans but are often out of sight, so that humans need to rely on the sounds of the robots to recognize their presence. For successful human-robot interaction (HRI), it is therefore crucial to understand how humans perceive robots by their consequential sounds, i.e., operating noise. Prior research suggests that the sound of a quadruped Go1 is more detectable than that of a wheeled Turtlebot. This study builds on this and examines the human ability to localize consequential sounds of three robots (quadruped Go1, wheeled Turtlebot 2i, wheeled HSR) in Virtual Reality. In a within-subjects design, we assessed participants' localization performance for the robots with and without an acoustic vehicle alerting system (AVAS) for two velocities (0.3, 0.8 m/s) and two trajectories (head-on, radial). In each trial, participants were presented with the sound of a moving robot for 3 s and were tasked to point at its final position (localization task). Localization errors were measured as the absolute angular difference between the participants' estimated and the actual robot position. Results showed that the robot type significantly influenced the localization accuracy and precision, with the sound of the wheeled HSR (especially without AVAS) performing worst under all experimental conditions. Surprisingly, participants rated the HSR sound as more positive, less annoying, and more trustworthy than the Turtlebot and Go1 sounds. This reveals a tension between subjective evaluation and objective auditory localization performance. Our findings highlight consequential robot sounds as a critical factor for designing intuitive and effective HRI, with implications for human-centered robot design and social navigation.
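The localization error metric described above (absolute angular difference between the estimated and actual robot position, as seen from the listener) can be sketched as follows; the function name and the 2D point representation are assumptions for illustration:

```python
import math

def angular_error_deg(listener, estimated, actual):
    """Absolute angular difference (degrees) between the estimated and
    actual robot position, measured from the listener's viewpoint.
    A sketch of the metric, not the study's exact implementation."""
    def bearing(p):
        # Angle of the point relative to the listener, in radians.
        return math.atan2(p[1] - listener[1], p[0] - listener[0])

    diff = abs(bearing(estimated) - bearing(actual)) % (2 * math.pi)
    # Wrap so the error is always the smaller of the two arc directions.
    if diff > math.pi:
        diff = 2 * math.pi - diff
    return math.degrees(diff)
```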


Online Concurrent Multi-Robot Coverage Path Planning

Mitra, Ratijit, Saha, Indranil

arXiv.org Artificial Intelligence

Recently, centralized receding horizon online multi-robot coverage path planning algorithms have shown remarkable scalability in thoroughly exploring large, complex, unknown workspaces with many robots. In a horizon, path planning and path execution interleave: while paths are being planned for robots without paths, the robots with outstanding paths do not execute, and subsequently, while the robots with new or outstanding paths execute to reach their respective goals, no planning occurs for those robots yet to get new paths, wasting both robotic and computational resources. As a remedy, we propose a centralized algorithm that is not horizon-based. It plans paths at any time for a subset of robots with no paths, i.e., those that have reached their previously assigned goals, while the rest execute their outstanding paths, thereby enabling concurrent planning and execution. We formally prove that the proposed algorithm ensures complete coverage of an unknown workspace and analyze its time complexity. To demonstrate scalability, we evaluate our algorithm to cover eight large 2D grid benchmark workspaces with up to 512 aerial and ground robots, respectively. A comparison with a state-of-the-art horizon-based algorithm shows its superiority in completing the coverage with up to 1.6x speedup. For validation, we perform ROS + Gazebo simulations in six 2D grid benchmark workspaces with 10 quadcopters and TurtleBots, respectively. We also successfully conducted one outdoor experiment with three quadcopters and one indoor experiment with two TurtleBots.
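The key idea (plan only for robots that have finished their paths while the rest keep executing) can be sketched as a single simulation tick. The data layout and the `plan_paths` callback are hypothetical stand-ins for the paper's centralized planner:

```python
def concurrent_coverage_step(robots, plan_paths):
    """One tick of a hypothetical concurrent planner: robots that have
    finished their paths get new ones, while the rest execute in parallel.
    A sketch of the scheme, not the paper's algorithm."""
    idle = [r for r in robots if not r["path"]]   # reached previous goals
    busy = [r for r in robots if r["path"]]       # still have outstanding paths

    # Plan only for idle robots; busy robots are never blocked by planning.
    for robot, path in zip(idle, plan_paths(idle)):
        robot["path"] = list(path)

    # Busy robots execute one step of their outstanding paths concurrently.
    for robot in busy:
        robot["pos"] = robot["path"].pop(0)
    return robots
```

In a horizon-based planner, the `busy` robots would have to wait while planning runs; here the two loops proceed in the same tick, which is the source of the reported speedup.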


Asynchronous, Option-Based Multi-Agent Policy Gradient: A Conditional Reasoning Approach

Lyu, Xubo, Banitalebi-Dehkordi, Amin, Chen, Mo, Zhang, Yong

arXiv.org Artificial Intelligence

Cooperative multi-agent problems often require coordination between agents, which can be achieved through a centralized policy that considers the global state. Multi-agent policy gradient (MAPG) methods are commonly used to learn such policies, but they are often limited to problems with low-level action spaces. In complex problems with large state and action spaces, it is advantageous to extend MAPG methods to use higher-level actions, also known as options, to improve the policy search efficiency. However, multi-robot option executions are often asynchronous, that is, agents may select and complete their options at different time steps. This makes it difficult for MAPG methods to derive a centralized policy and evaluate its gradient, as a centralized policy always selects new options at the same time. In this work, we propose a novel conditional reasoning approach to address this problem and demonstrate its effectiveness on representative option-based multi-agent cooperative tasks through empirical validation. Find code and videos at: https://sites.google.com/view/mahrlsupp/


On-Robot Bayesian Reinforcement Learning for POMDPs

Nguyen, Hai, Katt, Sammie, Xiao, Yuchen, Amato, Christopher

arXiv.org Artificial Intelligence

Robot learning is often difficult due to the expense of gathering data. The need for large amounts of data can, and should, be tackled with effective algorithms and leveraging expert information on robot dynamics. Bayesian reinforcement learning (BRL), thanks to its sample efficiency and ability to exploit prior knowledge, is uniquely positioned as such a solution method. Unfortunately, the application of BRL has been limited due to the difficulties of representing expert knowledge as well as solving the subsequent inference problem. This paper advances BRL for robotics by proposing a specialized framework for physical systems. In particular, we capture this knowledge in a factored representation, then demonstrate that the posterior factorizes in a similar shape, and ultimately formalize the model in a Bayesian framework. We then introduce a sample-based online solution method, based on Monte-Carlo tree search and particle filtering, specialized to solve the resulting model. This approach can, for example, utilize typical low-level robot simulators and handle uncertainty over unknown dynamics of the environment. We empirically demonstrate its efficiency by performing on-robot learning in two human-robot interaction tasks with uncertainty about human behavior, achieving near-optimal performance after only a handful of real-world episodes. A video of learned policies is at https://youtu.be/H9xp60ngOes.
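The particle-filtering component mentioned above can be illustrated with a minimal bootstrap-resampling step; `weight_fn` is a hypothetical stand-in for the observation likelihood under each sampled model, and the whole function is a sketch of the general technique rather than the paper's solver:

```python
import random

def particle_filter_update(particles, weight_fn, n=None):
    """Minimal bootstrap-resampling step: reweight particles by how well
    they explain the latest observation, then resample with replacement.
    A generic sketch, not the paper's specialized solution method."""
    n = n or len(particles)
    weights = [weight_fn(p) for p in particles]
    total = sum(weights)
    if total == 0:
        # Degenerate case: no particle explains the observation; keep the set.
        return list(particles)
    probs = [w / total for w in weights]
    # Particles consistent with the observation survive and multiply.
    return random.choices(particles, weights=probs, k=n)
```

Repeating this update concentrates the particle set on models that explain the data, which is how a handful of real-world episodes can resolve uncertainty over unknown dynamics.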


Autonomous Systems: Indoor Drone Navigation

Iyer, Aswin, Narayan, Santosh, M, Naren, Rajagopal, Manoj kumar

arXiv.org Artificial Intelligence

Drones are a promising technology for autonomous data collection and indoor sensing. In situations where human-controlled UAVs may not be practical or dependable, such as in uncharted or dangerous locations, the use of autonomous UAVs offers flexibility, cost savings, and reduced risk. The system creates a simulated quadcopter capable of autonomously travelling in an indoor environment using the Gazebo simulation tool and the ROS navigation framework known as Navigation2. While Nav2 has successfully demonstrated autonomous navigation in terrestrial robots and vehicles, the same has not yet been accomplished with unmanned aerial vehicles. The goal is to use the SLAM Toolbox for ROS and the Nav2 navigation framework to construct a simulated drone that can move autonomously in an indoor (GPS-less) environment.


Enhancing Deep Learning with Scenario-Based Override Rules: a Case Study

Ashrov, Adiel, Katz, Guy

arXiv.org Artificial Intelligence

Deep neural networks (DNNs) have become a crucial instrument in the software development toolkit, due to their ability to efficiently solve complex problems. Nevertheless, DNNs are highly opaque, and can behave in an unexpected manner when they encounter unfamiliar input. One promising approach for addressing this challenge is by extending DNN-based systems with hand-crafted override rules, which override the DNN's output when certain conditions are met. Here, we advocate crafting such override rules using the well-studied scenario-based modeling paradigm, which produces rules that are simple, extensible, and powerful enough to ensure the safety of the DNN, while also rendering the system more translucent. We report on two extensive case studies, which demonstrate the feasibility of the approach; and through them, propose an extension to scenario-based modeling, which facilitates its integration with DNN components. We regard this work as a step towards creating safer and more reliable DNN-based systems and models.
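The override-rule pattern described above can be sketched as a thin wrapper around a DNN's output; the names and the rule representation (condition, replacement pairs) are illustrative assumptions, not the paper's scenario-based mechanism:

```python
def with_override(dnn_output_fn, rules):
    """Wrap a DNN policy with hand-crafted override rules: the first rule
    whose condition matches the input replaces the DNN's output.
    A sketch of the general pattern, not the paper's exact construction."""
    def guarded(x):
        for condition, replacement in rules:
            if condition(x):
                return replacement  # rule overrides the network's output
        return dnn_output_fn(x)     # otherwise defer to the DNN
    return guarded
```

Because the rules sit outside the network, they stay simple and extensible: safety conditions can be added or audited without retraining the DNN.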


Efficient algorithms for autonomous electric vehicles' min-max routing problem

Fazeli, Seyed Sajjad, Venkatachalam, Saravanan, Smereka, Jonathon M.

arXiv.org Artificial Intelligence

The increase in greenhouse gas emissions from the transportation sector has led companies and governments to elevate and support the production of electric vehicles. The natural synergy between increased support for electric vehicles and the emergence of autonomous vehicles can potentially relieve the limitations regarding access to charging infrastructure, time management, and range anxiety. In this work, a fleet of Autonomous Electric Vehicles (AEVs) with limited battery capacity is considered for transportation and logistics tasks, and scarce charging station availability is taken into account while planning in order to avoid inefficient routing strategies. We introduce a min-max autonomous electric vehicle routing problem (AEVRP) where the maximum distance traveled by any AEV is minimized while considering charging stations for recharging. We propose a genetic-algorithm-based meta-heuristic that can efficiently solve a variety of instances. Extensive computational results, sensitivity analysis, and data-driven simulation implemented with the Robot Operating System (ROS) middleware are performed to corroborate the efficiency of the proposed approach, both quantitatively and qualitatively.
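The min-max objective (minimize the longest route over all vehicles) can be made concrete with a small evaluation function; the route/distance representation here is an illustrative assumption:

```python
def min_max_objective(routes, dist):
    """Objective value of a candidate AEVRP solution: the length of the
    longest route over all vehicles (the quantity to be minimized).
    'routes' is a list of stop sequences, 'dist' a pairwise distance
    function; both representations are illustrative."""
    def route_length(route):
        # Sum distances over consecutive stop pairs along one route.
        return sum(dist(a, b) for a, b in zip(route, route[1:]))
    return max(route_length(r) for r in routes)
```

A meta-heuristic such as the paper's genetic algorithm would score each candidate fleet assignment with a function of this shape and keep the candidates with the smallest maximum.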


Macro-Action-Based Deep Multi-Agent Reinforcement Learning

Xiao, Yuchen, Hoffman, Joshua, Amato, Christopher

arXiv.org Artificial Intelligence

In real-world multi-robot systems, performing high-quality, collaborative behaviors requires robots to asynchronously reason about high-level action selection at varying time durations. Macro-Action Decentralized Partially Observable Markov Decision Processes (MacDec-POMDPs) provide a general framework for asynchronous decision making under uncertainty in fully cooperative multi-agent tasks. However, multi-agent deep reinforcement learning methods have only been developed for (synchronous) primitive-action problems. This paper proposes two Deep Q-Network (DQN) based methods for learning decentralized and centralized macro-action-value functions with novel macro-action trajectory replay buffers introduced for each case. Evaluations on benchmark problems and a larger domain demonstrate the advantage of learning with macro-actions over primitive-actions and the scalability of our approaches.
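The macro-action trajectory replay buffer mentioned above can be sketched as follows. Because macro-actions span variable numbers of primitive steps and end asynchronously, each stored transition carries an accumulated reward and per-agent durations; the field names and structure are assumptions for illustration:

```python
import collections

# Illustrative macro-action transition: joint observation, the chosen
# macro-actions, the reward accumulated until termination, the next joint
# observation, and per-agent durations (macro-actions end asynchronously).
MacroTransition = collections.namedtuple(
    "MacroTransition",
    ["obs", "macro_actions", "cum_reward", "next_obs", "durations"])

class MacroReplayBuffer:
    """Sketch of a macro-action trajectory replay buffer: unlike a
    primitive-action buffer, each entry spans a variable number of
    primitive steps. Not the paper's exact buffer design."""
    def __init__(self, capacity=10000):
        self.buf = collections.deque(maxlen=capacity)  # drops oldest first

    def push(self, obs, macro_actions, cum_reward, next_obs, durations):
        self.buf.append(
            MacroTransition(obs, macro_actions, cum_reward, next_obs, durations))

    def __len__(self):
        return len(self.buf)
```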


Multi-Robot Deep Reinforcement Learning with Macro-Actions

Xiao, Yuchen, Hoffman, Joshua, Xia, Tian, Amato, Christopher

arXiv.org Artificial Intelligence

A. MacDec-POMDPs. Decentralized fully collaborative multi-agent decision-making under uncertainty can be modeled as a decentralized POMDP (Dec-POMDP) [14]. Due to the assumption of synchronous actions that require the same amount of time for each agent, Dec-POMDPs are not applicable to real-world multi-robot planning and learning scenarios. MacDec-POMDPs, formalized by introducing macro-actions into Dec-POMDPs, inherently allow asynchronous execution among robots with temporally extended macro-actions that can begin and end at different times for each agent. Formally, a MacDec-POMDP is defined as a tuple $\langle I, S, A, \Omega, M, \zeta, O, T, Z, R \rangle$, where $I$ is a finite set of agents; $S$ is a finite set of environment states; $A = \times_i A_i$ and $\Omega = \times_i \Omega_i$ are the spaces of joint primitive actions and joint primitive observations respectively; $M = \times_i M_i$ is the joint set over each agent's finite macro-action space $M_i$; $\zeta = \times_i \zeta_i$ is the set of joint macro-observations over agents' finite macro-observation spaces $\zeta_i$. Given a macro-action-based policy, each agent $i$ is allowed to asynchronously choose a macro-action $m_i = \langle \beta_m, I_m, \pi_m \rangle$ that depends on individual macro-action-observation histories, where $\beta_m : H^A_i \to [0, 1]$ is the stochastic termination condition and $I_m \subseteq H^M_i$ is the initiation set of the corresponding macro-action $m_i$, respectively depending on the primitive-action-observation history space $H^A_i$ and the macro-action-observation history space $H^M_i$ of agent $i$; $\pi_m : H^A_i \to A_i$ denotes the low-level policy to achieve the macro-action $m$. During execution, each agent's primitive observation $o_i \in \Omega_i$ is generated according to the observation function $O_i(o_i, a_i, s) = \Pr(o_i \mid a_i, s)$, and a shared immediate reward $r(s, \vec{a})$, where $\vec{a} \in A$, is issued according to the reward function $R : S \times A \to \mathbb{R}$.
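The macro-action triple $\langle \beta_m, I_m, \pi_m \rangle$ from the definition above can be rendered as a small structure for intuition. This is a sketch only: the class, its method, and the history representation are illustrative assumptions, not part of the MacDec-POMDP formalism itself:

```python
class MacroAction:
    """A macro-action m_i = <beta_m, I_m, pi_m>: a stochastic termination
    condition, an initiation set of histories, and a low-level policy.
    An illustrative rendering of the formal definition."""
    def __init__(self, beta, initiation_set, low_level_policy):
        self.beta = beta                      # history -> termination prob. in [0, 1]
        self.initiation_set = initiation_set  # histories where m_i may start
        self.pi = low_level_policy            # history -> primitive action

    def step(self, history, sampled_value):
        """Return the next primitive action, or None when the sampled value
        falls under the termination probability (the macro-action ends)."""
        if sampled_value < self.beta(history):
            return None  # macro-action terminates at this history
        return self.pi(history)
```

Because each agent samples its own termination condition against its own history, different agents' macro-actions naturally begin and end at different times, which is exactly the asynchrony the framework is built to capture.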