Tidy-up tasks by service robots in home environments are challenging in the application of robotics because they involve various interactions with the environment. In particular, robots are required not only to grasp, move, and release various home objects, but also plan the order and positions where to put them away. In this paper, we propose a novel planning method that can efficiently estimate the order and positions of the objects to be tidied up based on the learning of the parameters of a probabilistic generative model. The model allows the robot to learn the distributions of co-occurrence probability of objects and places to tidy up by using multimodal sensor information collected in a tidied environment. Additionally, we develop an autonomous robotic system to perform the tidy-up operation. We evaluate the effectiveness of the proposed method in an experimental simulation that reproduces the conditions of the Tidy Up Here task of the World Robot Summit international robotics competition. The simulation results showed that the proposed method enables the robot to successively tidy up several objects and achieves the best task score compared to baseline tidy-up methods.
In this paper, we consider same-day delivery with a heterogeneous fleet of vehicles and drones. Customers make delivery requests over the course of the day and the dispatcher dynamically dispatches vehicles and drones to deliver the goods to customers before their delivery deadline. Vehicles can deliver multiple packages in one route but travel relatively slowly due to the urban traffic. Drones travel faster, but they have limited capacity and require charging or battery swaps. To exploit the different strengths of the fleets, we propose a deep Q-learning approach. Our method learns the value of assigning a new customer to either drones or vehicles as well as the option to not offer service at all. To aid feature selection, we present an analytical analysis that demonstrates the role that different types of information have on the value function and decision making. In a systematic computational analysis, we show the superiority of our policy compared to benchmark policies and the effectiveness of our deep Q-learning approach.
The Great Elephant Census, conducted in 2014 and 2015, counted more than 350,000* elephants across 18 African countries. Human observers in small planes flew some 294,000 kilometers during more than 1,500 hours to systematically count the animals. Could a future census be managed locally, using unmanned aerial vehicles (UAVs, a.k.a. Although surveying the large animals in their individual reserves is a smaller job than the Great Elephant Census, such surveys cost managers substantial time and money. A Swiss research team recently tested a new approach to wildlife surveys.
Perception-Action loops are at the core of most our daily life activities. Subconsciously, our brains use sensory inputs to trigger specific motor actions in real time and this becomes a continuous activity that in all sorts of activities from playing sports to watching TV. In the context of artificial intelligence(AI), perception-action loops are the cornerstone of autonomous systems such as self-driving vehicles. While disciplines such as imitation learning or reinforcement learning have certainly made progress in this area, the current generation of autonomous systems are still nowhere near human skill in making those decisions directly from visual data. Recently, AI researchers from Microsoft published a paper proposing a transfer learning method to learn perception-action policies from in a simulated environment and apply the knowledge to fly an autonomous drone.