 Undirected Networks


Robot Bed-Making: Deep Transfer Learning Using Depth Sensing of Deformable Fabric

arXiv.org Artificial Intelligence

Abstract: Bed-making is a common task well-suited for home robots since it is tolerant to error and not time-critical. Bed-making can also be difficult for senior citizens and those with limited mobility due to the bending and reaching movements required. Autonomous bed-making combines multiple challenges in robotics: perception in unstructured environments, deformable object manipulation, transfer learning, and sequential decision making. We formalize the bed-making problem as one of maximizing surface coverage with a blanket, and explore algorithmic approaches that use deep learning on depth images to be invariant to the color and pattern of the blankets. We train two networks: one to identify a corner of the blanket and another to determine when to transition to the other side of the bed. Using the first network, the robot grasps at its estimate of the blanket corner and then pulls it to the appropriate corner of the bed frame. The second network estimates if the robot has sufficiently covered one side and can transition to the other, or if it should attempt another grasp from the same side. We evaluate with two robots, the Toyota HSR and the Fetch, and three blankets. Using 2018 and 654 depth images for training the grasp and transition networks respectively, experiments with a quarter-scale twin bed achieve an average of 91.7% blanket coverage, nearly matching human supervisors with 95.0% coverage. Data is available at https://sites.google.com/view/bed-make. A common home task is bed-making [4], which is rarely enjoyed and can be physically challenging due to bending and leaning movements. Surveys of older adults in the United States [9], [3] suggest that they are willing to have a robot assistant in their homes, particularly for physically demanding tasks.


Distributed Wildfire Surveillance with Autonomous Aircraft using Deep Reinforcement Learning

arXiv.org Artificial Intelligence

Teams of autonomous unmanned aircraft can be used to monitor wildfires, enabling firefighters to make informed decisions. However, controlling multiple autonomous fixed-wing aircraft to maximize forest fire coverage is a complex problem. The state space is high dimensional, the fire propagates stochastically, the sensor information is imperfect, and the aircraft must coordinate with each other to accomplish their mission. This work presents two deep reinforcement learning approaches for training decentralized controllers that accommodate the high dimensionality and uncertainty inherent in the problem. The first approach controls the aircraft using immediate observations of the individual aircraft. The second approach allows aircraft to collaborate on a map of the wildfire's state and maintain a time history of locations visited, which are used as inputs to the controller. Simulation results show that both approaches allow the aircraft to accurately track wildfire expansions and outperform an online receding horizon controller. Additional simulations demonstrate that the approach scales with different numbers of aircraft and generalizes to different wildfire shapes.


Semi-supervised Deep Reinforcement Learning in Support of IoT and Smart City Services

arXiv.org Artificial Intelligence

Abstract: Smart services are an important element of smart city and Internet of Things (IoT) ecosystems, where the intelligence behind the services is obtained and improved through sensory data. Providing a large amount of labeled training data is not always feasible; therefore, we need to consider alternative ways that incorporate unlabeled data as well. In recent years, deep reinforcement learning (DRL) has achieved great success in several application domains. It is an applicable method for IoT and smart city scenarios where auto-generated data can be partially labeled by users' feedback for training purposes. In this paper, we propose a semi-supervised deep reinforcement learning model that fits smart city applications as it consumes both labeled and unlabeled data to improve the performance and accuracy of the learning agent. To the best of our knowledge, the proposed model is the first investigation that extends deep reinforcement learning to the semi-supervised paradigm. As a case study of smart city applications, we focus on smart buildings and apply the proposed model to the problem of indoor localization based on BLE signal strength. Indoor localization is the main component of smart city services since people spend significant time in indoor environments. Our model learns the best action policies that lead to a close estimation of the target locations with an improvement of 23% in terms of distance to the target and at least 67% more received rewards compared to the supervised DRL model. The rapid development of Internet of Things (IoT) technologies has motivated researchers and developers to think about new kinds of smart services that extract knowledge from IoT-generated data. The scarcity of labeled data is a main issue for developing such solutions, especially for IoT applications where a large number of sensors participate in generating data without being able to obtain class labels corresponding to the collected data.
This publication was made possible by NPRP grant# [71113-1-199] from the Qatar National Research Fund (a member of Qatar Foundation). The statements made herein are solely the responsibility of the authors.


Robot Representing and Reasoning with Knowledge from Reinforcement Learning

arXiv.org Artificial Intelligence

Reinforcement learning (RL) agents aim at learning by interacting with an environment, and are not designed for representing or reasoning with declarative knowledge. Knowledge representation and reasoning (KRR) paradigms are strong in declarative KRR tasks, but are ill-equipped to learn from such experiences. In this work, we integrate logical-probabilistic KRR with model-based RL, enabling agents to simultaneously reason with declarative knowledge and learn from interaction experiences. The knowledge from humans and RL is unified and used for dynamically computing task-specific planning models under potentially new environments. Experiments were conducted using a mobile robot working on dialog, navigation, and delivery tasks. Results show significant improvements, in comparison to existing model-based RL methods.


World-class PyTorch support on Azure

#artificialintelligence

Today we are excited to strengthen our commitment to supporting PyTorch as a first-class framework on Azure, with exciting new capabilities in our Azure Machine Learning public preview refresh. In addition, our PyTorch support extends deeply across many of our AI Platform services and tooling, which we will highlight below. During the past two years since PyTorch's first release in October 2016, we've witnessed the rapid and organic adoption of the deep learning framework among academia, industry, and the AI community at large. While PyTorch's Python-first integration and imperative style have long made the framework a hit among researchers, the latest PyTorch 1.0 release brings the production-level readiness and scalability needed to make it a true end-to-end deep learning platform, from prototyping to production. Azure Machine Learning (Azure ML) service is a cloud-based service that enables data scientists to carry out end-to-end machine learning workflows, from data preparation and training to model management and deployment.


Discretizing Logged Interaction Data Biases Learning for Decision-Making

arXiv.org Machine Learning

Time series data that are not measured at regular intervals are commonly discretized as a preprocessing step. For example, data about customer arrival times might be simplified by summing the number of arrivals within hourly intervals, which produces a discrete-time time series that is easier to model. In this abstract, we show that discretization introduces a bias that affects models trained for decision-making. We refer to this phenomenon as discretization bias, and show that we can avoid it by using continuous-time models instead.
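The hourly summing described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `hourly_counts` and the arrival timestamps are hypothetical, and the example only shows that discretization discards timing information. It does not reproduce the bias analysis from the abstract.

```python
from collections import Counter


def hourly_counts(arrival_times_hours, num_hours):
    """Count arrivals falling in each hourly interval [h, h + 1)."""
    counts = Counter(
        int(t) for t in arrival_times_hours if 0 <= t < num_hours
    )
    return [counts[h] for h in range(num_hours)]


# Two hypothetical arrival sequences (timestamps in hours) with very
# different spacing between consecutive arrivals.
arrivals_a = [0.10, 0.95, 1.05, 1.10, 3.70]
arrivals_b = [0.50, 0.60, 1.80, 1.90, 3.00]

counts_a = hourly_counts(arrivals_a, 4)
counts_b = hourly_counts(arrivals_b, 4)

# Both sequences collapse to the same discrete-time series, so a model
# trained on the counts cannot tell them apart.
print(counts_a, counts_b)
```

A continuous-time model would instead consume the raw timestamps, which keeps the inter-arrival gaps that the hourly counts throw away.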


Bayes-CPACE: PAC Optimal Exploration in Continuous Space Bayes-Adaptive Markov Decision Processes

arXiv.org Machine Learning

We present the first PAC optimal algorithm for Bayes-Adaptive Markov Decision Processes (BAMDPs) in continuous state and action spaces, to the best of our knowledge. The BAMDP framework elegantly addresses model uncertainty by incorporating Bayesian belief updates into long-term expected return. However, computing an exact optimal Bayesian policy is intractable. Our key insight is to compute a near-optimal value function by covering the continuous state-belief-action space with a finite set of representative samples and exploiting the Lipschitz continuity of the value function. We prove the near-optimality of our algorithm and analyze a number of schemes that boost the algorithm's efficiency. Finally, we empirically validate our approach on a number of discrete and continuous BAMDPs and show that the learned policy has consistently competitive performance against baseline approaches.
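The idea of covering a continuous space with representative samples and exploiting Lipschitz continuity can be sketched with a generic optimistic bound. This is an assumption-laden illustration, not the Bayes-CPACE algorithm itself: the abstract does not give the exact construction, and the function name, the Euclidean metric, the Lipschitz constant, and the sample values below are all hypothetical. In the paper's setting each point would be a state-belief-action tuple embedded in a metric space.

```python
import math


def lipschitz_upper_bound(query, samples, lipschitz_const, v_max):
    """Optimistic value estimate at `query` from (point, value) samples.

    If f is L-Lipschitz, then f(x) <= f(x_i) + L * d(x, x_i) for every
    sample x_i, so the minimum over samples is a valid upper bound.
    The bound is capped at v_max, the largest possible value.
    """
    bound = v_max
    for point, value in samples:
        distance = math.dist(query, point)  # Euclidean distance (assumed metric)
        bound = min(bound, value + lipschitz_const * distance)
    return bound


# Hypothetical samples: (point, known value) pairs.
samples = [((0.0, 0.0), 1.0), ((1.0, 0.0), 3.0)]

near = lipschitz_upper_bound((0.5, 0.0), samples, lipschitz_const=2.0, v_max=10.0)
far = lipschitz_upper_bound((10.0, 0.0), samples, lipschitz_const=2.0, v_max=10.0)
print(near, far)
```

Points close to a sample inherit a tight bound, while points far from every sample fall back to the optimistic cap `v_max`, which is what drives exploration toward uncovered regions in this family of methods.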


Activity Recognition using Hierarchical Hidden Markov Models on Streaming Sensor Data

arXiv.org Machine Learning

Although every challenge in activity recognition matters, the most important is online activity recognition. This study uses an online hierarchical hidden Markov model to detect activities in a stream of sensor data, predicting the current activity in the environment at each sensor event. The activity recognition samples were labeled using statistical features such as activity duration. Tests of the proposed method on two real-world smart-home datasets showed that results on one dataset improved by 4% to reach 59%, while results on the other dataset reached 64.6% using the best methods.
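Updating an activity estimate with every sensor event can be sketched with a standard forward-filtering step. This is a deliberate simplification: it uses a flat hidden Markov model rather than the hierarchical model described in the abstract, and the activities, sensors, and probabilities are hypothetical values chosen for illustration.

```python
def filter_step(belief, transition, emission, observation):
    """One forward-algorithm update: predict, weight by likelihood, normalize."""
    n = len(belief)
    # Predict: propagate the current belief through the transition model.
    predicted = [
        sum(belief[i] * transition[i][j] for i in range(n)) for j in range(n)
    ]
    # Correct: weight each state by how likely it is to emit the observation.
    weighted = [predicted[j] * emission[j][observation] for j in range(n)]
    total = sum(weighted)
    if total == 0.0:
        raise ValueError("observation has zero probability under the model")
    return [w / total for w in weighted]


# Hypothetical model: states are activities, observations are sensor IDs.
activities = ["cooking", "sleeping"]
transition = [[0.9, 0.1], [0.1, 0.9]]  # P(next activity | current activity)
emission = [[0.8, 0.2], [0.3, 0.7]]    # P(sensor | activity); 0=kitchen, 1=bedroom

belief = [0.5, 0.5]
history = []
for sensor_event in [0, 0, 1]:  # streaming sensor events
    belief = filter_step(belief, transition, emission, sensor_event)
    history.append(belief)
    best = activities[belief.index(max(belief))]
    print(sensor_event, best, [round(b, 3) for b in belief])
```

Because each update needs only the previous belief and the newest event, the estimate is available online, without waiting for the activity to finish.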


Compositional planning in Markov decision processes: Temporal abstraction meets generalized logic composition

arXiv.org Artificial Intelligence

Abstract: In hierarchical planning for Markov decision processes (MDPs), temporal abstraction allows planning with macro-actions that take place at different time scales in the form of sequential composition. In this paper, we propose a novel approach to compositional reasoning and hierarchical planning for MDPs under temporal logic constraints. In addition to sequential composition, we introduce a composition of policies based on generalized logic composition: Given sub-policies for sub-tasks and a new task expressed as logic compositions of sub-tasks, a semi-optimal policy, which is optimal in planning with only sub-policies, can be obtained by simply composing sub-policies. Thus, a synthesis algorithm is developed to compute optimal policies efficiently by planning with primitive actions, policies for sub-tasks, and the compositions of sub-policies, for maximizing the probability of satisfying temporal logic specifications. We demonstrate the correctness and efficiency of the proposed method in stochastic planning examples with a single agent and multiple task specifications.

I. INTRODUCTION

Temporal logic is an expressive language to describe desired system properties: safety, reachability, obligation, stability, and liveness [18]. Algorithms for planning and probabilistic verification with temporal logic constraints have been developed, with both centralized [2], [7], [17] and distributed methods [10]. Yet, there are two main barriers to practical applications: 1) The issue of scalability: In temporal logic constrained control problems, it is often necessary to introduce additional memory states for keeping track of the evolution of state variables with respect to these temporal logic constraints. The additional memory states grow exponentially (or double exponentially depending on the class of temporal logic) in the length of a specification [11] and make synthesis computationally expensive.


Facebook's PyTorch plans to light the way to speedy workflows for Machine Learning • DEVCLASS

#artificialintelligence

Facebook's development department has finished a first release candidate for v1 of its PyTorch project – just in time for the first conference dedicated to the Python package. For those not familiar with the tool, its main features are NumPy-like tensor computation with GPU acceleration and a special deep neural network implementation. The preview contains a new set of compiler tools that rewrite PyTorch models at runtime to make them more efficient. The just-in-time compiler should also be able to export models that can run in a C++-only runtime. Optimisation is optional and can be done either by tracing native Python code with torch.jit.trace or by using a Python subset called Torch Script.