Learning to Avoid Collisions

AAAI Conferences

Members of a multi-robot team, operating within close quarters, need to avoid crashing into each other. Simple collision avoidance methods can be used to prevent such collisions, typically by computing the distance to other robots and stopping, perhaps moving away, when this distance falls below a certain threshold. While this approach may avoid disaster, it may also reduce the team's efficiency if robots halt for a long time to let others pass by or if they travel further to move around one another. This paper reports on experiments where a human operator, through a graphical user interface, watches robots perform an exploration task. The operator can manually suspend robots' movements before they crash into each other, and then resume their movements when their paths are clear. Experiment logs record the robots' states when they are paused and resumed. A behavior pattern for collision avoidance is learned, by classifying the states of the robots' environment when the human operator issues "wait" and "resume" commands. Preliminary results indicate that it is possible to learn a classifier which models these behavior patterns, and that different human operators consider different factors when making decisions about stopping and starting robots.
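The simple threshold rule described in this abstract (compute the distance to other robots and stop when it falls below a threshold) can be sketched roughly as follows. This is an illustration of that baseline only, not the learned classifier from the paper. The function name `robots_to_halt`, the 2D positions, and the threshold value are all assumptions made for the example.

```python
import math
from itertools import combinations


def robots_to_halt(positions, threshold):
    """Return the ids of robots closer than `threshold` to any other robot.

    `positions` maps a (hypothetical) robot id to its (x, y) coordinates.
    """
    halted = set()
    # Check every unordered pair of robots once.
    for (id_a, pos_a), (id_b, pos_b) in combinations(positions.items(), 2):
        if math.dist(pos_a, pos_b) < threshold:
            halted.update((id_a, id_b))
    return halted


# Assumed example layout: r1 and r2 are 0.5 units apart, r3 is far away.
positions = {"r1": (0.0, 0.0), "r2": (0.3, 0.4), "r3": (5.0, 5.0)}
print(sorted(robots_to_halt(positions, threshold=1.0)))
```

A rule like this halts both robots in a close pair, which is the kind of efficiency loss the abstract points to: neither robot moves until the distance increases again.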


Between Instruction and Reward: Human-Prompted Switching

AAAI Conferences

Intelligent systems promise to amplify, augment, and extend innate human abilities. A principal example is that of assistive rehabilitation robots: artificial intelligence and machine learning enable new electromechanical systems that restore biological functions lost through injury or illness. In order for an intelligent machine to assist a human user, it must be possible for a human to communicate their intentions and preferences to their non-human counterpart. While there are a number of techniques that a human can use to direct a machine learning system, most research to date has focused on the contrasting strategies of instruction and reward. The primary contribution of our work is to demonstrate that the middle ground between instruction and reward is a fertile space for research and immediate technological progress. To support this idea, we introduce the setting of human-prompted switching, and illustrate the successful combination of switching with interactive learning using a concrete real-world example: human control of a multi-joint robot arm. We believe techniques that fall between the domains of instruction and reward are complementary to existing approaches, and will open up new lines of rapid progress for interactive human training of machine learning systems.


Training Wheels for the Robot: Learning from Demonstration Using Simulation

AAAI Conferences

Learning from demonstration (LfD) is a promising technique for teaching autonomous systems based on demonstrations from people who may have little to no experience with robots. An important aspect of LfD is the communication method used to transfer knowledge from an instructor to a robot. The communication method affects the complexity of the demonstration process for instructors, the range of tasks a robot can learn, and the learning algorithm itself. We have designed a graphical interface and an instructional language to provide an intuitive teaching system. The drawback of simplifying the teaching interface is that the resulting demonstration data are less structured, adding complexity to the learning process. This additional complexity is handled through the combination of a minimal set of predefined behaviors and a task representation capable of learning probabilistic policies over a set of behaviors. The predefined behaviors consist of finite actions a robot can perform, which act as building blocks for more complex tasks.


Active Imitation Learning via Reduction to I.I.D. Active Learning

AAAI Conferences

In standard passive imitation learning, the goal is to learn an expert’s policy by passively observing full execution trajectories of it. Unfortunately, generating such trajectories can require substantial expert effort and be impractical in some cases. In this paper, we consider Active Imitation Learning (AIL) with the goal of reducing this effort by querying the expert about the desired action at individual states, which are selected based on answers to past queries and the learner’s interactions with an environment simulator. Our new approach is based on reducing AIL to i.i.d. active learning, which can leverage progress in the i.i.d. setting. We introduce and analyze reductions for both non-stationary and stationary policies, showing that the label complexity (number of queries) of AIL can be substantially less than passive learning. We also introduce a practical algorithm inspired by the reductions, which is shown to be highly effective in four test domains compared to a number of alternatives.


Novel Interaction Strategies for Learning from Teleoperation

AAAI Conferences

The field of robot Learning from Demonstration (LfD) makes use of several input modalities for demonstrations (teleoperation, kinesthetic teaching, marker- and vision-based motion tracking). In this paper we present two experiments aimed at identifying and overcoming challenges associated with using teleoperation as an input modality for LfD. Our first experiment compares kinesthetic teaching and teleoperation and highlights some inherent problems associated with teleoperation; specifically, uncomfortable user interactions and inaccurate robot demonstrations. Our second experiment is focused on overcoming these problems and designing the teleoperation interaction to be more suitable for LfD. In previous work we proposed a novel demonstration strategy using the concept of keyframes, where demonstrations are in the form of a discrete set of robot configurations. Keyframes can be naturally combined with continuous trajectory demonstrations to generate a hybrid strategy. We perform user studies to evaluate each of these demonstration strategies individually and show that keyframes are intuitive to the users and are particularly useful in providing noise-free demonstrations. We find that users prefer the hybrid strategy for demonstrating tasks to a robot by teleoperation.


Learned Partial Automation for Shared Control in Tele-Robotic Manipulation

AAAI Conferences

In challenging applications such as surgery or underwater maintenance, tele-operated robots require manipulations that are difficult to perform on the master controllers because of restricted access and limited perception. In this paper, we investigate an assistance approach for tele-robotic manipulation, in which the robot automates several degrees of freedom (DOF) of the tools, such as their orientation. This automation requires understanding the operator's intent, so as not to impede the natural manipulation of the remaining DOF. Our system is therefore based on the observation that, in the aforementioned applications, the manipulation tasks often have a structure that can be learned from daily use of the robot. We propose an approach that uses the typical motion performed by the operator during a given task, learned from demonstration, to automate the rotation of the manipulator in new instances of this task. The operator keeps control of the robot by manipulating the tool translation and can recover full control if needed. The learned motion model is based on Gaussian Mixture Regression and combined with a 3D reconstruction of the environment to address variations in the task. We demonstrate our assistance approach using a da Vinci robot on a task consisting of moving a ring along a wire possessing a complex 3D shape.



Generalized Weighted Model Counting: An Efficient Monte-Carlo Meta-Algorithm

AAAI Conferences

In this paper, we focus on computing the prices of securities represented by logical formulas in combinatorial prediction markets when the price function is represented by a Bayesian network. This problem turns out to be a natural extension of the weighted model counting (WMC) problem (Sang, Beame, and Kautz 2005), which we call the generalized weighted model counting (GWMC) problem. In GWMC, we are given a logical formula F and a polynomial-time computable weight function. We are asked to compute the total weight of the valuations that satisfy F. Based on importance sampling, we propose a Monte-Carlo meta-algorithm that has a good theoretical guarantee for formulas in disjunctive normal form (DNF). The meta-algorithm queries an oracle algorithm that computes marginal probabilities in Bayesian networks, and has the following theoretical guarantee. When the weight function can be approximately represented by a Bayesian network for which the oracle algorithm runs in polynomial time, our meta-algorithm becomes a fully polynomial-time randomized approximation scheme (FPRAS).
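To make the idea of importance sampling for DNF formulas concrete, here is a minimal sketch of the classic Karp-Luby style estimator. It is not the authors' meta-algorithm: it assumes the weight function is a product of independent per-variable probabilities, so clause probabilities can be computed directly instead of by querying a Bayesian network oracle. The names `clause_weight`, `satisfies`, and `estimate_dnf_weight`, and the clause encoding as a dict from variable index to required truth value, are choices made for this example.

```python
import random


def clause_weight(clause, p):
    """Probability that a conjunction of literals holds when variables are independent."""
    w = 1.0
    for var, value in clause.items():
        w *= p[var] if value else 1.0 - p[var]
    return w


def satisfies(x, clause):
    """True if valuation x (list of bools) satisfies every literal in the clause."""
    return all(x[var] == value for var, value in clause.items())


def estimate_dnf_weight(clauses, p, n_samples, rng):
    """Estimate the total weight of valuations satisfying a DNF formula."""
    weights = [clause_weight(c, p) for c in clauses]
    total = sum(weights)
    if total == 0.0:
        return 0.0
    acc = 0.0
    for _ in range(n_samples):
        # Pick a clause in proportion to its weight.
        clause = rng.choices(clauses, weights=weights, k=1)[0]
        # Sample a valuation conditioned on that clause being true.
        x = [rng.random() < p_i for p_i in p]
        for var, value in clause.items():
            x[var] = value
        # Correct for valuations covered by several clauses.
        covered = sum(satisfies(x, c) for c in clauses)
        acc += 1.0 / covered
    return total * acc / n_samples


# Assumed toy formula: F = x0 OR x1, each variable true with probability 0.5.
# By inclusion-exclusion the exact weight is 0.5 + 0.5 - 0.25 = 0.75.
clauses = [{0: True}, {1: True}]
p = [0.5, 0.5]
estimate = estimate_dnf_weight(clauses, p, n_samples=20000, rng=random.Random(0))
print(estimate)
```

Sampling a clause first and then a valuation that satisfies it means every sample lands on a satisfying valuation, which is why the estimator stays accurate even when the formula's total weight is very small.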


Cluster-Weighted Aggregation

AAAI Conferences

We are interested in aggregating forecasts of multinomial problems elicited from multiple experts. A common approach is to assign a weight to each expert, then form a weighted sum over their forecasts. Theoretical studies suggest that an important factor in such weighting is the diversity among experts. However, diversity is intrinsically a pairwise measure over experts, and does not lend itself naturally to a single weight that can be applied to an expert’s forecast in a weighted average. We suggest a way to take advantage of such pairwise measures in aggregating forecasts.
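The common baseline mentioned here, one weight per expert followed by a weighted sum over their forecasts, might look like the sketch below. It shows only that baseline, not the pairwise, cluster-based method the abstract proposes. The function name `aggregate_forecasts` and the example numbers are invented for illustration.

```python
def aggregate_forecasts(forecasts, weights):
    """Weighted average of multinomial forecasts.

    `forecasts` is a list of probability vectors (one per expert, same length);
    `weights` holds one non-negative weight per expert. Weights are normalized
    so the result is again a probability vector.
    """
    total = sum(weights)
    n_outcomes = len(forecasts[0])
    return [
        sum(w * f[k] for w, f in zip(weights, forecasts)) / total
        for k in range(n_outcomes)
    ]


# Assumed example: three experts forecasting a three-outcome event.
forecasts = [
    [0.7, 0.2, 0.1],
    [0.5, 0.3, 0.2],
    [0.2, 0.5, 0.3],
]
weights = [2.0, 1.0, 1.0]  # hypothetical per-expert weights
combined = aggregate_forecasts(forecasts, weights)
print(combined)
```

Because each expert receives a single scalar weight, this form has no direct place for a pairwise quantity such as diversity between two experts, which is the limitation the abstract highlights.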


The Good Judgment Project: A Large Scale Test of Different Methods of Combining Expert Predictions

AAAI Conferences

Many methods have been proposed for making use of multiple experts to predict uncertain events such as election outcomes, ranging from simple averaging of individual predictions to complex collaborative structures such as prediction markets or structured group decision making processes. We used a panel of more than 2,000 forecasters to systematically compare the performance of four different collaborative processes on a battery of political prediction problems. We found that teams and prediction markets systematically outperformed averages of individual forecasters, that training forecasters helps, and that the exact form of how predictions are combined has a large effect on overall prediction accuracy.