Weld, Daniel S.
Re-Active Learning: Active Learning with Relabeling
Lin, Christopher H. (University of Washington) | Mausam (Indian Institute of Technology, Delhi) | Weld, Daniel S. (University of Washington)
Active learning seeks to train the best classifier at the lowest annotation cost by intelligently picking the best examples to label. Traditional algorithms assume there is a single annotator and disregard the possibility of requesting additional independent annotations for a previously labeled example. However, relabeling examples is important, because all annotators make mistakes — especially crowdsourced workers, who have become a common source of training data. This paper seeks to understand the difference in marginal value between decreasing the noise of the training set via relabeling and increasing the size and diversity of the (noisier) training set by labeling new examples. We use the term re-active learning to denote this generalization of active learning. We show how traditional active learning methods perform poorly at re-active learning, present new algorithms designed for this important problem, formally characterize their behavior, and empirically show that our methods effectively make this tradeoff.
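For intuition, the sketch below frames the relabel-versus-label choice in the simplest possible way (this is an illustrative baseline, not one of the paper's algorithms): score each candidate query by label entropy, using the classifier's prediction for unlabeled examples and a Bayesian vote posterior for already-labeled ones. The uniform annotator accuracy and the pure uncertainty criterion are assumptions made for brevity; the paper argues that naive uncertainty-based strategies of exactly this kind can perform poorly at re-active learning.

```python
import numpy as np

def label_posterior(votes, accuracy):
    """P(true label = 1) for a binary example given independent 0/1 votes,
    assuming every annotator answers correctly with probability `accuracy`."""
    ones = sum(votes)
    zeros = len(votes) - ones
    like1 = accuracy ** ones * (1 - accuracy) ** zeros   # likelihood if y = 1
    like0 = accuracy ** zeros * (1 - accuracy) ** ones   # likelihood if y = 0
    return like1 / (like1 + like0)

def binary_entropy(p):
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return float(-(p * np.log(p) + (1 - p) * np.log(1 - p)))

def pick_next_query(unlabeled_probs, labeled_votes, accuracy=0.8):
    """Compare the most uncertain unlabeled example against the most uncertain
    already-labeled one, and query whichever is more uncertain.

    unlabeled_probs: classifier's P(y = 1) for each unlabeled example.
    labeled_votes:   list of vote lists for examples that already have labels.
    """
    new_scores = [binary_entropy(p) for p in unlabeled_probs]
    relabel_scores = [binary_entropy(label_posterior(v, accuracy)) for v in labeled_votes]
    if max(new_scores, default=0.0) >= max(relabel_scores, default=0.0):
        return "label_new", int(np.argmax(new_scores))
    return "relabel", int(np.argmax(relabel_scores))
```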
Parallel Task Routing for Crowdsourcing
Bragg, Jonathan (University of Washington) | Kolobov, Andrey (Microsoft Research) | Mausam (Indian Institute of Technology, Delhi) | Weld, Daniel S. (University of Washington)
An ideal crowdsourcing or citizen-science system would route tasks to the most appropriate workers, but the best assignment is unclear because workers have varying skill, tasks have varying difficulty, and assigning several workers to a single task may significantly improve output quality. This paper defines a space of task routing problems, proves that even the simplest is NP-hard, and develops several approximation algorithms for parallel routing problems. We show that an intuitive class of requesters' utility functions is submodular, which lets us provide iterative methods for dynamically allocating batches of tasks that make near-optimal use of available workers in each round. Experiments with live oDesk workers show that our task routing algorithm uses only 48% of the human labor compared to the commonly used round-robin strategy. Further, we provide versions of our task routing algorithm which enable it to scale to large numbers of workers and questions and to handle workers with variable response times while still providing significant benefit over common baselines.
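The way submodularity is typically exploited for batch allocation can be sketched as a greedy skeleton (an illustration under assumed interfaces, not the paper's algorithm): because greedy selection enjoys constant-factor approximation guarantees for monotone submodular objectives, each round is filled by repeatedly adding the (worker, task) assignment with the largest marginal utility. The `marginal_gain` callback and the per-worker slot limit are placeholders for a model of worker skill and task difficulty.

```python
def greedy_round(workers, tasks, marginal_gain, assignments_per_worker=1):
    """Greedily assign one batch of tasks to the currently available workers.

    marginal_gain(assignment_set, worker, task) -> float is a stand-in for a
    model-based estimate of how much this worker's answer would improve the
    requester's (assumed submodular) utility, given assignments already made.
    """
    chosen = set()
    remaining = {w: assignments_per_worker for w in workers}
    while any(remaining.values()):
        best, best_gain = None, 0.0
        for w, slots in remaining.items():
            if slots == 0:
                continue
            for t in tasks:
                if (w, t) in chosen:
                    continue
                gain = marginal_gain(chosen, w, t)
                if gain > best_gain:
                    best, best_gain = (w, t), gain
        if best is None:
            break  # no remaining assignment adds positive utility
        chosen.add(best)
        remaining[best[0]] -= 1
    return chosen
```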
Towards a Language for Non-Expert Specification of POMDPs for Crowdsourcing
Lin, Christopher H. (University of Washington) | Mausam (University of Washington) | Weld, Daniel S. (University of Washington)
Crowdsourcing requesters are trapped between a rock and a hard place. Typically they specify their crowdsourcing workflows procedurally, but current languages commit them to overly strict and static policies that waste human effort. While optimizing workflows with more sophisticated tools like POMDPs can significantly reduce labor costs, such advanced AI techniques are hard to use and understand. We report on our progress in developing Clowder, a system that provides the user with an adaptive programming language that looks and feels like Lisp, yet abstracts over POMDPs so that non-experts can write POMDPs without knowing anything about them. Such a system frees requesters from needing to resort to suboptimal techniques that use approximate heuristics or hire a planning expert to formally define and solve their problems.
Crowdsourcing Multi-Label Classification for Taxonomy Creation
Bragg, Jonathan (University of Washington) | Mausam (University of Washington) | Weld, Daniel S. (University of Washington)
Recent work has introduced CASCADE, an algorithm for creating a globally-consistent taxonomy by crowdsourcing microwork from many individuals, each of whom may see only a tiny fraction of the data (Chilton et al. 2013). While CASCADE needs only unskilled labor and produces taxonomies whose quality approaches that of human experts, it uses significantly more labor than experts. This paper presents DELUGE, an improved workflow that produces taxonomies with comparable quality using significantly less crowd labor. Specifically, our method for crowdsourcing multi-label classification optimizes CASCADE’s most costly step (categorization) using less than 10% of the labor required by the original approach. DELUGE’s savings come from the use of decision theory and machine learning, which allow it to pose microtasks that aim to maximize information gain.
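To illustrate the decision-theoretic idea in the last sentence, here is a minimal expected-information-gain calculation for a single yes/no categorization question, assuming a known, symmetric worker accuracy; the function names and the `beliefs` interface are hypothetical and not DELUGE's actual model.

```python
import math

def entropy(p):
    """Binary entropy in nats; 0 when the label is already certain."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log(p) + (1 - p) * math.log(1 - p))

def expected_info_gain(p_label, worker_accuracy):
    """Expected reduction in uncertainty about whether an item belongs to a
    category, if one worker with the given (symmetric) accuracy answers."""
    p_yes = p_label * worker_accuracy + (1 - p_label) * (1 - worker_accuracy)
    if p_yes <= 0.0 or p_yes >= 1.0:
        return 0.0  # the answer is already determined; asking gains nothing
    post_yes = p_label * worker_accuracy / p_yes                # Bayes rule, "yes" answer
    post_no = p_label * (1 - worker_accuracy) / (1 - p_yes)     # Bayes rule, "no" answer
    expected_posterior = p_yes * entropy(post_yes) + (1 - p_yes) * entropy(post_no)
    return entropy(p_label) - expected_posterior

def next_question(beliefs, worker_accuracy=0.75):
    """beliefs: {(item, category): P(item belongs to category)}.
    Returns the question with the largest expected information gain."""
    return max(beliefs, key=lambda q: expected_info_gain(beliefs[q], worker_accuracy))
```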
Fine-Grained Entity Recognition
Ling, Xiao (University of Washington) | Weld, Daniel S. (University of Washington)
Entity Recognition (ER) is a key component of relation extraction systems and many other natural-language processing applications. Unfortunately, most ER systems are restricted to produce labels from a small set of entity classes, e.g., person, organization, location or miscellaneous. In order to intelligently understand text and extract a wide range of information, it is useful to more precisely determine the semantic classes of entities mentioned in unstructured text. This paper defines a fine-grained set of 112 tags, formulates the tagging problem as multi-class, multi-label classification, describes an unsupervised method for collecting training data, and presents the FIGER implementation. Experiments show that the system accurately predicts the tags for entities. Moreover, it provides useful information for a relation extraction system, increasing the F1 score by 93%. We make FIGER and its data available as a resource for future work.
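The multi-class, multi-label formulation can be illustrated with a generic one-vs-rest text classifier. The sketch below uses scikit-learn with made-up mentions and tags; it is not the FIGER implementation, its tag set, or its feature representation.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.preprocessing import MultiLabelBinarizer

# Hypothetical training data: entity-mention contexts with multiple tags each.
mentions = ["Seattle is a city in Washington", "UW researchers published a paper"]
tags = [["/location", "/location/city"], ["/organization", "/organization/university"]]

vectorizer = CountVectorizer()
binarizer = MultiLabelBinarizer()
X = vectorizer.fit_transform(mentions)        # bag-of-words features
Y = binarizer.fit_transform(tags)             # binary indicator matrix, one column per tag

# One binary classifier per tag; an entity mention may receive several tags at once.
clf = OneVsRestClassifier(LogisticRegression(max_iter=1000)).fit(X, Y)
predicted = binarizer.inverse_transform(
    clf.predict(vectorizer.transform(["Microsoft is based in Redmond"]))
)
```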
LRTDP Versus UCT for Online Probabilistic Planning
Kolobov, Andrey (University of Washington, Seattle) | Mausam (University of Washington, Seattle) | Weld, Daniel S. (University of Washington, Seattle)
UCT, the premier method for solving games such as Go, is also becoming the dominant algorithm for probabilistic planning. Out of the five solvers at the International Probabilistic Planning Competition (IPPC) 2011, four were based on the UCT algorithm. However, while a UCT-based planner, PROST, won the contest, an LRTDP-based system, Glutton, came in a close second, outperforming other systems derived from UCT. These results raise a question: what are the strengths and weaknesses of LRTDP and UCT in practice? This paper starts answering this question by contrasting the two approaches in the context of finite-horizon MDPs. We demonstrate that in such scenarios, UCT's lack of a sound termination condition is a serious practical disadvantage. In order to handle an MDP with a large finite horizon under a time constraint, UCT forces the user to guess a non-myopic lookahead depth at which it can converge on the encountered states. Mistakes in setting this parameter can greatly hurt UCT's performance. In contrast, LRTDP's convergence criterion allows for an iterative deepening strategy. Using this strategy, LRTDP automatically finds the largest lookahead value feasible under the given time constraint. As a result, LRTDP has better performance and stronger theoretical properties. We present an online version of Glutton, named Gourmand, that illustrates this analysis and outperforms PROST on the set of IPPC-2011 problems.
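The iterative deepening strategy amounts to a simple outer loop over lookahead depths under a per-decision time budget. The sketch below shows that loop under an assumed `solve_to_horizon` interface; it is a schematic, not Gourmand's actual implementation.

```python
import time

def choose_action(solve_to_horizon, state, max_horizon, time_budget):
    """Run the planner to convergence for lookahead 1, 2, ... and act on the
    deepest solution that finished within the time budget.

    solve_to_horizon(state, h, timeout) is assumed to return the greedy action
    for an h-step problem rooted at `state`, or None if it hit the timeout.
    """
    deadline = time.monotonic() + time_budget
    best_action = None
    for h in range(1, max_horizon + 1):
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        action = solve_to_horizon(state, h, remaining)
        if action is None:      # ran out of time before converging at depth h
            break
        best_action = action    # the depth-h solution becomes the new best
    return best_action
```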
Dynamically Switching between Synergistic Workflows for Crowdsourcing
Lin, Christopher H. (University of Washington) | Mausam (University of Washington) | Weld, Daniel S. (University of Washington)
To ensure quality results from unreliable crowdsourced workers, task designers often construct complex workflows and aggregate worker responses from redundant runs. Frequently, they experiment with several alternative workflows to accomplish the task, and eventually deploy the one that achieves the best performance during early trials. Surprisingly, this seemingly natural design paradigm does not achieve the full potential of crowdsourcing. In particular, using a single workflow (even the best) to accomplish a task is suboptimal. We show that alternative workflows can compose synergistically to yield much higher quality output. We formalize the insight with a novel probabilistic graphical model. Based on this model, we design and implement AGENTHUNT, a POMDP-based controller that dynamically switches between these workflows to achieve higher returns on investment. Additionally, we design offline and online methods for learning model parameters. Live experiments on Amazon Mechanical Turk demonstrate the superiority of AGENTHUNT for the task of generating NLP training data, yielding up to 50% error reduction and greater net utility compared to previous methods.
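As a toy illustration of dynamic switching, a controller can compare the expected utility of submitting the current best answer against asking one more question through each alternative workflow. This is a myopic one-step lookahead with assumed per-workflow accuracies, not the POMDP policy AGENTHUNT actually computes.

```python
def choose_action(belief, workflows, submit_utility, question_cost):
    """Pick 'submit' or the name of the workflow whose next answer is worth most.

    belief:         P(true answer = 1), assumed strictly between 0 and 1
    workflows:      {name: accuracy}, assumed per-workflow answer accuracy in (0, 1)
    submit_utility: reward for submitting a correct answer (0 for an incorrect one)
    question_cost:  cost of asking one more question
    """
    def value_of_submitting(b):
        return submit_utility * max(b, 1 - b)

    best_name, best_value = "submit", value_of_submitting(belief)
    for name, acc in workflows.items():
        p_one = belief * acc + (1 - belief) * (1 - acc)      # chance the next answer is "1"
        post_one = belief * acc / p_one                      # posterior after answer "1"
        post_zero = belief * (1 - acc) / (1 - p_one)         # posterior after answer "0"
        expected = (p_one * value_of_submitting(post_one)
                    + (1 - p_one) * value_of_submitting(post_zero)
                    - question_cost)
        if expected > best_value:
            best_name, best_value = name, expected
    return best_name
```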
Reverse Iterative Deepening for Finite-Horizon MDPs with Large Branching Factors
Kolobov, Andrey (University of Washington, Seattle) | Dai, Peng (Google Inc.) | Mausam (University of Washington, Seattle) | Weld, Daniel S. (University of Washington, Seattle)
In contrast to previous competitions, where the problems were goal-based, the 2011 International Probabilistic Planning Competition (IPPC-2011) emphasized finite-horizon reward maximization problems with large branching factors. These MDPs modeled more realistic planning scenarios and presented challenges to the previous state-of-the-art planners (e.g., those from IPPC-2008), which were primarily based on domain determinization — a technique more suited to goal-oriented MDPs with small branching factors. Moreover, large branching factors render the existing implementations of RTDP- and LAO-style algorithms inefficient as well. In this paper we present GLUTTON, our planner at IPPC-2011 that performed well on these challenging MDPs. The main algorithm used by GLUTTON is LR2TDP, an LRTDP-based optimal algorithm for finite-horizon problems centered around the novel idea of reverse iterative deepening. We detail LR2TDP itself as well as a series of optimizations included in GLUTTON that help LR2TDP achieve competitive performance on difficult problems with large branching factors: subsampling the transition function, separating out natural dynamics, caching transition function samples, and others. Experiments show that GLUTTON and PROST, the IPPC-2011 winner, have complementary strengths, with GLUTTON demonstrating superior performance on problems with few high-reward terminal states.
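The reverse-iterative-deepening idea can be seen in its simplest, exhaustive form below: the values for h steps to go are built directly on top of the already-solved values for h-1 steps to go, so no horizon is solved from scratch. LR2TDP applies the same reuse within LRTDP trials over sampled states rather than full sweeps; the `transition` and `reward` interface here is an assumption for illustration only.

```python
def reverse_iterative_deepening(states, actions, transition, reward, horizon):
    """Exhaustive dynamic-programming illustration of reverse iterative deepening.

    transition(s, a) -> list of (next_state, probability)
    reward(s, a)     -> immediate reward
    Returns the horizon-step values and one greedy policy per steps-to-go level.
    """
    values = {s: 0.0 for s in states}   # value with 0 steps to go
    policies = []                       # policies[h-1][s]: best action with h steps to go
    for _h in range(1, horizon + 1):
        new_values, policy = {}, {}
        for s in states:
            best_a, best_q = None, float("-inf")
            for a in actions:
                q = reward(s, a) + sum(p * values[s2] for s2, p in transition(s, a))
                if q > best_q:
                    best_a, best_q = a, q
            new_values[s], policy[s] = best_q, best_a
        values = new_values             # reuse: horizon h builds on horizon h-1
        policies.append(policy)
    return values, policies
```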
Human Intelligence Needs Artificial Intelligence
Weld, Daniel S. (University of Washington) | Mausam (University of Washington) | Dai, Peng (University of Washington)
Crowdsourcing platforms, such as Amazon Mechanical Turk, have enabled the construction of scalable applications for tasks ranging from product categorization and photo tagging to audio transcription and translation. These vertical applications are typically realized with complex, self-managing workflows that guarantee quality results. But constructing such workflows is challenging, with a huge number of alternative decisions for the designer to consider. We argue the thesis that “Artificial intelligence methods can greatly simplify the process of creating and managing complex crowdsourced workflows.” We present the design of CLOWDER, which uses machine learning to continually refine models of worker performance and task difficulty. Using these models, CLOWDER uses decision-theoretic optimization to 1) choose between alternative workflows, 2) optimize parameters for a workflow, 3) create personalized interfaces for individual workers, and 4) dynamically control the workflow. Preliminary experience suggests that these optimized workflows are significantly more economical (and return higher quality output) than those generated by humans.
Heuristic Search for Generalized Stochastic Shortest Path MDPs
Kolobov, Andrey (University of Washington, Seattle) | Mausam (University of Washington, Seattle) | Weld, Daniel S. (University of Washington, Seattle) | Geffner, Hector (ICREA and Universitat Pompeu Fabra)
Research in efficient methods for solving infinite-horizon MDPs has so far concentrated primarily on discounted MDPs and the more general stochastic shortest path problems (SSPs). These are MDPs with 1) an optimal value function V* that is the unique solution of the Bellman equation and 2) optimal policies that are the greedy policies w.r.t. V*. This paper's main contribution is the description of a new class of MDPs that have well-defined optimal solutions which do not comply with either 1) or 2) above. We call our new class Generalized Stochastic Shortest Path (GSSP) problems. GSSP allows a more general reward structure than SSP and subsumes several established MDP types, including SSP, positive-bounded, negative, and discounted-reward models. While existing efficient heuristic search algorithms like LAO* and LRTDP are not guaranteed to converge to the optimal value function for GSSPs, we present a new heuristic-search-based family of algorithms, FRET (Find, Revise, Eliminate Traps). A preliminary empirical evaluation shows that FRET solves GSSPs much more efficiently than Value Iteration.
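For reference, the Value Iteration baseline mentioned above can be written for a standard (min-cost) SSP in a few lines. This is the generic textbook procedure, not the paper's FRET implementation, and it assumes a proper policy exists so that V* is well defined; the `transition`, `cost`, and `actions` callables are placeholders.

```python
def value_iteration_ssp(states, actions, transition, cost, goal, epsilon=1e-6):
    """Plain value iteration for a stochastic shortest path MDP.

    V*(s) is the unique fixed point of the Bellman equation
    V(s) = min_a [ cost(s, a) + sum_s' P(s' | s, a) * V(s') ], with V(goal) = 0.

    transition(s, a) -> list of (next_state, probability)
    cost(s, a)       -> nonnegative immediate cost
    actions(s)       -> iterable of actions applicable in s
    goal             -> set of absorbing goal states
    """
    values = {s: 0.0 for s in states}
    while True:
        residual = 0.0
        for s in states:
            if s in goal:
                continue  # goal values stay at 0
            best = min(
                cost(s, a) + sum(p * values[s2] for s2, p in transition(s, a))
                for a in actions(s)
            )
            residual = max(residual, abs(best - values[s]))
            values[s] = best
        if residual < epsilon:
            return values
```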