Numerical Relation Extraction with Minimal Supervision
Madaan, Aman (Visa Inc.) | Mittal, Ashish (IBM Research) | Mausam (Indian Institute of Technology Delhi) | Ramakrishnan, Ganesh (Indian Institute of Technology Bombay) | Sarawagi, Sunita (Indian Institute of Technology Bombay)
We study a novel task of numerical relation extraction with the goal of extracting relations in which one of the arguments is a number or a quantity (e.g., atomic_number(Aluminium, 13), inflation_rate(India, 10.9%)). This task presents peculiar challenges not found in standard IE, such as the difficulty of matching numbers in distant supervision and the importance of units. We design two extraction systems that require minimal human supervision per relation: (1) NumberRule, a rule-based extractor, and (2) NumberTron, a probabilistic graphical model. We find that both systems dramatically outperform MultiR, a state-of-the-art non-numerical IE model, obtaining up to a 25-point F-score improvement.
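To make the flavor of rule-based numerical extraction concrete, here is a minimal sketch of the idea (not NumberRule's actual rules): keyword patterns pair an entity with a nearby number, a unit check filters candidates, and a tolerance-based comparison illustrates why matching numbers in distant supervision is hard, since text rarely reports exactly the value stored in the knowledge base. The relation table, regex, and tolerance below are invented for this example.

```python
import re

# Hypothetical keyword/unit table for two relations; illustrative only.
RELATIONS = {
    "inflation_rate": {"keywords": {"inflation"}, "unit": "%"},
    "atomic_number": {"keywords": {"atomic number"}, "unit": None},
}

NUMBER_RE = re.compile(r"(\d+(?:\.\d+)?)\s*(%)?")

def extract(sentence, entity):
    """Return (relation, entity, value) triples found by simple keyword rules."""
    triples = []
    lowered = sentence.lower()
    if entity.lower() not in lowered:
        return triples
    for rel, spec in RELATIONS.items():
        if any(kw in lowered for kw in spec["keywords"]):
            for value, unit in NUMBER_RE.findall(sentence):
                if (unit or None) == spec["unit"]:
                    triples.append((rel, entity, float(value)))
    return triples

def matches_kb(value, kb_value, tol=0.05):
    """Unit-aware approximate match, as distant supervision needs when
    aligning an extracted number with a knowledge-base value."""
    if kb_value == 0:
        return value == 0
    return abs(value - kb_value) / abs(kb_value) <= tol

print(extract("India's inflation rose to 10.9% last year.", "India"))
print(matches_kb(10.9, 10.88))  # text says 10.9, KB stores 10.88
```

The tolerance check matters because an exact-match policy would discard most correct distant-supervision alignments for real-valued relations.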
To Re(label), or Not To Re(label)
Lin, Christopher H. (University of Washington) | Mausam (Indian Institute of Technology Delhi) | Weld, Daniel S. (University of Washington)
One of the most popular uses of crowdsourcing is to provide training data for supervised machine learning algorithms. Since human annotators often make errors, requesters commonly ask multiple workers to label each example. But is this strategy always the most cost-effective use of crowdsourced workers? We argue "No": often classifiers can achieve higher accuracy when trained with noisy "unilabeled" data. However, in some cases relabeling is extremely important. We discuss three factors that may make relabeling an effective strategy: classifier expressiveness, worker accuracy, and budget.
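The budget trade-off at the heart of this question can be made concrete with a back-of-the-envelope calculation (this is not the paper's model): a fixed budget buys either many singly-labeled examples or a third as many majority-of-3 labels, and the worker-accuracy values below are assumed.

```python
# With budget B and worker accuracy p, compare B singly-labeled examples
# against B/3 examples labeled by majority vote of 3 workers.
def majority_of_3(p):
    # Probability that at least 2 of 3 independent workers are correct.
    return p**3 + 3 * p**2 * (1 - p)

def compare(budget, p):
    single = (budget, p)                       # (num examples, label accuracy)
    relabeled = (budget // 3, majority_of_3(p))
    return single, relabeled

for p in (0.6, 0.75, 0.9):
    (n1, a1), (n3, a3) = compare(900, p)
    print(f"p={p:.2f}: 1x -> {n1} examples at {a1:.3f}; "
          f"3x -> {n3} examples at {a3:.3f}")
```

Whether the larger, noisier training set wins depends on exactly the factors the abstract names, particularly how expressive the classifier is.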
LRTDP Versus UCT for Online Probabilistic Planning
Kolobov, Andrey (University of Washington, Seattle) | Mausam (University of Washington, Seattle) | Weld, Daniel S. (University of Washington, Seattle)
UCT, the premier method for solving games such as Go, is also becoming the dominant algorithm for probabilistic planning. Out of the five solvers at the International Probabilistic Planning Competition (IPPC) 2011, four were based on the UCT algorithm. However, while a UCT-based planner, PROST, won the contest, an LRTDP-based system, Glutton, came in a close second, outperforming other systems derived from UCT. These results raise a question: what are the strengths and weaknesses of LRTDP and UCT in practice? This paper starts answering this question by contrasting the two approaches in the context of finite-horizon MDPs. We demonstrate that in such scenarios, UCT's lack of a sound termination condition is a serious practical disadvantage. In order to handle an MDP with a large finite horizon under a time constraint, UCT forces an expert to guess a non-myopic lookahead value for which it should be able to converge on the encountered states. Mistakes in setting this parameter can greatly hurt UCT's performance. In contrast, LRTDP's convergence criterion allows for an iterative deepening strategy. Using this strategy, LRTDP automatically finds the largest lookahead value feasible under the given time constraint. As a result, LRTDP has better performance and stronger theoretical properties. We present an online version of Glutton, named Gourmand, that illustrates this analysis and outperforms PROST on the set of IPPC-2011 problems.
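The iterative-deepening strategy described above can be outlined as follows. This is a schematic sketch, not Gourmand's actual code; solve_with_lookahead is a hypothetical stand-in for an LRTDP run that reports whether it converged, which is precisely the sound termination test that UCT lacks.

```python
import time

def choose_action(state, solve_with_lookahead, max_horizon, time_budget):
    """Pick an action using the deepest lookahead that converges in time.

    solve_with_lookahead(state, horizon, seconds) is assumed to run
    LRTDP to the given horizon and return (action, converged).
    """
    deadline = time.monotonic() + time_budget
    best_action = None
    for horizon in range(1, max_horizon + 1):
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        action, converged = solve_with_lookahead(state, horizon, remaining)
        if not converged:
            break                  # out of time; keep the previous horizon's action
        best_action = action       # deepest lookahead completed so far
    return best_action
```

Because convergence at each horizon is detected rather than guessed, no expert-tuned lookahead parameter is needed, which is the practical advantage the paper attributes to LRTDP.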
Human Intelligence Needs Artificial Intelligence
Weld, Daniel S. (University of Washington) | Mausam (University of Washington) | Dai, Peng (University of Washington)
Crowdsourcing platforms, such as Amazon Mechanical Turk, have enabled the construction of scalable applications for tasks ranging from product categorization and photo tagging to audio transcription and translation. These vertical applications are typically realized with complex, self-managing workflows that guarantee quality results. But constructing such workflows is challenging, with a huge number of alternative decisions for the designer to consider. We argue the thesis that "artificial intelligence methods can greatly simplify the process of creating and managing complex crowdsourced workflows." We present the design of CLOWDER, which uses machine learning to continually refine models of worker performance and task difficulty. Using these models, CLOWDER applies decision-theoretic optimization to 1) choose between alternative workflows, 2) optimize parameters for a workflow, 3) create personalized interfaces for individual workers, and 4) dynamically control the workflow. Preliminary experience suggests that these optimized workflows are significantly more economical (and return higher-quality output) than those generated by humans.
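The first of these decisions, choosing between alternative workflows, can be illustrated with a toy expected-utility comparison. The Workflow fields and numbers below are invented for this sketch; in the abstract's setting, the quality and cost estimates would come from the learned worker and task models.

```python
from dataclasses import dataclass

@dataclass
class Workflow:
    name: str
    expected_quality: float   # hypothetically estimated from learned models
    expected_cost: float      # dollars

def best_workflow(workflows, value_per_quality):
    """Choose the workflow maximizing expected utility = value - cost."""
    return max(workflows,
               key=lambda w: value_per_quality * w.expected_quality
                             - w.expected_cost)

candidates = [
    Workflow("single-pass", expected_quality=0.80, expected_cost=0.05),
    Workflow("iterative-improve", expected_quality=0.92, expected_cost=0.12),
]
print(best_workflow(candidates, value_per_quality=1.0).name)
```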
Artificial Intelligence for Artificial Artificial Intelligence
Dai, Peng (University of Washington) | Mausam (University of Washington) | Weld, Daniel S. (University of Washington)
Crowdsourcing platforms such as Amazon Mechanical Turk have become popular for a wide variety of human intelligence tasks; however, quality control continues to be a significant challenge. Recent work proposed TurKontrol, a theoretical model based on POMDPs for optimizing iterative, crowdsourced workflows, but it neither described how to learn the model parameters nor showed the model's effectiveness in a real crowdsourced setting. Learning is challenging due to the scale of the model and noisy data: there are hundreds of thousands of workers with high-variance abilities. This paper presents an end-to-end system that first learns TurKontrol's POMDP parameters from real Mechanical Turk data, and then applies the model to dynamically optimize live tasks. We validate the model and use it to control a successive-improvement process on Mechanical Turk. By modeling worker accuracy and voting patterns, our system produces significantly superior artifacts compared to those generated through nonadaptive workflows using the same amount of money.
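One ingredient of such a POMDP controller, aggregating noisy worker votes into a belief about artifact quality, can be sketched with a simple Bayesian update. The worker accuracies here are assumed values for illustration, not the parameters the system learns from Mechanical Turk data.

```python
def update_belief(prior, votes, accuracies):
    """Belief that the improved artifact is better than the original.

    votes[i] is True if worker i preferred the new artifact;
    accuracies[i] is the assumed probability that worker i votes correctly.
    """
    p_better, p_worse = prior, 1.0 - prior
    for vote, acc in zip(votes, accuracies):
        p_better *= acc if vote else (1.0 - acc)
        p_worse *= (1.0 - acc) if vote else acc
    return p_better / (p_better + p_worse)

belief = update_belief(prior=0.5,
                       votes=[True, True, False],
                       accuracies=[0.8, 0.7, 0.6])
print(f"P(new artifact is better) = {belief:.3f}")
```

A controller along these lines would then weigh the value of acting on this belief against the cost of soliciting another vote or another round of improvement.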
Commonsense from the Web: Relation Properties
Lin, Thomas (University of Washington) | Mausam (University of Washington) | Etzioni, Oren (University of Washington)
When general-purpose software agents fail, it is often because they are brittle and need more background commonsense knowledge. In this paper we present relation properties as a valuable type of commonsense knowledge that can be automatically inferred at scale by reading the Web. People base many commonsense inferences on their knowledge of relation properties such as functionality, transitivity, and others. For example, everyone knows that bornIn(Person, Year) satisfies the functionality property, meaning that each person is born in exactly one year. Thus inferences like "Obama was born in 1961, so he was not born in 2008" are obvious even to children, yet unknown to computers. We demonstrate scalable heuristics for learning relation functionality from noisy Web text that outperform existing approaches to detecting functionality. The heuristics we use address Web NLP challenges that are also common to learning other relation properties, and can be easily transferred. Each relation property we learn for a Web-scale set of relations will enable computers to solve real tasks, and the data from learning many such properties will be a useful addition to general commonsense knowledge bases.
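A rough heuristic in the spirit of this idea (not the paper's actual scoring function, which must contend with far noisier Web extractions) scores a relation as functional when, for most subjects, the extracted object values concentrate on a single value. The toy triples below are invented for illustration.

```python
from collections import Counter, defaultdict

def functionality_score(triples):
    """triples: iterable of (subject, object) extractions for one relation.

    Returns the average, over subjects, of the fraction of extractions
    that agree with that subject's most common object value.
    """
    by_subject = defaultdict(Counter)
    for subj, obj in triples:
        by_subject[subj][obj] += 1
    fractions = [counts.most_common(1)[0][1] / sum(counts.values())
                 for counts in by_subject.values()]
    return sum(fractions) / len(fractions)

born_in = [("Obama", "1961"), ("Obama", "1961"), ("Obama", "2008"),
           ("Einstein", "1879"), ("Einstein", "1879")]
print(f"bornIn functionality ~ {functionality_score(born_in):.2f}")
```

A score near 1.0 suggests a functional relation like bornIn, while a relation such as visitedCountry would spread each subject's extractions over many values and score much lower.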