Causal Inference on Time Series using Structural Equation Models

arXiv.org Machine Learning

Causal inference uses observations to infer the causal structure of the data generating system. We study a class of functional models that we call Time Series Models with Independent Noise (TiMINo). These models require independent residual time series, whereas traditional methods like Granger causality exploit the variance of residuals. There are two main contributions: (1) Theoretical: By restricting the model class (e.g. to additive noise) we can provide a more general identifiability result than existing ones. The result covers lagged and instantaneous effects that can be nonlinear and need not be faithful, as well as non-instantaneous feedbacks between the time series. (2) Practical: If there are no feedback loops between time series, we propose an algorithm based on non-linear independence tests of time series. When the data are causally insufficient, or the data generating process does not satisfy the model assumptions, this algorithm may still give partial results, but mostly avoids incorrect answers. An extension to (non-instantaneous) feedbacks is possible, but not discussed. The algorithm outperforms existing methods on artificial and real data. Code can be provided upon request.
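
A minimal sketch of the fit-then-test-independence idea behind TiMINo, assuming an additive-noise model with plain linear autoregression and a crude correlation check in place of the nonlinear regression and proper independence test (e.g. HSIC) used in the paper; the function names, lag order, and toy data are illustrative only.

```python
# Illustrative simplification of a TiMINo-style check for "x causes y" (no feedback).
import numpy as np
from numpy.linalg import lstsq
from scipy.stats import pearsonr

def lagged(x, p):
    """Stack lags 1..p of a 1-D series into a design matrix."""
    n = len(x)
    return np.column_stack([x[p - k:n - k] for k in range(1, p + 1)])

def residuals_given_past(y, x, p=2):
    """Regress y_t on its own past and the past of x; return residuals."""
    X = np.column_stack([lagged(y, p), lagged(x, p), np.ones(len(y) - p)])
    beta, *_ = lstsq(X, y[p:], rcond=None)
    return y[p:] - X @ beta

def timino_score(x, y, p=2):
    """Dependence between residuals and the putative cause; smaller values
    are more consistent with the direction x -> y under the model."""
    res = residuals_given_past(y, x, p)
    r, _ = pearsonr(res, x[p:])
    return abs(r)

# Toy usage: x drives y with a one-step lag.
rng = np.random.default_rng(0)
x = rng.standard_normal(500).cumsum() * 0.1
y = np.roll(x, 1) * 0.8 + rng.standard_normal(500) * 0.3
print("x->y:", timino_score(x, y), " y->x:", timino_score(y, x))
```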


Fine-Grained Entity Recognition

AAAI Conferences

Entity Recognition (ER) is a key component of relation extraction systems and many other natural-language processing applications. Unfortunately, most ER systems are restricted to producing labels from a small set of entity classes, e.g., person, organization, location or miscellaneous. In order to intelligently understand text and extract a wide range of information, it is useful to more precisely determine the semantic classes of entities mentioned in unstructured text. This paper defines a fine-grained set of 112 tags, formulates the tagging problem as multi-class, multi-label classification, describes an unsupervised method for collecting training data, and presents the FIGER implementation. Experiments show that the system accurately predicts the tags for entities. Moreover, it provides useful information for a relation extraction system, increasing the F1 score by 93%. We make FIGER and its data available as a resource for future work.
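
A toy sketch of the multi-class, multi-label formulation, assuming a bag-of-words feature representation and a one-vs-rest logistic regression classifier; the tag set, features, and classifier are placeholders, not FIGER's actual pipeline or its 112-tag inventory.

```python
# Fine-grained entity typing framed as multi-class, multi-label classification.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.preprocessing import MultiLabelBinarizer

mentions = [
    "Barack Obama gave a speech in Chicago",
    "Pfizer announced a new drug trial",
    "The Amazon river flooded several towns",
]
tags = [["person", "politician"], ["organization", "company"], ["location"]]

mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(tags)                       # binary indicator matrix of tags
vec = CountVectorizer()
X = vec.fit_transform(mentions)                   # toy bag-of-words features

clf = OneVsRestClassifier(LogisticRegression(max_iter=1000)).fit(X, Y)
pred = clf.predict(X)
print(mlb.inverse_transform(pred))                # predicted tag sets per mention
```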


Towards Social Norm Design for Crowdsourcing Markets

AAAI Conferences

Crowdsourcing markets, such as Amazon Mechanical Turk, provide a platform for matching prospective workers around the world with tasks. However, they are often plagued by workers who attempt to exert as little effort as possible, and requesters who deny workers payment for their labor. For crowdsourcing markets to succeed, it is essential to discourage such behavior. With this in mind, we propose a framework for the design and analysis of incentive mechanisms based on social norms, which consist of a set of rules that participants are expected to follow, and a mechanism for updating participants’ public reputations based on whether or not they do. We start by considering the most basic version of our model, which contains only homogeneous participants and randomly matches workers with tasks. The optimal social norm in this setting turns out to be a simple, easily comprehensible incentive mechanism in which market participants are encouraged to play a tit-for-tat-like strategy. This simple mechanism is optimal even when the set of market participants changes dynamically over time, or when some fraction of the participants may be irrational. In addition to the basic model, we demonstrate how this framework can be applied to situations in which there are heterogeneous users by giving several illustrative examples. This work is a first step towards a complete theory of incentive design for crowdsourcing systems. We hope to build upon this framework and explore more interesting and practical aspects of real online labor markets in our future work.
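
A small simulation sketch of a tit-for-tat-like social norm with binary public reputations, under an assumed payoff structure and an assumed update rule (a participant keeps a good reputation only if it follows the norm); this illustrates the flavor of such mechanisms, not the paper's exact model or its optimality analysis.

```python
import random

GOOD, BAD = 1, 0

def prescribed_action(partner_rep):
    """Norm: cooperate (exert effort / pay promptly) only with partners in good standing."""
    return "cooperate" if partner_rep == GOOD else "defect"

def update_reputation(action, partner_rep):
    """A participant keeps a good reputation iff it followed the norm this round."""
    return GOOD if action == prescribed_action(partner_rep) else BAD

def simulate(rounds=2000, n=50, defector_rate=0.2, seed=0):
    rng = random.Random(seed)
    rep = [GOOD] * n
    is_defector = [rng.random() < defector_rate for _ in range(n)]
    payoff = [0.0] * n
    for _ in range(rounds):
        w, r = rng.sample(range(n), 2)          # match a worker with a requester
        for a, b in ((w, r), (r, w)):           # both sides choose an action
            act = "defect" if is_defector[a] else prescribed_action(rep[b])
            if act == "cooperate":
                payoff[b] += 1.0                # effort / payment benefits the partner
            rep[a] = update_reputation(act, rep[b])
    followers = [p for p, d in zip(payoff, is_defector) if not d]
    deviators = [p for p, d in zip(payoff, is_defector) if d]
    mean = lambda xs: sum(xs) / max(1, len(xs))
    return {"norm_followers_avg": mean(followers), "deviators_avg": mean(deviators)}

print(simulate())   # deviators lose access to cooperation and earn less on average
```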


Sequential Decision Making with Rank Dependent Utility: A Minimax Regret Approach

AAAI Conferences

This paper is devoted to sequential decision making with Rank Dependent expected Utility (RDU). This decision criterion generalizes Expected Utility and makes it possible to model a wider range of observed (rational) behaviors. In such a sequential decision setting, two conflicting objectives can be identified in the assessment of a strategy: maximizing the performance viewed from the initial state (optimality), and minimizing the incentive to deviate during implementation (deviation-proofness). In this paper, we propose a minimax regret approach taking these two aspects into account, and we provide a search procedure to determine an optimal strategy for this model. Numerical results are presented to show the merits of the proposed approach in terms of optimality, deviation-proofness and computability.
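
As a worked example of the RDU criterion itself, the sketch below computes the rank dependent utility of a simple lottery; the utility and probability-weighting functions are arbitrary illustrative choices, and with the identity weighting function the computation reduces to ordinary Expected Utility.

```python
def rdu(outcomes, probs, u, w):
    """RDU = sum_i u(x_i) * [w(P(X >= x_i)) - w(P(X > x_i))],
    with outcomes ranked from worst to best."""
    ranked = sorted(zip(outcomes, probs), key=lambda t: t[0])
    ps = [p for _, p in ranked]
    total = 0.0
    for i, (x, _) in enumerate(ranked):
        p_geq = sum(ps[i:])          # P(X >= x_i)
        p_gt = sum(ps[i + 1:])       # P(X >  x_i)
        total += u(x) * (w(p_geq) - w(p_gt))
    return total

u = lambda x: x ** 0.5               # concave utility (illustrative)
w = lambda p: p ** 2                 # convex, pessimistic probability weighting

# A lottery paying 0, 25 or 100 with equal probability.
print("RDU:", rdu([0, 25, 100], [1/3, 1/3, 1/3], u, w))
print("EU :", rdu([0, 25, 100], [1/3, 1/3, 1/3], u, lambda p: p))
```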


Social State Recognition and Knowledge-Level Planning for Human-Robot Interaction in a Bartender Domain

AAAI Conferences

We discuss preliminary work focusing on the problem of combining social interaction with task-based action in a dynamic, multiagent bartending domain, using an embodied robot. We show how the users' spoken input is interpreted, discuss how social states are inferred from the parsed speech together with low-level information from the vision system, and present a planning approach that models task, dialogue, and social actions in a simple bartending scenario. Using a general-purpose, off-the-shelf planner as an alternative to more mainstream methods of interaction management, this approach allows us to build interesting plans, which have been evaluated in a real-world study.
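
A very rough sketch, under assumed state attributes and thresholds, of how parsed speech and low-level vision cues might be fused into a social state that then drives action selection; the attributes, actions, and rules below are hypothetical stand-ins for the paper's state estimator and knowledge-level planner.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class VisionCue:
    facing_bar: bool
    distance_m: float

@dataclass
class SocialState:
    seeking_attention: bool
    has_ordered: bool
    order: Optional[str] = None

def estimate_state(parsed_speech: Optional[dict], cue: VisionCue) -> SocialState:
    """Fuse parsed speech with low-level vision cues into a social state."""
    seeking = cue.facing_bar and cue.distance_m < 1.0
    if parsed_speech and parsed_speech.get("intent") == "order":
        return SocialState(seeking_attention=True, has_ordered=True,
                           order=parsed_speech.get("drink"))
    return SocialState(seeking_attention=seeking, has_ordered=False)

def next_action(state: SocialState) -> str:
    """Pick a task/dialogue/social action (stand-in for the planner)."""
    if state.has_ordered:
        return f"serve({state.order})"
    if state.seeking_attention:
        return "greet_and_ask_order()"
    return "wait()"

print(next_action(estimate_state({"intent": "order", "drink": "cola"},
                                 VisionCue(facing_bar=True, distance_m=0.8))))
```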


Random Projection with Filtering for Nearly Duplicate Search

AAAI Conferences

High dimensional nearest neighbor search is a fundamental problem and has found applications in many domains. Although many hashing based approaches have been proposed for approximate nearest neighbor search in high dimensional space, one main drawback is that they often return many false positives that need to be filtered out by a post-processing step. We propose a novel method to address this limitation in this paper. The key idea is to introduce a filtering procedure within the search algorithm, based on compressed sensing theory, that effectively removes the false positive answers. We first obtain a sparse representation for each data point by a landmark based approach, and then solve the nearly duplicate search problem by exploiting the fact that the difference between the query and its nearest neighbors forms a sparse vector lying in a small ℓp ball, where p ≤ 1. Our empirical study on real-world datasets demonstrates the effectiveness of the proposed approach compared to the state-of-the-art hashing methods.
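
A simplified sketch of the overall pipeline, assuming a nearest-landmark sparse coding scheme, sign random projections for candidate generation, and an ℓ1-ball filter on the code difference; the parameters and the exact coding and filtering rules are assumptions rather than the paper's method.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d, n_landmarks, k_code = 2000, 64, 100, 5

data = rng.standard_normal((n, d))
landmarks = data[rng.choice(n, n_landmarks, replace=False)]

def sparse_code(x):
    """Code a point by its k_code nearest landmarks (inverse-distance weights)."""
    dist = np.linalg.norm(landmarks - x, axis=1)
    idx = np.argsort(dist)[:k_code]
    code = np.zeros(n_landmarks)
    w = 1.0 / (dist[idx] + 1e-8)
    code[idx] = w / w.sum()
    return code

codes = np.array([sparse_code(x) for x in data])

# Sign random projections to generate candidate near-duplicates.
proj = rng.standard_normal((d, 16))
sigs = (data @ proj > 0)

def search(q, radius=0.5):
    q_sig, q_code = (q @ proj > 0), sparse_code(q)
    candidates = np.where((sigs == q_sig).sum(axis=1) >= 14)[0]
    # Filter: keep candidates whose code difference has small l1 norm.
    return [i for i in candidates
            if np.abs(codes[i] - q_code).sum() <= radius]

query = data[0] + 0.01 * rng.standard_normal(d)   # a near-duplicate of point 0
print(search(query)[:10])
```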


Combining Hashing and Abstraction in Sparse High Dimensional Feature Spaces

AAAI Conferences

With the exponential increase in the number of documents available online, e.g., news articles, weblogs, scientific documents, the development of effective and efficient classification methods is needed. The performance of document classifiers critically depends, among other things, on the choice of the feature representation. The commonly used "bag of words" and n-gram representations can result in prohibitively high dimensional input spaces. Data mining algorithms applied to these input spaces may be intractable due to the large number of dimensions. Thus, dimensionality reduction algorithms that can process data into features fast at runtime, ideally in constant time per feature, are greatly needed in high throughput applications, where the number of features and data points can be in the order of millions. One promising line of research on dimensionality reduction is feature clustering. We propose to combine two types of feature clustering, namely hashing and abstraction based on hierarchical agglomerative clustering, in order to take advantage of the strengths of both techniques. Experimental results on two text data sets show that the combined approach uses a significantly smaller number of features and gives similar performance when compared with the "bag of words" and n-gram approaches.
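
A toy sketch of the combination, assuming scikit-learn's HashingVectorizer for the hashing step and agglomerative clustering of the hashed feature columns for the abstraction step; the corpus, hash size, and cluster count are placeholders, and the paper's actual abstraction procedure may differ.

```python
import numpy as np
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.cluster import AgglomerativeClustering

docs = [
    "stocks fell sharply on wall street today",
    "the central bank raised interest rates",
    "the team won the championship game last night",
    "the striker scored twice in the final match",
]

# Step 1: hash raw unigram/bigram features into a fixed-size space
# (constant time per feature at runtime).
hv = HashingVectorizer(n_features=256, ngram_range=(1, 2), alternate_sign=False)
X = hv.transform(docs).toarray()              # shape: (n_docs, 256)

# Step 2: abstraction - agglomeratively cluster the hashed feature columns
# (by their document-occurrence profiles) and merge each cluster into one feature.
active = np.where(X.sum(axis=0) > 0)[0]       # ignore empty hash buckets
n_abstract = 16
labels = AgglomerativeClustering(n_clusters=n_abstract).fit_predict(X[:, active].T)

X_abs = np.zeros((X.shape[0], n_abstract))
for col, lab in zip(active, labels):
    X_abs[:, lab] += X[:, col]

print("original dims:", X.shape[1], "-> abstracted dims:", X_abs.shape[1])
```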


MAXSAT Heuristics for Cost Optimal Planning

AAAI Conferences

The cost of an optimal delete relaxed plan, known as h+, is a powerful admissible heuristic but is in general intractable to compute. In this paper we examine the problem of computing h+ by encoding it as a MAXSAT problem. We develop a new encoding that utilizes constraint generation to support the computation of a sequence of increasing lower bounds on h+. We show a close connection between the computations performed by a recent approach for solving MAXSAT and a hitting set approach recently proposed for computing h+. Using this connection we observe that our MAXSAT computation can be initialized with a set of landmarks computed by LM-cut. By judicious use of MAXSAT solving along with a technique of lazy heuristic evaluation we obtain speedups for finding optimal plans over LM-cut on a number of domains. Our approach enables the exploitation of continued progress in MAXSAT solving, and also makes it possible to consider computing or approximating heuristics that are even more informed than h+ by, for example, adding some information about deletes back into the encoding.
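
To illustrate the hitting-set view of lower-bounding h+, the sketch below computes a minimum-cost hitting set over a few hand-made disjunctive action landmarks by brute force; the paper instead generates landmarks (e.g. via LM-cut) and solves the problem through a MAXSAT encoding with constraint generation, so treat this as a conceptual toy only.

```python
# Every plan must use at least one action from each disjunctive landmark,
# so a minimum-cost hitting set of the landmarks is a lower bound on h+.
from itertools import combinations

def min_cost_hitting_set(landmarks, cost):
    """Brute-force minimum-cost hitting set (fine for tiny instances)."""
    actions = sorted({a for lm in landmarks for a in lm})
    best, best_cost = None, float("inf")
    for r in range(len(actions) + 1):
        for subset in combinations(actions, r):
            s = set(subset)
            if all(s & lm for lm in landmarks):
                c = sum(cost[a] for a in s)
                if c < best_cost:
                    best, best_cost = s, c
    return best, best_cost

landmarks = [{"a", "b"}, {"b", "c"}, {"c", "d"}]   # toy disjunctive action landmarks
cost = {"a": 1, "b": 2, "c": 1, "d": 3}
hs, lb = min_cost_hitting_set(landmarks, cost)
print("hitting set:", hs, "lower bound on h+:", lb)
```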


Dynamic Matching via Weighted Myopia with Application to Kidney Exchange

AAAI Conferences

In many dynamic matching applications — especially high-stakes ones — the competitive ratios of prior-free online algorithms are unacceptably poor. The algorithm should take distributional information about possible futures into account in deciding what action to take now. This is typically done by drawing sample trajectories of possible futures at each time period, but may require a prohibitively large number of trajectories or prohibitive memory and/or computation to decide what action to take. Instead, we propose to learn potentials of elements (e.g., vertices) of the current problem. Then, at run time, we simply run an offline matching algorithm at each time period, but subtracting out in the objective the potentials of the elements used up in the matching. We apply the approach to kidney exchange. Kidney exchanges enable willing but incompatible patient-donor pairs (vertices) to swap donors. These swaps typically include cycles longer than two pairs and chains triggered by altruistic donors. Fielded exchanges currently match myopically, maximizing the number of patients who get kidneys in an offline fashion at each time period. Myopic matching is sub-optimal; the clearing problem is dynamic since patients, donors, and altruists appear and expire over time. We theoretically compare the power of using potentials on increasingly large elements: vertices, edges, cycles, and the entire graph (optimum). Then, experiments show that by learning vertex potentials, our algorithm matches more patients than the current practice of clearing myopically. It scales to exchanges orders of magnitude beyond those handled by the prior dynamic algorithm.
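
A minimal sketch of the weighted-myopia idea, assuming pairwise-only matches (real kidney exchanges also use longer cycles and altruist-initiated chains) and hand-set vertex potentials: each period, an offline maximum-weight matching is run with every edge discounted by the potentials of the vertices it would consume, so vertices that are valuable in the future tend to be left unmatched.

```python
import networkx as nx

def weighted_myopic_match(compatible_pairs, potential, match_value=1.0):
    """compatible_pairs: iterable of (u, v) pairs that could swap donors.
    potential[v]: learned estimate of v's future value if left unmatched."""
    G = nx.Graph()
    for u, v in compatible_pairs:
        w = match_value - potential.get(u, 0.0) - potential.get(v, 0.0)
        if w > 0:                      # only matches worth more than their future cost
            G.add_edge(u, v, weight=w)
    return nx.max_weight_matching(G, weight="weight")

pairs = [("p1", "p2"), ("p2", "p3"), ("p3", "p4")]
# Illustrative potentials: p3 is easy to match later (high potential),
# the others are harder to match (low potential).
potential = {"p1": 0.1, "p2": 0.0, "p3": 0.6, "p4": 0.1}
print(weighted_myopic_match(pairs, potential))
```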


Covering Number as a Complexity Measure for POMDP Planning and Learning

AAAI Conferences

Finding a meaningful way of characterizing the difficulty of partially observable Markov decision processes (POMDPs) is a core theoretical problem in POMDP research. State-space size is often used as a proxy for POMDP difficulty, but it is a weak metric at best. Existing work has shown that the covering number for the reachable belief space, which is the set of belief points reachable from the initial belief point, has interesting theoretical links with the complexity of POMDP planning. In this paper, we present empirical evidence that the covering number for the reachable belief space (or just "covering number", for brevity) is a far better complexity measure than the state-space size for both planning and learning POMDPs on several small-scale benchmark problems. We connect the covering number to the complexity of learning POMDPs by proposing a provably convergent learning algorithm for POMDPs without reset, given knowledge of the covering number.
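
A rough sketch of how the covering number might be estimated empirically: sample beliefs reachable from the initial belief under random actions, then greedily build a δ-cover; the toy POMDP (random transition and observation matrices), the L1 metric, and the choice of δ are assumptions purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions, n_obs = 4, 2, 3

def random_stochastic(shape):
    m = rng.random(shape)
    return m / m.sum(axis=-1, keepdims=True)

T = random_stochastic((n_actions, n_states, n_states))   # T[a, s, s']
O = random_stochastic((n_actions, n_states, n_obs))      # O[a, s', o]

def belief_update(b, a, o):
    b_new = (b @ T[a]) * O[a, :, o]
    return b_new / b_new.sum()

def sample_reachable_beliefs(b0, n_trajectories=200, horizon=15):
    beliefs = [b0]
    for _ in range(n_trajectories):
        b = b0
        for _ in range(horizon):
            a = rng.integers(n_actions)
            o = rng.choice(n_obs, p=b @ T[a] @ O[a])      # sample an observation
            b = belief_update(b, a, o)
            beliefs.append(b)
    return np.array(beliefs)

def greedy_cover_size(beliefs, delta=0.1):
    """Size of a greedy delta-cover under the L1 metric: a rough empirical
    proxy for the covering number at scale delta."""
    centers = []
    for b in beliefs:
        if all(np.abs(b - c).sum() > delta for c in centers):
            centers.append(b)
    return len(centers)

b0 = np.full(n_states, 1.0 / n_states)
B = sample_reachable_beliefs(b0)
print("estimated covering number (delta=0.1):", greedy_cover_size(B))
```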