Goto

Collaborating Authors

 Genre


Predicting Behavior in Unstructured Bargaining with a Probability Distribution

Journal of Artificial Intelligence Research

In experimental tests of human behavior in unstructured bargaining games, typically many joint utility outcomes are found to occur, not just one. This suggests we predict the outcome of such a game as a probability distribution. This is in contrast to what is conventionally done (e.g, in the Nash bargaining solution), which is predict a single outcome. We show how to translate Nash's bargaining axioms to provide a distribution over outcomes rather than a single outcome. We then prove that a subset of those axioms forces the distribution over utility outcomes to be a power-law distribution. Unlike Nash's original result, our result holds even if the feasible set is finite. When the feasible set is convex and comprehensive, the mode of the power law distribution is the Harsanyi bargaining solution, and if we require symmetry it is the Nash bargaining solution. However, in general these modes of the joint utility distribution are not the experimentalist's Bayes-optimal predictions for the joint utility. Nor are the bargains corresponding to the modes of those joint utility distributions the modes of the distribution over bargains in general, since more than one bargain may result in the same joint utility. After introducing distributional bargaining solution concepts, we show how an external regulator can use them to optimally design an unstructured bargaining scenario. Throughout we demonstrate our analysis in computational experiments involving flight rerouting negotiations in the National Airspace System. We emphasize that while our results are formulated for unstructured bargaining, they can also be used to make predictions for noncooperative games where the modeler knows the utility functions of the players over possible outcomes of the game, but does not know the move spaces the players use to determine those outcomes.


The PAV algorithm optimizes binary proper scoring rules

arXiv.org Machine Learning

There has been much recent interest in application of the pool-adjacent-violators (PAV) algorithm for the purpose of calibrating the probabilistic outputs of automatic pattern recognition and machine learning algorithms. Special cost functions, known as proper scoring rules form natural objective functions to judge the goodness of such calibration. We show that for binary pattern classifiers, the non-parametric optimization of calibration, subject to a monotonicity constraint, can be solved by PAV and that this solution is optimal for all regular binary proper scoring rules. This extends previous results which were limited to convex binary proper scoring rules. We further show that this result holds not only for calibration of probabilities, but also for calibration of log-likelihood-ratios, in which case optimality holds independently of the prior probabilities of the pattern classes.


Relevance As a Metric for Evaluating Machine Learning Algorithms

arXiv.org Machine Learning

In machine learning, the choice of a learning algorithm that is suitable for the application domain is critical. The performance metric used to compare different algorithms must also reflect the concerns of users in the application domain under consideration. In this work, we propose a novel probability-based performance metric called Relevance Score for evaluating supervised learning algorithms. We evaluate the proposed metric through empirical analysis on a dataset gathered from an intelligent lighting pilot installation. In comparison to the commonly used Classification Accuracy metric, the Relevance Score proves to be more appropriate for a certain class of applications.


Parsimonious module inference in large networks

arXiv.org Machine Learning

We investigate the detectability of modules in large networks when the number of modules is not known in advance. We employ the minimum description length (MDL) principle which seeks to minimize the total amount of information required to describe the network, and avoid overfitting. According to this criterion, we obtain general bounds on the detectability of any prescribed block structure, given the number of nodes and edges in the sampled network. We also obtain that the maximum number of detectable blocks scales as $\sqrt{N}$, where $N$ is the number of nodes in the network, for a fixed average degree $$. We also show that the simplicity of the MDL approach yields an efficient multilevel Monte Carlo inference algorithm with a complexity of $O(\tau N\log N)$, if the number of blocks is unknown, and $O(\tau N)$ if it is known, where $\tau$ is the mixing time of the Markov chain. We illustrate the application of the method on a large network of actors and films with over $10^6$ edges, and a dissortative, bipartite block structure.


ClusterCluster: Parallel Markov Chain Monte Carlo for Dirichlet Process Mixtures

arXiv.org Machine Learning

CLUSTERCLUSTER: PARALLEL MARKOV CHAIN MONTE CARLO FOR DIRICHLET PROCESS MIXTURES By Dan Lovell, Jonathan Malmaud, Ryan P. Adams and Vikash K. Mansinghka Massachusetts Institute of Technology and Harvard University The Dirichlet process (DP) is a fundamental mathematical tool for Bayesian nonparametric modeling, and is widely used in tasks such as density estimation, natural language processing, and time series modeling. Although MCMC inference methods for the DP often provide a gold standard in terms asymptotic accuracy, they can be computationally expensive and are not obviously parallelizable. We propose a reparameterization of the Dirichlet process that induces conditional independencies between the atoms that form the random measure. This conditional independence enables many of the Markov chain transition operators for DP inference to be simulated in parallel across multiple cores. Applied to mixture modeling, our approach enables the Dirichlet process to simultaneously learn clusters that describe the data and superclusters that define the granularity of parallelization. Unlike previous approaches, our technique does not require alteration of the model and leaves the true posterior distribution invariant. It also naturally lends itself to a distributed software implementation in terms of Map-Reduce, which we test in cluster configurations of over 50 machines and 100 cores.


A powerful and efficient set test for genetic markers that handles confounders

arXiv.org Machine Learning

Approaches for testing sets of variants, such as a set of rare or common variants within a gene or pathway, for association with complex traits are important. In particular, set tests allow for aggregation of weak signal within a set, can capture interplay among variants, and reduce the burden of multiple hypothesis testing. Until now, these approaches did not address confounding by family relatedness and population structure, a problem that is becoming more important as larger data sets are used to increase power. Results: We introduce a new approach for set tests that handles confounders. Our model is based on the linear mixed model and uses two random effects-one to capture the set association signal and one to capture confounders. We also introduce a computational speedup for two-random-effects models that makes this approach feasible even for extremely large cohorts. Using this model with both the likelihood ratio test and score test, we find that the former yields more power while controlling type I error. Application of our approach to richly structured GAW14 data demonstrates that our method successfully corrects for population structure and family relatedness, while application of our method to a 15,000 individual Crohn's disease case-control cohort demonstrates that it additionally recovers genes not recoverable by univariate analysis. Availability: A Python-based library implementing our approach is available at http://mscompbio.codeplex.com


Computing Datalog Rewritings beyond Horn Ontologies

arXiv.org Artificial Intelligence

Rewriting-based approaches for answering queries over an OWL 2 DL ontology have so far been developed mainly for Horn fragments of OWL 2 DL. In this paper, we study the possibilities of answering queries over non-Horn ontologies using datalog rewritings. We prove that this is impossible in general even for very simple ontology languages, and even if PTIME = NP. Furthermore, we present a resolution-based procedure for $\SHI$ ontologies that, in case it terminates, produces a datalog rewriting of the ontology. Our procedure necessarily terminates on DL-Lite_{bool}^H ontologies---an extension of OWL 2 QL with transitive roles and Boolean connectives.


Joint-ViVo: Selecting and Weighting Visual Words Jointly for Bag-of-Features based Tissue Classification in Medical Images

arXiv.org Machine Learning

Automatically classifying the tissues types of Region of Interest (ROI) in medical imaging has been an important application in Computer-Aided Diagnosis (CAD), such as classification of breast parenchymal tissue in the mammogram, classify lung disease patterns in High-Resolution Computed Tomography (HRCT) etc. Recently, bag-of-features method has shown its power in this field, treating each ROI as a set of local features. In this paper, we investigate using the bag-of-features strategy to classify the tissue types in medical imaging applications. Two important issues are considered here: the visual vocabulary learning and weighting. Although there are already plenty of algorithms to deal with them, all of them treat them independently, namely, the vocabulary learned first and then the histogram weighted. Inspired by Auto-Context who learns the features and classifier jointly, we try to develop a novel algorithm that learns the vocabulary and weights jointly. The new algorithm, called Joint-ViVo, works in an iterative way. In each iteration, we first learn the weights for each visual word by maximizing the margin of ROI triplets, and then select the most discriminate visual words based on the learned weights for the next iteration. We test our algorithm on three tissue classification tasks: identifying brain tissue type in magnetic resonance imaging (MRI), classifying lung tissue in HRCT images, and classifying breast tissue density in mammograms. The results show that Joint-ViVo can perform effectively for classifying tissues.


Probability Aggregates in Probability Answer Set Programming

arXiv.org Artificial Intelligence

Probability answer set programming is a declarative programming that has been shown effective for representing and reasoning about a variety of probability reasoning tasks. However, the lack of probability aggregates, e.g. {\em expected values}, in the language of disjunctive hybrid probability logic programs (DHPP) disallows the natural and concise representation of many interesting problems. In this paper, we extend DHPP to allow arbitrary probability aggregates. We introduce two types of probability aggregates; a type that computes the expected value of a classical aggregate, e.g., the expected value of the minimum, and a type that computes the probability of a classical aggregate, e.g, the probability of sum of values. In addition, we define a probability answer set semantics for DHPP with arbitrary probability aggregates including monotone, antimonotone, and nonmonotone probability aggregates. We show that the proposed probability answer set semantics of DHPP subsumes both the original probability answer set semantics of DHPP and the classical answer set semantics of classical disjunctive logic programs with classical aggregates, and consequently subsumes the classical answer set semantics of the original disjunctive logic programs. We show that the proposed probability answer sets of DHPP with probability aggregates are minimal probability models and hence incomparable, which is an important property for nonmonotonic probability reasoning.


Pattern-Based Constraint Satisfaction and Logic Puzzles

arXiv.org Artificial Intelligence

Pattern-Based Constraint Satisfaction and Logic Puzzles develops a pure logic, pattern-based perspective of solving the finite Constraint Satisfaction Problem (CSP), with emphasis on finding the "simplest" solution. Different ways of reasoning with the constraints are formalised by various families of "resolution rules", each of them carrying its own notion of simplicity. A large part of the book illustrates the power of the approach by applying it to various popular logic puzzles. It provides a unified view of how to model and solve them, even though they involve very different types of constraints: obvious symmetric ones in Sudoku, non-symmetric but transitive ones (inequalities) in Futoshiki, topological and geometric ones in Map colouring, Numbrix and Hidato, and even much more complex non-binary arithmetic ones in Kakuro (or Cross Sums). It also shows that the most familiar techniques for these puzzles can indeed be understood as mere application-specific presentations of the general rules. Sudoku is used as the main example throughout the book, making it also an advanced level sequel to "The Hidden Logic of Sudoku" (another book by the same author), with: many examples of relationships among different rules and of exceptional situations; comparisons of the resolution potential of various families of rules; detailed statistics of puzzles hardness; analysis of extreme instances.