Country
Regret Minimization in Games with Incomplete Information
Zinkevich, Martin, Johanson, Michael, Bowling, Michael, Piccione, Carmelo
Extensive games are a powerful model of multiagent decision-making scenarios with incomplete information. Finding a Nash equilibrium for very large instances of these games has received a great deal of recent attention. In this paper, we describe a new technique for solving large games based on regret minimization. In particular, we introduce the notion of counterfactual regret, which exploits the degree of incomplete information in an extensive game. We show how minimizing counterfactual regret minimizes overall regret, and therefore in self-play can be used to compute a Nash equilibrium. We demonstrate this technique in the domain of poker, showing we can solve abstractions of limit Texas Hold'em with as many as 10
Bayesian Co-Training
Yu, Shipeng, Krishnapuram, Balaji, Steck, Harald, Rao, R. B., Rosales, Rómer
We propose a Bayesian undirected graphical model for co-training, or more generally for semi-supervised multi-view learning. This makes explicit the previously unstated assumptions of a large class of co-training type algorithms, and also clarifies the circumstances under which these assumptions fail. Building upon new insights from this model, we propose an improved method for co-training, which is a novel co-training kernel for Gaussian process classifiers. The resulting approach is convex and avoids local-maxima problems, unlike some previous multi-view learning methods. Furthermore, it can automatically estimate how much each view should be trusted, and thus accommodate noisy or unreliable views. Experiments on toy data and real world data sets illustrate the benefits of this approach.
Classification via Minimum Incremental Coding Length (MICL)
Wright, John, Tao, Yangyu, Lin, Zhouchen, Ma, Yi, Shum, Heung-yeung
We present a simple new criterion for classification, based on principles from lossy data compression. The criterion assigns a test sample to the class that uses the minimum number of additional bits to code the test sample, subject to an allowable distortion. We prove asymptotic optimality of this criterion for Gaussian data and analyze its relationships to classical classifiers. Theoretical results provide new insights into relationships among popular classifiers such as MAP and RDA, as well as unsupervised clustering methods based on lossy compression [13]. Minimizing the lossy coding length induces a regularization effect which stabilizes the (implicit) density estimate in a small-sample setting. Compression also provides a uniform means of handling classes of varying dimension. This simple classification criterion and its kernel and local versions perform competitively against existing classifiers on both synthetic examples and real imagery data such as handwritten digits and human faces, without requiring domain-specific information.
Exponential Family Predictive Representations of State
Wingate, David, Baveja, Satinder S.
In order to represent state in controlled, partially observable, stochastic dynamical systems, some sort of sufficient statistic for history is necessary. Predictive representations of state (PSRs) capture state as statistics of the future. We introduce a new model of such systems called the "Exponential family PSR," which defines as state the time-varying parameters of an exponential family distribution which models n sequential observations in the future. This choice of state representation explicitly connects PSRs to state-of-the-art probabilistic modeling, which allows us to take advantage of current efforts in high-dimensional density estimation, and in particular, graphical models and maximum entropy models. We present a parameter learning algorithm based on maximum likelihood, and we show how a variety of current approximate inference methods apply. We evaluate the quality of our model with reinforcement learning by directly evaluating the control performance of the model.
Infinite State Bayes-Nets for Structured Domains
Welling, Max, Porteous, Ian, Bart, Evgeniy
A general modeling framework is proposed that unifies nonparametric-Bayesian models, topic-models and Bayesian networks. This class of infinite state Bayes nets (ISBN) can be viewed as directed networks of'hierarchical Dirichlet processes' (HDPs) where the domain of the variables can be structured (e.g.
COFI RANK - Maximum Margin Matrix Factorization for Collaborative Ranking
Weimer, Markus, Karatzoglou, Alexandros, Le, Quoc V., Smola, Alex J.
In this paper, we consider collaborative filtering as a ranking problem. We present a method which uses Maximum Margin Matrix Factorization and optimizes ranking instead of rating. We employ structured output prediction to optimize directly for ranking scores. Experimental results show that our method gives very good ranking scores and scales well on collaborative filtering tasks.
Boosting Algorithms for Maximizing the Soft Margin
Rätsch, Gunnar, Warmuth, Manfred K. K., Glocer, Karen A.
We present a novel boosting algorithm, called SoftBoost, designed for sets of binary labeled examples that are not necessarily separable by convex combinations of base hypotheses. Our algorithm achieves robustness by capping the distributions on the examples. Our update of the distribution is motivated by minimizing a relative entropy subject to the capping constraints and constraints on the edges of the obtained base hypotheses. The capping constraints imply a soft margin in the dual optimization problem. Our algorithm produces a convex combination of hypotheses whose soft margin is within δ of its maximum.
Stable Dual Dynamic Programming
Wang, Tao, Bowling, Michael, Schuurmans, Dale, Lizotte, Daniel J.
Recently, we have introduced a novel approach to dynamic programming and reinforcement learning that is based on maintaining explicit representations of stationary distributions instead of value functions. In this paper, we investigate the convergence properties of these dual algorithms both theoretically and empirically, and show how they can be scaled up by incorporating function approximation.
Scene Segmentation with CRFs Learned from Partially Labeled Images
Triggs, Bill, Verbeek, Jakob J.
Conditional Random Fields (CRFs) are an effective tool for a variety of different data segmentation and labeling tasks including visual scene interpretation, which seeks to partition images into their constituent semantic-level regions and assign appropriate class labels to each region. For accurate labeling it is important to capture the global context of the image as well as local information. We introduce a CRF based scene labeling model that incorporates both local features and features aggregated over the whole image or large sections of it. Secondly, traditional CRF learning requires fully labeled datasets which can be costly and troublesome to produce. We introduce a method for learning CRFs from datasets with many unlabeled nodes by marginalizing out the unknown labels so that the log-likelihood of the known ones can be maximized by gradient ascent. Loopy Belief Propagation is used to approximate the marginals needed for the gradient and log-likelihood calculations and the Bethe free-energy approximation to the log-likelihood is monitored to control the step size. Our experimental results show that effective models can be learned from fragmentary labelings and that incorporating top-down aggregate features significantly improves the segmentations. The resulting segmentations are compared to the state-of-the-art on three different image datasets.