Information Technology
An MDP-Based Approach to Online Mechanism Design
Parkes, David C., Singh, Satinder P.
Online mechanism design (MD) considers the problem of providing incentives to implement desired system-wide outcomes in systems with self-interested agents that arrive and depart dynamically. Agents can choose to misrepresent their arrival and departure times, in addition to information about their value for different outcomes. We consider the problem of maximizing the total long-term value of the system despite the self-interest of agents. The online MD problem induces a Markov Decision Process (MDP), which when solved can be used to implement optimal policies in a truth-revealing Bayesian-Nash equilibrium.
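The MDP induced by the online mechanism design problem depends on the agents' arrival, departure, and valuation model, but the planning step itself is standard. The sketch below is a minimal, generic value-iteration routine for a finite MDP with hypothetical transition and reward arrays; it illustrates the planning component only, not the paper's construction.

```python
# Minimal value-iteration sketch for a generic finite MDP. The transition
# tensor P and reward matrix R are hypothetical placeholders; the paper's
# induced MDP is defined by the agents' arrival/valuation model.
import numpy as np

def value_iteration(P, R, gamma=0.95, tol=1e-6):
    """P[s, a, s'] = transition probability, R[s, a] = expected reward."""
    n_states, n_actions, _ = P.shape
    V = np.zeros(n_states)
    while True:
        # Q[s, a] = R[s, a] + gamma * sum_s' P[s, a, s'] V[s']
        Q = R + gamma * np.einsum('sat,t->sa', P, V)
        V_new = Q.max(axis=1)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new, Q.argmax(axis=1)   # optimal values, greedy policy
        V = V_new
```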
Approximate Planning in POMDPs with Macro-Actions
Theocharous, Georgios, Kaelbling, Leslie P.
Recent research has demonstrated that useful POMDP solutions do not require consideration of the entire belief space. We extend this idea with the notion of temporal abstraction. We present and explore a new reinforcement learning algorithm over grid-points in belief space, which uses macro-actions and Monte Carlo updates of the Q-values. We apply the algorithm to a large-scale robot navigation task and demonstrate that with temporal abstraction we can consider an even smaller part of the belief space, we can learn POMDP policies faster, and we can do information gathering more efficiently.
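As a rough illustration of Q-value updates over belief-space grid points with macro-actions, the sketch below performs a single SMDP-style backup after executing one macro-action. `env.execute_macro` and `nearest_grid_point` are hypothetical helpers, and the paper's actual Monte Carlo procedure differs in detail.

```python
# Hypothetical sketch of Q-value updates over belief-space grid points with
# macro-actions; an SMDP-style backup, not the paper's exact algorithm.
from collections import defaultdict

# Q[grid_point][macro_action] -> estimated value; counts drive running averages.
Q = defaultdict(lambda: defaultdict(float))
counts = defaultdict(lambda: defaultdict(int))

def macro_q_update(belief, macro, env, nearest_grid_point, gamma=0.95):
    """One backup for the (grid point, macro-action) pair visited from `belief`."""
    g = nearest_grid_point(belief)
    # A macro-action runs a fixed local policy for several primitive steps;
    # `env.execute_macro` is a placeholder simulator call returning the new
    # belief, the discounted reward along the way, and the number of steps.
    next_belief, disc_reward, steps = env.execute_macro(belief, macro)
    g_next = nearest_grid_point(next_belief)
    best_next = max(Q[g_next].values()) if Q[g_next] else 0.0
    target = disc_reward + (gamma ** steps) * best_next
    counts[g][macro] += 1
    alpha = 1.0 / counts[g][macro]            # running average of sampled backups
    Q[g][macro] += alpha * (target - Q[g][macro])
    return next_belief
```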
Self-calibrating Probability Forecasting
Vovk, Vladimir, Shafer, Glenn, Nouretdinov, Ilia
In the problem of probability forecasting the learner's goal is to output, given a training set and a new object, a suitable probability measure on the possible values of the new object's label. An online algorithm for probability forecasting is said to be well-calibrated if the probabilities it outputs agree with the observed frequencies. We give a natural nonasymptotic formalization of the notion of well-calibratedness, which we then study under the assumption of randomness (the object/label pairs are independent and identically distributed). It turns out that, although no probability forecasting algorithm is automatically well-calibrated in our sense, there exists a wide class of algorithms for "multiprobability forecasting" (such algorithms are allowed to output a set, ideally very narrow, of probability measures) which satisfy this property; we call the algorithms in this class "Venn probability machines". Our experimental results demonstrate that a 1-Nearest Neighbor Venn probability machine performs reasonably well on a standard benchmark data set, and one of our theoretical results asserts that a simple Venn probability machine asymptotically approaches the true conditional probabilities regardless, and without knowledge, of the true probability measure generating the examples.
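To make the construction concrete, here is a toy Venn-style forecaster with a 1-nearest-neighbor taxonomy: each hypothetical label for the new object is tried in turn, every example in the augmented set is assigned to a category (the label of its nearest neighbor), and the empirical label distribution in the new object's category becomes one of the output probability measures. This is an illustrative sketch, not the authors' implementation.

```python
# Toy sketch of a Venn-style multiprobability forecast with a 1-nearest-
# neighbor taxonomy (an illustration of the idea, not the authors' code).
import numpy as np

def venn_1nn_forecast(X_train, y_train, x_new, labels):
    """Return one empirical label distribution per hypothetical label of x_new."""
    forecasts = {}
    for y_hyp in labels:
        # Augment the training set with (x_new, y_hyp).
        X = np.vstack([X_train, x_new[None, :]])
        y = np.append(y_train, y_hyp)
        # Taxonomy: each example's category is the label of its nearest
        # neighbor among the *other* examples in the augmented set.
        d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
        np.fill_diagonal(d, np.inf)
        category = y[d.argmin(axis=1)]
        # Empirical distribution of labels inside the new example's category.
        in_cat = category == category[-1]
        forecasts[y_hyp] = {lab: float(np.mean(y[in_cat] == lab)) for lab in labels}
    return forecasts  # a set of probability measures (the "multiprobability")
```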
Near-Minimax Optimal Classification with Dyadic Classification Trees
The classifiers are based on dyadic classification trees (DCTs), which involve adaptively pruned partitions of the feature space. A key aspect of DCTs is their spatial adaptivity, which enables local (rather than global) fitting of the decision boundary. Our risk analysis involves a spatial decomposition of the usual concentration inequalities, leading to a spatially adaptive, data-dependent pruning criterion. For any distribution on (X, Y) whose Bayes decision boundary behaves locally like a Lipschitz smooth function, we show that the DCT error converges to the Bayes error at a rate within a logarithmic factor of the minimax optimal rate.
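The sketch below shows the basic dyadic recursion on the unit cube: each cell is halved at its midpoint along cycling coordinates, and a split is kept only when it lowers a penalized empirical error. The constant penalty used here is a generic stand-in for the paper's spatially adaptive, data-dependent pruning criterion, and the function returns only the penalized cost of the pruned subtree.

```python
# Minimal dyadic-partition sketch with a generic complexity penalty
# (a stand-in for the paper's spatially adaptive pruning criterion).
import numpy as np

def dyadic_prune_cost(X, y, lo, hi, depth=0, max_depth=6, penalty=0.01):
    """Penalized cost of the best pruned dyadic subtree over the cell [lo, hi]."""
    majority = int(np.mean(y) >= 0.5) if len(y) else 0
    leaf_cost = np.sum(y != majority) + penalty
    if depth == max_depth or len(y) <= 1:
        return leaf_cost
    dim = depth % len(lo)                     # cycle through coordinates
    mid = (lo[dim] + hi[dim]) / 2.0           # dyadic split: halve the cell
    mask = X[:, dim] <= mid
    hi_l, lo_r = hi.copy(), lo.copy()
    hi_l[dim], lo_r[dim] = mid, mid
    split_cost = (dyadic_prune_cost(X[mask], y[mask], lo, hi_l, depth + 1, max_depth, penalty)
                  + dyadic_prune_cost(X[~mask], y[~mask], lo_r, hi, depth + 1, max_depth, penalty))
    # Keep the split only if it beats predicting the majority label in this cell.
    return min(leaf_cost, split_cost)
```

For data scaled to [0, 1]^d, a call like `dyadic_prune_cost(X, y, np.zeros(d), np.ones(d))` evaluates the pruned tree over the whole feature space.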
Eye Movements for Reward Maximization
Sprague, Nathan, Ballard, Dana
Recent eye tracking studies in natural tasks suggest that there is a tight link between eye movements and goal directed motor actions. However, most existing models of human eye movements provide a bottom-up account that relates visual attention to attributes of the visual scene. The purpose of this paper is to introduce a new model of human eye movements that directly ties eye movements to the ongoing demands of behavior. The basic idea is that eye movements serve to reduce uncertainty about environmental variables that are task relevant. A value is assigned to an eye movement by estimating the expected cost of the uncertainty that will result if the movement is not made.
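A heavily simplified sketch of that scoring idea: maintain an uncertainty estimate for each task-relevant variable, assume task cost grows with estimation variance, and value a candidate eye movement by the expected cost reduction from observing that variable. The variable names and the linear-in-variance cost are illustrative assumptions, not the paper's model.

```python
# Hedged sketch of "look at what is most costly not to know": compare the
# expected cost of acting under current uncertainty with the lower cost
# after an eye movement that observes the variable. Purely illustrative.
import numpy as np

def choose_fixation(variances, costs_per_unit_error, post_obs_variance=0.01):
    """Pick the variable whose observation yields the largest expected cost reduction."""
    variances = np.asarray(variances, dtype=float)
    weights = np.asarray(costs_per_unit_error, dtype=float)
    cost_without_look = weights * variances        # act now, under current uncertainty
    cost_with_look = weights * post_obs_variance   # act after the variance is reduced
    value_of_looking = cost_without_look - cost_with_look
    return int(np.argmax(value_of_looking)), value_of_looking
```

For example, `choose_fixation([0.4, 0.05], [2.0, 1.0])` selects the first variable, which is both uncertain and costly to get wrong.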
A Fast Multi-Resolution Method for Detection of Significant Spatial Disease Clusters
Neill, Daniel B., Moore, Andrew W.
Given an N × N grid of squares, where each square has a count and an underlying population, our goal is to find the square region with the highest density, and to calculate its significance by randomization. Any density measure D, dependent on the total count and total population of a region, can be used. For example, if each count represents the number of disease cases occurring in that square, we can use Kulldorff's spatial scan statistic as D.
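For concreteness, the sketch below implements the naive exhaustive baseline that the paper's multi-resolution method accelerates: score every square region using 2-D prefix sums and estimate significance by randomization. It uses the plain count/population ratio as the density measure D rather than Kulldorff's likelihood-based statistic, and the multinomial null model is one common choice, not necessarily the paper's.

```python
# Naive exhaustive baseline for the square-region scan; the paper's
# contribution is a fast multi-resolution search over the same objective.
import numpy as np

def best_square_region(counts, pops):
    """Return (density, (row, col, size)) of the densest square region."""
    N = counts.shape[0]
    # 2-D prefix sums so any square's count and population are O(1) lookups.
    C = np.zeros((N + 1, N + 1)); C[1:, 1:] = counts.cumsum(0).cumsum(1)
    P = np.zeros((N + 1, N + 1)); P[1:, 1:] = pops.cumsum(0).cumsum(1)
    box = lambda S, r, c, k: S[r + k, c + k] - S[r, c + k] - S[r + k, c] + S[r, c]
    best_d, best_region = -np.inf, None
    for k in range(1, N + 1):
        for r in range(N - k + 1):
            for c in range(N - k + 1):
                pop = box(P, r, c, k)
                if pop > 0:
                    d = box(C, r, c, k) / pop
                    if d > best_d:
                        best_d, best_region = d, (r, c, k)
    return best_d, best_region

def randomization_p_value(counts, pops, n_sims=999, rng=None):
    """Monte Carlo p-value: how often a null grid matches or beats the observed density."""
    rng = rng or np.random.default_rng(0)
    observed, _ = best_square_region(counts, pops)
    beat = 0
    for _ in range(n_sims):
        # Null model (one common choice): cases fall on squares with
        # probability proportional to population.
        null = rng.multinomial(int(counts.sum()), (pops / pops.sum()).ravel())
        if best_square_region(null.reshape(pops.shape), pops)[0] >= observed:
            beat += 1
    return (beat + 1) / (n_sims + 1)
```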
Error Bounds for Transductive Learning via Compression and Clustering
Derbeko, Philip, El-Yaniv, Ran, Meir, Ron
This paper is concerned with transductive learning. Although transduction appears to be an easier task than induction, there have not been many provably useful algorithms and bounds for transduction. We present explicit error bounds for transduction and derive a general technique for devising bounds within this setting. The technique is applied to derive error bounds for compression schemes such as (transductive) SVMs and for transduction algorithms based on clustering.
Pairwise Clustering and Graphical Models
Shental, Noam, Zomet, Assaf, Hertz, Tomer, Weiss, Yair
Significant progress in clustering has been achieved by algorithms that are based on pairwise affinities between the datapoints. In particular, spectral clustering methods have the advantage of being able to divide arbitrarily shaped clusters and are based on efficient eigenvector calculations. However, spectral methods lack a straightforward probabilistic interpretation which makes it difficult to automatically set parameters using training data. In this paper we use the previously proposed typical cut framework for pairwise clustering. We show an equivalence between calculating the typical cut and inference in an undirected graphical model. We show that for clustering problems with hundreds of datapoints exact inference may still be possible. For more complicated datasets, we show that loopy belief propagation (BP) and generalized belief propagation (GBP) can give excellent results on challenging clustering problems. We also use graphical models to derive a learning algorithm for affinity matrices based on labeled data.
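As background for "clustering from pairwise affinities", here is a minimal affinity-matrix construction together with a standard spectral embedding (a Ng-Jordan-Weiss style baseline); the paper's typical-cut formulation and its belief-propagation inference are not reproduced here.

```python
# Generic pairwise-affinity construction plus a standard spectral-clustering
# baseline, included only to make "clustering from pairwise affinities"
# concrete; not the paper's typical-cut / BP machinery.
import numpy as np

def gaussian_affinity(X, sigma=1.0):
    """W[i, j] = exp(-||x_i - x_j||^2 / (2 sigma^2)), with zero self-affinity."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    return W

def spectral_embed(W, k):
    """Row-normalized top-k eigenvectors of the symmetrically normalized affinity matrix."""
    d = W.sum(1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(np.maximum(d, 1e-12)))
    L_sym = D_inv_sqrt @ W @ D_inv_sqrt
    vals, vecs = np.linalg.eigh(L_sym)
    U = vecs[:, -k:]                           # eigenvectors with largest eigenvalues
    return U / np.maximum(np.linalg.norm(U, axis=1, keepdims=True), 1e-12)
```

In the spectral baseline the rows of the embedding would then be clustered with k-means; the typical-cut view described in the abstract instead casts the problem as inference in an undirected graphical model built from W.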