Adaptive Submodularity: Theory and Applications in Active Learning and Stochastic Optimization

Journal of Artificial Intelligence Research

Many problems in artificial intelligence require adaptively making a sequence of decisions with uncertain outcomes under partial observability. Solving such stochastic optimization problems is a fundamental but notoriously difficult challenge. In this paper, we introduce the concept of adaptive submodularity, generalizing submodular set functions to adaptive policies. We prove that if a problem satisfies this property, a simple adaptive greedy algorithm is guaranteed to be competitive with the optimal policy. In addition to providing performance guarantees for both stochastic maximization and coverage, adaptive submodularity can be exploited to drastically speed up the greedy algorithm by using lazy evaluations. We illustrate the usefulness of the concept by giving several examples of adaptive submodular objectives arising in diverse AI applications including management of sensing resources, viral marketing and active learning. Proving adaptive submodularity for these problems allows us to recover existing results in these applications as special cases, improve approximation guarantees and handle natural generalizations.
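The lazy-evaluation idea mentioned in the abstract can be illustrated with a minimal sketch. This sketch covers only the classical non-adaptive case (greedy maximization of a monotone submodular set function under a cardinality constraint), not the adaptive policy setting the paper addresses. The names `lazy_greedy`, `coverage`, and `SETS` are hypothetical and chosen for illustration.

```python
import heapq

# Hypothetical toy instance: each item covers a set of elements.
SETS = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6, 7}, "d": {1}}


def coverage(selection):
    """Monotone submodular objective: number of elements covered."""
    return len(set().union(*(SETS[name] for name in selection)))


def lazy_greedy(items, f, k):
    """Select up to k items greedily, recomputing marginal gains lazily.

    Submodularity means an item's marginal gain can only shrink as the
    selection grows, so a stale gain is an upper bound on the true gain.
    An item whose freshly computed gain still tops the heap is therefore
    the greedy choice, and the remaining items need no re-evaluation.
    """
    selected = []
    value = f(selected)
    # Heap entries: (negated gain, item index, round in which gain was computed).
    heap = [(-(f([item]) - value), index, 1) for index, item in enumerate(items)]
    heapq.heapify(heap)
    evaluations = len(heap)
    for round_no in range(1, k + 1):
        picked = False
        while heap:
            neg_gain, index, stamp = heapq.heappop(heap)
            if stamp == round_no:
                # Gain is up to date and maximal; accept it if it helps.
                if -neg_gain > 0:
                    selected.append(items[index])
                    value += -neg_gain
                    picked = True
                break
            # Stale gain: recompute against the current selection and reinsert.
            gain = f(selected + [items[index]]) - value
            evaluations += 1
            heapq.heappush(heap, (-gain, index, round_no))
        if not picked:
            break
    return selected, value, evaluations


if __name__ == "__main__":
    print(lazy_greedy(list(SETS), coverage, k=2))
```

On this toy instance the second round re-evaluates only one candidate instead of all three remaining ones, which is where the speed-up comes from.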


On l_1 Mean and Variance Filtering

arXiv.org Machine Learning

This paper addresses the problem of segmenting a time series with respect to changes in the mean value or in the variance. The first case is when the time series is modeled as a sequence of independent, normally distributed random variables with unknown, possibly changing, mean value but fixed variance. The main assumption is that the mean value is piecewise constant in time, and the task is to estimate the change times and the mean values within the segments. The second case is when the mean value is constant, but the variance can change. The assumption is that the variance is piecewise constant in time, and we want to estimate the change times and the variance values within the segments. To find solutions to these problems, we study an l_1 regularized maximum likelihood method, related to the fused lasso method and l_1 trend filtering, where the parameters to be estimated are free to vary at each sample. To penalize variations in the estimated parameters, the l_1-norm of the time difference of the parameters is used as a regularization term. This idea is closely related to total variation denoising. The main contribution is that a convex formulation of this variance estimation problem, where the parametrization is based on the inverse of the variance, can be formulated as a certain l_1 mean estimation problem. This implies that results and methods for mean estimation can be applied to the challenging problem of variance segmentation/estimation.
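As a reading aid, the two estimation problems described above can be written out in a plausible form. The notation is an assumption, not taken from the abstract: y_t are the N observations, m_t the per-sample mean, \eta_t = 1/\sigma_t^2 the per-sample inverse variance, and \lambda > 0 the regularization weight. The variance case additionally assumes the constant mean has been subtracted, so the data are zero-mean.

```latex
% l_1 mean filtering (fixed variance): Gaussian negative log-likelihood
% plus an l_1 penalty on the time difference of the mean.
\min_{m_1,\dots,m_N} \;
  \frac{1}{2}\sum_{t=1}^{N} (y_t - m_t)^2
  + \lambda \sum_{t=2}^{N} \lvert m_t - m_{t-1} \rvert

% l_1 variance filtering, parametrized by the inverse variance \eta_t:
% the objective is convex in \eta because -\log \eta_t is convex.
\min_{\eta_1,\dots,\eta_N > 0} \;
  \sum_{t=1}^{N} \left( \frac{\eta_t\, y_t^2}{2} - \frac{1}{2}\log \eta_t \right)
  + \lambda \sum_{t=2}^{N} \lvert \eta_t - \eta_{t-1} \rvert
```

In both problems the penalty drives most consecutive differences to exactly zero, which is what produces piecewise constant estimates and hence the segmentation.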


Principles of Solomonoff Induction and AIXI

arXiv.org Artificial Intelligence

We identify principles characterizing Solomonoff Induction by demands on an agent's external behaviour. Key concepts are rationality, computability, indifference and time consistency. Furthermore, we discuss extensions to the full AI case to derive AIXI.


Receiver Architectures for MIMO-OFDM Based on a Combined VMP-SP Algorithm

arXiv.org Machine Learning

Iterative information processing, either based on heuristics or analytical frameworks, has been shown to be a very powerful tool for the design of efficient, yet feasible, wireless receiver architectures. Within this context, algorithms performing message-passing on a probabilistic graph, such as the sum-product (SP) and variational message passing (VMP) algorithms, have become increasingly popular. In this contribution, we apply a combined VMP-SP message-passing technique to the design of receivers for MIMO-OFDM systems. The message-passing equations of the combined scheme can be obtained from the equations of the stationary points of a constrained region-based free energy approximation. When applied to a MIMO-OFDM probabilistic model, we obtain a generic receiver architecture performing iterative channel weight and noise precision estimation, equalization and data decoding. We show that this generic scheme can be particularized to a variety of different receiver structures, ranging from high-performance iterative structures to low-complexity receivers. This allows for a flexible design of the signal processing, specially tailored to the requirements of each specific application. The numerical assessment of our solutions, based on Monte Carlo simulations, corroborates the high performance of the proposed algorithms and their superiority to heuristic approaches.


Hodge Theory on Metric Spaces

arXiv.org Machine Learning

Hodge theory is a beautiful synthesis of geometry, topology, and analysis, which has been developed in the setting of Riemannian manifolds. On the other hand, spaces of images, which are important in the mathematical foundations of vision and pattern recognition, do not fit this framework. This motivates us to develop a version of Hodge theory on metric spaces with a probability measure. We believe that this constitutes a step towards understanding the geometry of vision. The appendix by Anthony Baker provides a separable, compact metric space with infinite dimensional \alpha-scale homology.


Revisiting Numerical Pattern Mining with Formal Concept Analysis

arXiv.org Artificial Intelligence

In this paper, we investigate the problem of mining numerical data in the framework of Formal Concept Analysis. The usual way is to use a scaling procedure (transforming numerical attributes into binary ones), leading to a loss of either information or efficiency, in particular w.r.t. the volume of extracted patterns. By contrast, we propose to work directly on numerical data in a more precise and efficient way, and we prove it. For that, the notions of closed patterns, generators and equivalence classes are revisited in the numerical context. Moreover, two original algorithms are proposed and used in an evaluation involving real-world data, showing the predominance of the present approach.


Falsification and future performance

arXiv.org Machine Learning

We show these capacity measures count the number of hypotheses about a dataset that a learning algorithm falsifies when it finds the classifier in its repertoire minimizing empirical risk. It then follows that the future performance of predictors on unseen data is controlled in part by how many hypotheses the learner falsifies. As a corollary we show that empirical VC-entropy quantifies the message length of the true hypothesis in the optimal code of a particular probability distribution, the so-called actual repertoire.


Approximate Judgement Aggregation

arXiv.org Artificial Intelligence

In this paper we analyze judgement aggregation problems in which a group of agents independently votes on a set of complex propositions that have some interdependency constraint between them (e.g., transitivity when describing preferences). We consider the issue of judgement aggregation from the perspective of approximation. That is, we generalize the previous results by studying approximate judgement aggregation. We relax the two main constraints assumed in the current literature, Consistency and Independence, and consider mechanisms that only approximately satisfy these constraints, that is, satisfy them up to a small portion of the inputs. The main question we raise is whether the relaxation of these notions significantly alters the class of satisfying aggregation mechanisms. The recent works for preference aggregation of Kalai, Mossel, and Keller fit into this framework. The main result of this paper is that, as in the case of preference aggregation, in the case of a subclass of a natural class of aggregation problems termed `truth-functional agendas', the set of satisfying aggregation mechanisms does not extend non-trivially when relaxing the constraints. Our proof techniques involve the Boolean Fourier transform and analysis of voter influences for voting protocols. The question we raise for Approximate Aggregation can be stated in terms of Property Testing. For instance, as a corollary of our result we get a generalization of the classic result for property testing of linearity of Boolean functions. An updated version (RePEc:huj:dispap:dp574R) is available at http://www.ratio.huji.ac.il/dp_files/dp574R.pdf


Making Decisions Using Sets of Probabilities: Updating, Time Consistency, and Calibration

Journal of Artificial Intelligence Research

We consider how an agent should update her beliefs when her beliefs are represented by a set P of probability distributions, given that the agent makes decisions using the minimax criterion, perhaps the best-studied and most commonly-used criterion in the literature. We adopt a game-theoretic framework, where the agent plays against a bookie, who chooses some distribution from P. We consider two reasonable games that differ in what the bookie knows when he makes his choice. Anomalies that have been observed before, like time inconsistency, can be understood as arising because different games are being played, against bookies with different information. We characterize the important special cases in which the optimal decision rules according to the minimax criterion amount to either conditioning or simply ignoring the information. Finally, we consider the relationship between updating and calibration when uncertainty is described by sets of probabilities. Our results emphasize the key role of the rectangularity condition of Epstein and Schneider.
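The minimax criterion referred to above can be sketched for a toy case. This is a minimal illustration assuming a finite set P, finitely many pure actions, and a known loss table; it does not model the bookie games, updating, or rectangularity discussed in the paper. All names and numbers (`minimax_action`, `LOSSES`, `P`) are hypothetical.

```python
def expected_loss(losses_by_state, distribution):
    """Expected loss of one action under one probability distribution."""
    return sum(p * loss for p, loss in zip(distribution, losses_by_state))


def minimax_action(losses, distributions):
    """Return the action whose worst-case expected loss over the set is smallest."""
    worst_case = {
        action: max(expected_loss(row, dist) for dist in distributions)
        for action, row in losses.items()
    }
    best = min(worst_case, key=worst_case.get)
    return best, worst_case[best]


# Hypothetical example with two states of the world.
LOSSES = {
    "bet_on_state_0": [0.0, 1.0],
    "bet_on_state_1": [1.0, 0.0],
    "abstain": [0.4, 0.4],
}
# The agent's uncertainty: a set P containing two candidate distributions.
P = [(0.3, 0.7), (0.7, 0.3)]

if __name__ == "__main__":
    print(minimax_action(LOSSES, P))
```

Here either bet has a worst-case expected loss of 0.7, so the cautious action wins under minimax even though each single distribution in P would favor one of the bets.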


A Bayesian Model for Plan Recognition in RTS Games applied to StarCraft

arXiv.org Artificial Intelligence

The task of keyhole (unobtrusive) plan recognition is central to adaptive game AI. "Tech trees" or "build trees" are the core of real-time strategy (RTS) game strategic (long term) planning. This paper presents a generic and simple Bayesian model for RTS build tree prediction from noisy observations, whose parameters are learned from replays (game logs). This unsupervised machine learning approach involves minimal work for the game developers as it leverages players' data (common in RTS). We applied it to StarCraft and showed that it yields high-quality and robust predictions that can feed an adaptive AI.