Bayesian Nonparametric Models on Decomposable Graphs

Neural Information Processing Systems

Over recent years Dirichlet processes and the associated Chinese restaurant process (CRP) have found many applications in clustering while the Indian buffet process (IBP) is increasingly used to describe latent feature models. In the clustering case, we associate to each data point a latent allocation variable. These latent variables can share the same value and this induces a partition of the data set. The CRP is a prior distribution on such partitions. In latent feature models, we associate to each data point a potentially infinite number of binary latent variables indicating the possession of some features and the IBP is a prior distribution on the associated infinite binary matrix. These prior distributions are attractive because they ensure exchangeability (over samples). We propose here extensions of these models to decomposable graphs. These models have appealing properties and can be easily learned using Monte Carlo techniques.
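As a point of reference for the partition prior described above, the following is a minimal sketch of sampling from the standard CRP, not the decomposable-graph extension the abstract proposes. The function name `sample_crp_partition` and the concentration parameter name `alpha` are illustrative assumptions.

```python
import random


def sample_crp_partition(n, alpha, rng=None):
    """Draw a partition of n data points from a standard CRP (alpha > 0)."""
    rng = rng or random.Random()
    tables = []       # tables[k] = number of points already in cluster k
    assignments = []  # assignments[i] = cluster index of data point i
    for _ in range(n):
        # An existing cluster k is chosen with weight tables[k];
        # a new cluster is opened with weight alpha.
        weights = tables + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(tables):
            tables.append(1)
        else:
            tables[k] += 1
        assignments.append(k)
    return assignments, tables


# Latent allocation variables for 20 data points (seed is arbitrary).
assignments, tables = sample_crp_partition(20, alpha=1.0, rng=random.Random(0))
```

Data points that share an allocation value fall into the same block of the partition, which is the sense in which the CRP acts as a prior on partitions.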


Nash Equilibria of Static Prediction Games

Neural Information Processing Systems

The standard assumption of identically distributed training and test data can be violated when an adversary can exercise some control over the generation of the test data. In a prediction game, a learner produces a predictive model while an adversary may alter the distribution of input data. We study single-shot prediction games in which the cost functions of learner and adversary are not necessarily antagonistic. We identify conditions under which the prediction game has a unique Nash equilibrium, and derive algorithms that will find the equilibrial prediction models. In a case study, we explore properties of Nash-equilibrial prediction models for email spam filtering empirically.


Augmenting Feature-driven fMRI Analyses: Semi-supervised learning and resting state activity

Neural Information Processing Systems

Resting state activity is brain activation that arises in the absence of any task, and is usually measured in awake subjects during prolonged fMRI scanning sessions where the only instruction given is to close the eyes and do nothing. It has been recognized in recent years that resting state activity is implicated in a wide variety of brain functions. While certain networks of brain areas have different levels of activation at rest and during a task, there is nevertheless significant similarity between activations in the two cases. This suggests that recordings of resting state activity can be used as a source of unlabeled data to augment discriminative regression techniques in a semi-supervised setting. We evaluate this setting empirically, yielding three main results: (i) regression tends to be improved by the use of Laplacian regularization even when no additional unlabeled data are available, (ii) resting state data may have a similar marginal distribution to that recorded during the execution of a visual processing task, reinforcing the hypothesis that these conditions have similar types of activation, and (iii) this source of information can be broadly exploited to improve the robustness of empirical inference in fMRI studies, an inherently data-poor domain.


Neurometric function analysis of population codes

Neural Information Processing Systems

The relative merits of different population coding schemes have mostly been analyzed in the framework of stimulus reconstruction using Fisher Information. Here, we consider the case of stimulus discrimination in a two-alternative forced choice paradigm and compute neurometric functions in terms of the minimal discrimination error and the Jensen-Shannon information to study neural population codes. We first explore the relationship between minimum discrimination error, Jensen-Shannon Information and Fisher Information and show that the discrimination framework is more informative about the coding accuracy than Fisher Information as it defines an error for any pair of possible stimuli. In particular, it includes Fisher Information as a special case. Second, we use the framework to study population codes of angular variables. Specifically, we assess the impact of different noise correlation structures on coding accuracy in long versus short decoding time windows. For long time windows, we use the common Gaussian noise approximation. To address the case of short time windows, we analyze the Ising model with identical noise correlation structure. In this way, we provide a new rigorous framework for assessing the functional consequences of noise correlation structures for the representational accuracy of neural population codes that is in particular applicable to short-time population coding.
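To make the two discrimination measures concrete, here is a small sketch under simplifying assumptions that are not taken from the abstract: the two stimuli are equally likely, and the population response is summarized by a discrete distribution for each stimulus (`p` and `q`). The function names are illustrative.

```python
import math


def min_discrimination_error(p, q):
    """Bayes error for telling apart two equally likely stimuli whose
    response distributions are p and q (lists of probabilities)."""
    return 0.5 * sum(min(a, b) for a, b in zip(p, q))


def jensen_shannon_information(p, q):
    """Jensen-Shannon information in bits: H(m) - (H(p) + H(q)) / 2,
    where m is the equal-weight mixture of p and q."""
    def entropy(dist):
        return -sum(x * math.log2(x) for x in dist if x > 0)

    mixture = [0.5 * (a + b) for a, b in zip(p, q)]
    return entropy(mixture) - 0.5 * entropy(p) - 0.5 * entropy(q)


# Two hypothetical response distributions over three response patterns.
p = [0.7, 0.2, 0.1]
q = [0.1, 0.2, 0.7]
error = min_discrimination_error(p, q)
js_bits = jensen_shannon_information(p, q)
```

At the extremes, identical distributions give an error of 0.5 and zero Jensen-Shannon information, while distributions with disjoint support give zero error and one bit.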


Data-driven calibration of linear estimators with minimal penalties

Neural Information Processing Systems

This paper tackles the problem of selecting among several linear estimators in non-parametric regression; this includes model selection for linear regression, the choice of a regularization parameter in kernel ridge regression or spline smoothing, and the choice of a kernel in multiple kernel learning. We propose a new algorithm which first consistently estimates the variance of the noise, based upon the concept of minimal penalty, which was previously introduced in the context of model selection. Then, plugging our variance estimate into Mallows' $C_L$ penalty is proved to lead to an algorithm satisfying an oracle inequality. Simulation experiments with kernel ridge regression and multiple kernel learning show that the proposed algorithm often significantly improves on existing calibration procedures such as 10-fold cross-validation or generalized cross-validation.
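The second step, selecting a linear estimator with Mallows' $C_L$, can be sketched as follows. This is only an illustration under stated assumptions: the noise variance `sigma2` is supplied by hand rather than estimated with the minimal-penalty step the abstract describes, and the candidate estimators are simple moving-average smoothers rather than kernel ridge regressors. All names and data values are hypothetical.

```python
def moving_average_matrix(n, half_width):
    """Smoothing matrix A of a centered moving average, so that the
    fitted values are A @ y (each row sums to 1)."""
    rows = []
    for i in range(n):
        lo, hi = max(0, i - half_width), min(n, i + half_width + 1)
        weight = 1.0 / (hi - lo)
        rows.append([weight if lo <= j < hi else 0.0 for j in range(n)])
    return rows


def mallows_cl(y, A, sigma2):
    """Mallows' C_L criterion: ||y - A y||^2 / n + 2 * sigma2 * tr(A) / n."""
    n = len(y)
    fitted = [sum(a * v for a, v in zip(row, y)) for row in A]
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    trace = sum(A[i][i] for i in range(n))
    return rss / n + 2.0 * sigma2 * trace / n


# Hypothetical data and an assumed noise variance.
y = [0.1, 0.9, 2.2, 2.8, 4.1, 5.0, 5.9, 7.2]
sigma2 = 0.05
scores = {h: mallows_cl(y, moving_average_matrix(len(y), h), sigma2)
          for h in (0, 1, 2, 3)}
best_half_width = min(scores, key=scores.get)
```

The trace term plays the role of the effective number of parameters: the identity smoother (`half_width=0`) has zero residual but pays the full penalty of `2 * sigma2`.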


Soft Goals Can Be Compiled Away

Journal of Artificial Intelligence Research

Soft goals extend the classical model of planning with a simple model of preferences. The best plans are then not the ones with least cost but the ones with maximum utility, where the utility of a plan is the sum of the utilities of the soft goals achieved minus the plan cost. Finding plans with high utility appears to involve two linked problems: choosing a subset of soft goals to achieve and finding a low-cost plan to achieve them. New search algorithms and heuristics have been developed for planning with soft goals, and a new track has been introduced in the International Planning Competition (IPC) to test their performance. In this note, we show however that these extensions are not needed: soft goals do not increase the expressive power of the basic model of planning with action costs, as they can easily be compiled away. We apply this compilation to the problems of the net-benefit track of the most recent IPC, and show that optimal and satisficing cost-based planners do better on the compiled problems than optimal and satisficing net-benefit planners on the original problems with explicit soft goals. Furthermore, we show that penalties, or negative preferences expressing conditions to avoid, can also be compiled away using a similar idea.
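The reason a compilation of this kind can preserve plan rankings is a simple accounting identity, sketched below. The sketch assumes one possible scheme, which the abstract does not spell out: every soft goal the plan leaves unachieved is charged its utility as an extra cost. The function names and the example problem are hypothetical.

```python
def net_benefit(plan_cost, achieved, utilities):
    """Utility of a plan: sum of achieved soft-goal utilities minus plan cost."""
    return sum(utilities[g] for g in achieved) - plan_cost


def compiled_cost(plan_cost, achieved, utilities):
    """Cost in an assumed cost-only reformulation: the plan cost plus the
    utility of every soft goal that the plan forgoes."""
    forgone = set(utilities) - set(achieved)
    return plan_cost + sum(utilities[g] for g in forgone)


# Hypothetical soft goals and candidate plans: (plan cost, soft goals achieved).
utilities = {"g1": 10, "g2": 4, "g3": 7}
plans = {
    "do_nothing": (0, set()),
    "cheap": (3, {"g2"}),
    "thorough": (12, {"g1", "g3"}),
    "everything": (20, {"g1", "g2", "g3"}),
}

best_by_utility = max(plans, key=lambda name: net_benefit(*plans[name], utilities))
best_by_cost = min(plans, key=lambda name: compiled_cost(*plans[name], utilities))
```

For every plan, the net benefit and the compiled cost sum to the total utility of all soft goals, a constant, so maximizing one is the same as minimizing the other.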


The Computational Structure of Spike Trains

arXiv.org Machine Learning

Neurons perform computations, and convey the results of those computations through the statistical structure of their output spike trains. Here we present a practical method, grounded in the information-theoretic analysis of prediction, for inferring a minimal representation of that structure and for characterizing its complexity. Starting from spike trains, our approach finds their causal state models (CSMs), the minimal hidden Markov models or stochastic automata capable of generating statistically identical time series. We then use these CSMs to objectively quantify both the generalizable structure and the idiosyncratic randomness of the spike train. Specifically, we show that the expected algorithmic information content (the information needed to describe the spike train exactly) can be split into three parts describing (1) the time-invariant structure (complexity) of the minimal spike-generating process, which describes the spike train statistically; (2) the randomness (internal entropy rate) of the minimal spike-generating process; and (3) a residual pure noise term not described by the minimal spike-generating process. We use CSMs to approximate each of these quantities. The CSMs are inferred nonparametrically from the data, making only mild regularity assumptions, via the causal state splitting reconstruction algorithm. The methods presented here complement more traditional spike train analyses by describing not only spiking probability and spike train entropy, but also the complexity of a spike train's structure. We demonstrate our approach using both simulated spike trains and experimental data recorded in rat barrel cortex during vibrissa stimulation.


Oriented Straight Line Segment Algebra: Qualitative Spatial Reasoning about Oriented Objects

arXiv.org Artificial Intelligence

Nearly 15 years ago, a set of qualitative spatial relations between oriented straight line segments (dipoles) was suggested by Schlieder. This work received substantial interest amongst the qualitative spatial reasoning community. However, it turned out to be difficult to establish a sound constraint calculus based on these relations. In this paper, we present the results of a new investigation into dipole constraint calculi which uses algebraic methods to derive sound results on the composition of relations and other properties of dipole calculi. Our results are based on a condensed semantics of the dipole relations. In contrast to the points that are normally used, dipoles are extended and have an intrinsic direction. Both features are important properties of natural objects. This allows for a straightforward representation of prototypical reasoning tasks for spatial agents. As an example, we show how to generate survey knowledge from local observations in a street network. The example illustrates the fast constraint-based reasoning capabilities of the dipole calculus. We integrate our results into two reasoning tools which are publicly available.


A general approach to belief change in answer set programming

arXiv.org Artificial Intelligence

We address the problem of belief change in (nonmonotonic) logic programming under answer set semantics. Unlike previous approaches to belief change in logic programming, our formal techniques are analogous to those of distance-based belief revision in propositional logic. In developing our results, we build upon the model theory of logic programs furnished by SE models. Since SE models provide a formal, monotonic characterisation of logic programs, we can adapt techniques from the area of belief revision to belief change in logic programs. We introduce methods for revising and merging logic programs, respectively. For the former, we study both subset-based revision as well as cardinality-based revision, and we show that they satisfy the majority of the AGM postulates for revision. For merging, we consider operators following arbitration merging and IC merging, respectively. We also present encodings for computing the revision as well as the merging of logic programs within the same logic programming framework, giving rise to a direct implementation of our approach in terms of off-the-shelf answer set solvers. These encodings reflect in turn the fact that our change operators do not increase the complexity of the base formalism.


A conversion between utility and information

arXiv.org Artificial Intelligence

Rewards typically express desirabilities or preferences over a set of alternatives. Here we propose that rewards can be defined for any probability distribution based on three desiderata, namely that rewards should be real-valued, additive and order-preserving, where the latter implies that more probable events should also be more desirable. Our main result states that rewards are then uniquely determined by the negative information content. To analyze stochastic processes, we define the utility of a realization as its reward rate. Under this interpretation, we show that the expected utility of a stochastic process is its negative entropy rate. Furthermore, we apply our results to analyze agent-environment interactions. We show that the expected utility that will actually be achieved by the agent is given by the negative cross-entropy from the input-output (I/O) distribution of the coupled interaction system and the agent's I/O distribution. Thus, our results allow for an information-theoretic interpretation of the notion of utility and the characterization of agent-environment interactions in terms of entropy dynamics.
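A small numerical sketch of these statements for finite distributions follows. It assumes base-2 logarithms and a unit scale factor, choices the abstract does not fix, and the function names are illustrative.

```python
import math


def reward(prob):
    """Reward of an event as its negative information content (in bits)."""
    return math.log2(prob)


def expected_utility(p, q=None):
    """Expected reward when outcomes follow p and rewards are assigned by q.

    With q omitted this is the negative entropy of p; otherwise it is the
    negative cross-entropy between p and q. Requires q[i] > 0 wherever
    p[i] > 0.
    """
    q = p if q is None else q
    return sum(pi * math.log2(qi) for pi, qi in zip(p, q) if pi > 0)


# Additive: the reward of two independent events is the sum of their rewards.
joint = reward(0.5 * 0.25)
separate = reward(0.5) + reward(0.25)

# Order-preserving: the more probable event gets the larger reward.
more_probable_is_better = reward(0.5) > reward(0.25)

# Negative entropy of a uniform distribution over four outcomes.
uniform_utility = expected_utility([0.25] * 4)

# An agent whose model q differs from the true distribution p achieves less.
p = [0.5, 0.5]
q = [0.25, 0.75]
achieved = expected_utility(p, q)
```

The gap between `expected_utility(p)` and `expected_utility(p, q)` is never negative, which matches the usual relation between entropy and cross-entropy.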