Reinforcement Learning for the Soccer Dribbling Task

arXiv.org Machine Learning

We propose a reinforcement learning solution to the \emph{soccer dribbling task}, a scenario in which a soccer agent has to go from the beginning to the end of a region while keeping possession of the ball as an adversary attempts to take it. While the adversary uses a stationary policy, the dribbler learns the best action to take at each decision point. After defining meaningful variables to represent the state space, and high-level macro-actions to incorporate domain knowledge, we describe our application of the reinforcement learning algorithm \emph{Sarsa} with CMAC for function approximation. Our experiments show that, after the training period, the dribbler is able to accomplish its task against a strong adversary around 58% of the time.
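
To make the learning setup concrete, the following is a minimal Python sketch of Sarsa with CMAC-style tile coding; the state dimensionality, tiling resolution, action count and step size are illustrative assumptions, not the paper's actual configuration.

import numpy as np

# Minimal sketch of Sarsa with CMAC-style tile coding. The numbers below
# (tilings, tiles per dimension, actions, state dimension) are placeholders.
N_TILINGS, TILES_PER_DIM, N_ACTIONS, STATE_DIM = 8, 10, 4, 3
ALPHA, GAMMA, EPSILON = 0.1 / N_TILINGS, 0.99, 0.05

weights = np.zeros((N_ACTIONS, N_TILINGS * TILES_PER_DIM ** STATE_DIM))

def active_tiles(state):
    """One active tile index per tiling for a state in [0, 1]^STATE_DIM."""
    idx = []
    for t in range(N_TILINGS):
        offset = t / (N_TILINGS * TILES_PER_DIM)          # staggered tilings
        coords = np.minimum(((state + offset) * TILES_PER_DIM).astype(int),
                            TILES_PER_DIM - 1)
        flat = np.ravel_multi_index(coords, (TILES_PER_DIM,) * STATE_DIM)
        idx.append(t * TILES_PER_DIM ** STATE_DIM + flat)
    return idx

def q_value(state, action):
    return weights[action, active_tiles(state)].sum()

def epsilon_greedy(state):
    if np.random.rand() < EPSILON:
        return np.random.randint(N_ACTIONS)
    return int(np.argmax([q_value(state, a) for a in range(N_ACTIONS)]))

def sarsa_update(s, a, reward, s_next, a_next, done):
    """Standard on-policy Sarsa update on the tile-coded linear Q-function."""
    target = reward if done else reward + GAMMA * q_value(s_next, a_next)
    weights[a, active_tiles(s)] += ALPHA * (target - q_value(s, a))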


Joint and individual variation explained (JIVE) for integrated analysis of multiple data types

arXiv.org Machine Learning

Research in several fields now requires the analysis of data sets in which multiple high-dimensional types of data are available for a common set of objects. In particular, The Cancer Genome Atlas (TCGA) includes data from several diverse genomic technologies on the same cancerous tumor samples. In this paper we introduce Joint and Individual Variation Explained (JIVE), a general decomposition of variation for the integrated analysis of such data sets. The decomposition consists of three terms: a low-rank approximation capturing joint variation across data types, low-rank approximations for structured variation individual to each data type, and residual noise. JIVE quantifies the amount of joint variation between data types, reduces the dimensionality of the data and provides new directions for the visual exploration of joint and individual structures. The proposed method represents an extension of Principal Component Analysis and has clear advantages over popular two-block methods such as Canonical Correlation Analysis and Partial Least Squares. A JIVE analysis of gene expression and miRNA data on Glioblastoma Multiforme tumor samples reveals gene-miRNA associations and provides better characterization of tumor types. Data and software are available at https://genome.unc.edu/jive/
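
The decomposition can be read as an alternating low-rank estimation; below is a schematic numpy sketch under simplifying assumptions (illustrative ranks and iteration count, and without JIVE's orthogonality constraint between joint and individual row spaces).

import numpy as np

def low_rank(X, r):
    """Best rank-r approximation of X via truncated SVD."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vt[:r]

def jive_sketch(blocks, joint_rank, indiv_ranks, n_iter=50):
    """Schematic JIVE-style split of each block X_i (features x samples, on a
    common sample set) into joint + individual + residual terms. The ranks and
    iteration count are illustrative assumptions."""
    joints = [np.zeros_like(X) for X in blocks]
    indivs = [np.zeros_like(X) for X in blocks]
    for _ in range(n_iter):
        # joint structure: low-rank fit to the stacked, individual-removed data
        stacked = np.vstack([X - I for X, I in zip(blocks, indivs)])
        J = low_rank(stacked, joint_rank)
        rows = np.cumsum([0] + [X.shape[0] for X in blocks])
        joints = [J[rows[i]:rows[i + 1]] for i in range(len(blocks))]
        # individual structure: low-rank fit to each block's joint-removed data
        indivs = [low_rank(X - Ji, r)
                  for X, Ji, r in zip(blocks, joints, indiv_ranks)]
    residuals = [X - Ji - Ii for X, Ji, Ii in zip(blocks, joints, indivs)]
    return joints, indivs, residuals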


Integrating Prior Knowledge Into Prognostic Biomarker Discovery based on Network Structure

arXiv.org Machine Learning

Background: Predictive, stable and interpretable gene signatures are generally seen as an important step towards better personalized medicine. During the last decade various methods have been proposed for that purpose. However, one important obstacle to making gene signatures a standard tool in the clinic is their typically low reproducibility, combined with the difficulty of achieving a clear biological interpretation. For this reason, recent years have seen a growing interest in approaches that integrate information from molecular interaction networks. Results: We propose a novel algorithm, called FrSVM, which integrates protein-protein interaction network information into gene selection for prognostic biomarker discovery. Our method is a simple filter-based approach, which focuses on central genes with large differences in their expression. Compared to several other competing methods, our algorithm shows significantly better prediction performance and higher signature stability. Moreover, the obtained gene lists are highly enriched with known disease genes and drug targets. We extended our approach further by integrating information on candidate disease genes and targets of disease-associated Transcription Factors (TFs).
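
One plausible reading of such a filter, sketched below in Python, scores genes by a PageRank-style random walk over the PPI network whose restart distribution is biased toward strongly differentially expressed genes; the damping factor, combination rule and cutoff are illustrative assumptions, not the paper's specification.

import numpy as np

def gene_scores(adj, diff_expr, d=0.85, n_iter=100):
    """Network-based filter sketch: PageRank-style centrality over the PPI
    adjacency matrix `adj`, with the restart distribution biased toward genes
    with large absolute differential expression. Damping factor and the
    combination rule are illustrative assumptions."""
    prior = np.abs(diff_expr) / np.abs(diff_expr).sum()   # restart distribution
    deg = adj.sum(axis=0)
    P = adj / np.where(deg > 0, deg, 1)                   # column-stochastic walk
    r = prior.copy()
    for _ in range(n_iter):
        r = d * (P @ r) + (1 - d) * prior
    return r

def select_signature(adj, diff_expr, k=50):
    """Top-k genes form the signature; any standard SVM can then be fit."""
    return np.argsort(gene_scores(adj, diff_expr))[::-1][:k]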


Some results on a $\chi$-divergence, an extended Fisher information and generalized Cram\'er-Rao inequalities

arXiv.org Machine Learning

We propose a modified $\chi^{\beta}$-divergence, give some of its properties, and show that this leads to the definition of a generalized Fisher information. We give generalized Cram\'er-Rao inequalities involving this Fisher information, an extension of the Fisher information matrix, and arbitrary norms and powers of the estimation error. In the case of a location parameter, we obtain new characterizations of the generalized $q$-Gaussians, for instance as the distribution with a given moment that minimizes the generalized Fisher information. Finally, we indicate how the generalized Fisher information can lead to new uncertainty relations.
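
For reference, the classical objects that the $\chi^{\beta}$-divergence construction generalizes (the generalized forms themselves are given in the paper; only the standard baseline they reduce to is reproduced here):
\[
\chi^{2}(f\|g) = \int \frac{\bigl(f(x)-g(x)\bigr)^{2}}{g(x)}\,\mathrm{d}x,
\qquad
I(\theta) = \mathbb{E}\!\left[\left(\frac{\partial}{\partial\theta}\ln f(X;\theta)\right)^{2}\right],
\]
\[
\operatorname{Var}\bigl(\hat{\theta}\bigr) \ge \frac{1}{I(\theta)}
\quad \text{for unbiased } \hat{\theta}.
\]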


On some interrelations of generalized $q$-entropies and a generalized Fisher information, including a Cram\'er-Rao inequality

arXiv.org Machine Learning

In this communication, we describe some interrelations between generalized $q$-entropies and a generalized version of Fisher information. In information theory, the de Bruijn identity links the Fisher information and the derivative of the entropy. We show that this identity can be extended to generalized versions of entropy and Fisher information. More precisely, a generalized Fisher information arises naturally in the expression for the derivative of the Tsallis entropy. This generalized Fisher information also appears as a special case of a generalized Fisher information for estimation problems. Indeed, we derive here a new Cram\'er-Rao inequality for the estimation of a parameter, which involves a generalized form of Fisher information. This generalized Fisher information reduces to the standard Fisher information as a particular case. In the case of a translation parameter, the general Cram\'er-Rao inequality leads to an inequality for distributions which is saturated by generalized $q$-Gaussian distributions. These generalized $q$-Gaussians are important in several areas of physics and mathematics. They are known to maximize the $q$-entropies subject to a moment constraint. The Cram\'er-Rao inequality shows that the generalized $q$-Gaussians also minimize the generalized Fisher information among distributions with a fixed moment. Similarly, the generalized $q$-Gaussians also minimize the generalized Fisher information among distributions with a given $q$-entropy.
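
For orientation, the classical de Bruijn identity being extended, and the Tsallis $q$-entropy whose derivative the generalized identity involves (the generalized statements themselves are in the paper): for $Y_t = X + \sqrt{t}\,Z$ with $Z \sim \mathcal{N}(0,1)$ independent of $X$,
\[
\frac{\mathrm{d}}{\mathrm{d}t}\, h(Y_t) = \frac{1}{2}\, J(Y_t),
\qquad
J(Y) = \int \frac{\bigl(f'(y)\bigr)^{2}}{f(y)}\,\mathrm{d}y,
\qquad
S_q(f) = \frac{1}{q-1}\left(1 - \int f(y)^{q}\,\mathrm{d}y\right), \quad q \neq 1.
\]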


Characterizing A Database of Sequential Behaviors with Latent Dirichlet Hidden Markov Models

arXiv.org Machine Learning

This paper proposes a generative model, the latent Dirichlet hidden Markov model (LDHMM), for characterizing a database of sequential behaviors (sequences). LDHMMs posit that each sequence is generated by an underlying Markov chain process, which is controlled by the corresponding parameters (i.e., the initial state vector, the transition matrix and the emission matrix). These sequence-level latent parameters are modeled as latent Dirichlet random variables and parameterized by a set of deterministic database-level hyper-parameters. In this way, the model captures two levels of structure: the database level through deterministic hyper-parameters and the sequence level through latent parameters. To learn the deterministic hyper-parameters and approximate posteriors over the parameters of LDHMMs, we propose an iterative algorithm under the variational EM framework, consisting of E and M steps. We examine two different schemes for the framework, the fully-factorized and partially-factorized forms, based on different independence assumptions. We present empirical results of behavior modeling and sequence classification on three real-world data sets, and compare them to other related models. The experimental results show that the proposed LDHMMs produce better generalization performance in terms of log-likelihood and deliver competitive results on the sequence classification problem.
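
The two-level generative process reads naturally as a sampler; here is a minimal Python sketch, with the number of hidden states, vocabulary size, sequence length and (uniform) hyper-parameter values as placeholders.

import numpy as np

rng = np.random.default_rng(0)

def sample_sequence(alpha_pi, alpha_A, alpha_B, length):
    """Generative sketch of an LDHMM: database-level Dirichlet hyper-parameters
    (alpha_*) generate per-sequence HMM parameters, which in turn generate the
    observed sequence. Shapes and lengths are illustrative."""
    K, V = alpha_A.shape[0], alpha_B.shape[1]
    pi = rng.dirichlet(alpha_pi)                                  # initial-state vector
    A = np.stack([rng.dirichlet(alpha_A[k]) for k in range(K)])   # transition matrix
    B = np.stack([rng.dirichlet(alpha_B[k]) for k in range(K)])   # emission matrix
    states, obs = [rng.choice(K, p=pi)], []
    obs.append(rng.choice(V, p=B[states[0]]))
    for _ in range(length - 1):
        states.append(rng.choice(K, p=A[states[-1]]))
        obs.append(rng.choice(V, p=B[states[-1]]))
    return states, obs

# e.g. 5 hidden states, 20 symbols; uniform hyper-parameters as placeholders
K, V = 5, 20
states, obs = sample_sequence(np.ones(K), np.ones((K, K)), np.ones((K, V)), 30)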


Adapting the Stochastic Block Model to Edge-Weighted Networks

arXiv.org Machine Learning

We generalize the stochastic block model to the important case in which edges are annotated with weights drawn from an exponential family distribution. This generalization introduces several technical difficulties for model estimation, which we solve using a Bayesian approach. We introduce a variational algorithm that efficiently approximates the model's posterior distribution for dense graphs. In specific numerical experiments on edge-weighted networks, this weighted stochastic block model outperforms the common approach of first applying a single threshold to all weights and then applying the classic stochastic block model, which can obscure latent block structure in networks. This model will enable the recovery of latent structure in a broader range of network data than was previously possible.
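
A generative sketch in Python, using normally distributed edge weights as one exponential-family choice (the model admits others); the block probabilities, means and variances below are illustrative.

import numpy as np

rng = np.random.default_rng(0)

def sample_weighted_sbm(n, block_probs, mu, sigma):
    """Generative sketch of a weighted SBM: sample latent block assignments,
    then for each present edge draw a weight from a normal distribution whose
    parameters depend on the endpoint blocks. All parameters are placeholders."""
    K = mu.shape[0]
    z = rng.integers(K, size=n)                       # latent block assignments
    W = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < block_probs[z[i], z[j]]:
                W[i, j] = W[j, i] = rng.normal(mu[z[i], z[j]], sigma[z[i], z[j]])
    return z, W

# two blocks: dense/heavy edges within blocks, sparse/light edges between them
probs = np.array([[0.9, 0.2], [0.2, 0.9]])
mu    = np.array([[2.0, 0.5], [0.5, 2.0]])
sigma = np.ones((2, 2))
z, W = sample_weighted_sbm(60, probs, mu, sigma)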


Learning Topic Models and Latent Bayesian Networks Under Expansion Constraints

arXiv.org Machine Learning

Unsupervised estimation of latent variable models is a fundamental problem central to numerous applications of machine learning and statistics. This work presents a principled approach for estimating broad classes of such models, including probabilistic topic models and latent linear Bayesian networks, using only second-order observed moments. The sufficient conditions for identifiability of these models are primarily based on weak expansion constraints on the topic-word matrix, for topic models, and on the directed acyclic graph, for Bayesian networks. Because no assumptions are made on the distribution among the latent variables, the approach can handle arbitrary correlations among the topics or latent factors. In addition, a tractable learning method via $\ell_1$ optimization is proposed and studied in numerical experiments.
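
Since the method needs only second-order observed moments, the one statistic to compute from a corpus is the empirical word co-occurrence moment; a numpy sketch of its unbiased estimate follows (the subsequent $\ell_1$-based recovery step is not shown, and the normalization choices are illustrative).

import numpy as np

def second_order_moment(doc_word_counts):
    """Empirical second-order moment of word co-occurrences from a
    (documents x vocabulary) count matrix, using unbiased pair counts:
    off-diagonal c_i * c_j, diagonal c_i * (c_i - 1)."""
    X = doc_word_counts.astype(float)
    n_docs, V = X.shape
    M2 = np.zeros((V, V))
    for c in X:
        total = c.sum()
        if total < 2:
            continue                       # no word pairs in this document
        pairs = np.outer(c, c) - np.diag(c)
        M2 += pairs / (total * (total - 1))
    return M2 / n_docs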


Density-sensitive semisupervised inference

arXiv.org Machine Learning

Semisupervised methods are techniques for using labeled data $(X_1,Y_1),\ldots,(X_n,Y_n)$ together with unlabeled data $X_{n+1},\ldots,X_N$ to make predictions. These methods invoke assumptions that link the marginal distribution $P_X$ of $X$ to the regression function $f(x)$. For example, it is common to assume that $f$ is very smooth over high density regions of $P_X$. Many of these methods are ad hoc and have been shown to work in specific examples but lack a theoretical foundation. We provide a minimax framework for analyzing semisupervised methods. In particular, we study methods based on metrics that are sensitive to the distribution $P_X$. Our model includes a parameter $\alpha$ that controls the strength of the semisupervised assumption. We then use the data to adapt to $\alpha$.
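
One concrete construction in the spirit of a density-sensitive metric, sketched below in Python: shortest-path distances over a k-nearest-neighbour graph whose edge lengths are inflated in low-density regions by an inverse-density factor raised to $\alpha$, so that $\alpha = 0$ recovers the plain graph metric; every detail of the construction is an illustrative assumption, not the paper's definition.

import numpy as np

def density_sensitive_distances(X, alpha=1.0, k=10, eps=1e-12):
    """Pairwise distances on a k-NN graph with edge lengths penalized in
    low-density regions; alpha controls the strength of the density sensitivity
    (alpha = 0 gives the ordinary graph metric)."""
    n = X.shape[0]
    D = np.linalg.norm(X[:, None] - X[None, :], axis=-1)   # pairwise Euclidean
    # crude density estimate: inverse mean distance to the k nearest neighbours
    knn = np.sort(D, axis=1)[:, 1:k + 1]
    density = 1.0 / (knn.mean(axis=1) + eps)
    # edge length between i and j penalized by the lower of the two densities
    pair_density = np.minimum(density[:, None], density[None, :])
    G = np.where(D <= np.sort(D, axis=1)[:, [k]], D / pair_density**alpha, np.inf)
    G = np.minimum(G, G.T)                                 # symmetrize the graph
    for m in range(n):                                     # Floyd-Warshall
        G = np.minimum(G, G[:, [m]] + G[[m], :])
    np.fill_diagonal(G, 0.0)
    return G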


A Primal Condition for Approachability with Partial Monitoring

arXiv.org Machine Learning

In approachability with full monitoring there are two types of conditions that are known to be equivalent for convex sets: a primal and a dual condition. The primal one is of the form: a set C is approachable if and only if all containing half-spaces are approachable in the one-shot game; while the dual one is of the form: a convex set C is approachable if and only if it intersects all payoff sets of a certain form. We consider approachability in games with partial monitoring. In previous works (Perchet 2011; Mannor et al. 2011) we provided a dual characterization of approachable convex sets; we also exhibited efficient strategies in the case where C is a polytope. In this paper we provide primal conditions on a convex set to be approachable with partial monitoring. They depend on a modified reward function and lead to approachability strategies, based on modified payoff functions, that proceed by projections similarly to Blackwell's (1956) strategy; this is in contrast with previously studied strategies in this context, which relied mostly on the signaling structure and aimed at accurately estimating the distributions of the received signals. Our results generalize classical results by Kohlberg (1975) (see also Mertens et al. 1994) and apply to games with arbitrary signaling structure as well as to arbitrary convex sets.
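
For contrast with the signal-estimation approach, here is a Python sketch of the Blackwell (1956) projection template that the proposed strategies follow (with modified payoff functions); every callable below is an assumed problem-specific oracle, not notation from the paper.

import numpy as np

def blackwell_strategy(T, payoff, opponent, C_project, halfspace_action, fallback):
    """Sketch of Blackwell's projection-based approachability strategy.
    Assumed oracles (placeholders for problem-specific computations):
      payoff(p, q)        -- expected vector payoff under mixed actions p, q
      opponent(t)         -- adversary's mixed action at stage t
      C_project(x)        -- Euclidean projection of x onto the convex set C
      halfspace_action(u) -- mixed action whose one-shot expected payoff lies in
                             the approachable half-space with outward normal u
      fallback()          -- arbitrary mixed action, used when already inside C
    """
    avg = None
    for t in range(1, T + 1):
        if avg is None or np.linalg.norm(avg - C_project(avg)) < 1e-12:
            p = fallback()                       # already in C: any action works
        else:
            u = avg - C_project(avg)             # outward normal at the projection
            p = halfspace_action(u / np.linalg.norm(u))
        x = payoff(p, opponent(t))
        avg = x if avg is None else avg + (x - avg) / t   # running average payoff
    return avg                                   # drifts toward C as t grows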