Characterization of differentially expressed genes using high-dimensional co-expression networks

arXiv.org Machine Learning

We present a technique to characterize differentially expressed genes in terms of their position in a high-dimensional co-expression network. Gaussian graphical models are used to construct representations of the co-expression network in such a way that redundancy and the propagation of spurious information along the network are avoided. The proposed inference procedure is based on the minimization of the Bayesian Information Criterion (BIC) in the class of decomposable graphical models. This class of models can represent complex relationships and has properties that allow effective inference in problems with a high degree of complexity (e.g. several thousands of genes) and a small number of observations (e.g. 10-100), as typically occurs in high-throughput gene expression studies. Taking advantage of the internal structure of decomposable graphical models, we construct a compact representation of the co-expression network that makes it possible to identify the regions with a high concentration of differentially expressed genes. It is argued that differentially expressed genes located in highly interconnected regions of the co-expression network are less informative than differentially expressed genes located in less interconnected regions. Based on that idea, a measure of uncertainty that resembles the notion of relative entropy is proposed. Our methods are illustrated with three publicly available data sets on microarray experiments (the largest involving more than 50,000 genes and 64 patients) and a short simulation study.


Learning Planar Ising Models

arXiv.org Artificial Intelligence

Inference and learning of graphical models are both well-studied problems in statistics and machine learning that have found many applications in science and engineering. However, exact inference is intractable in general graphical models, which suggests the problem of seeking the best approximation to a collection of random variables within some tractable family of graphical models. In this paper, we focus our attention on the class of planar Ising models, for which inference is tractable using techniques of statistical physics [Kac and Ward; Kasteleyn]. Based on these techniques and recent methods for planarity testing and planar embedding [Chrobak and Payne], we propose a simple greedy algorithm for learning the best planar Ising model to approximate an arbitrary collection of binary random variables (possibly from sample data). Given the set of all pairwise correlations among variables, we select a planar graph and optimal planar Ising model defined on this graph to best approximate that set of correlations. We demonstrate our method in some simulations and for the application of modeling senate voting records.


The Inverse Task of the Reflexive Game Theory: Theoretical Matters, Practical Applications and Relationship with Other Issues

arXiv.org Artificial Intelligence

The Reflexive Game Theory (RGT) was recently proposed by Vladimir Lefebvre to model the behavior of individuals in groups. The goal of this study is to introduce the Inverse task. We consider methods of solution together with practical applications. We present a brief overview of the RGT to make the problem easier to understand. We also develop a schematic representation of the RGT inference algorithms as a basis for software and hardware solutions of the RGT tasks. We propose a unified hierarchy of schemas to represent humans and robots. This hierarchy serves as a unified framework for solving the entire spectrum of the RGT tasks. We conclude by illustrating how this framework can be applied to modeling mixed groups of humans and robots. Altogether, this provides an exhaustive solution of the Inverse task and clearly illustrates its role and relationships with other issues considered in the RGT.


Convex Analysis and Optimization with Submodular Functions: a Tutorial

arXiv.org Machine Learning

Set-functions appear in many areas of computer science and applied mathematics, such as machine learning, computer vision, operations research or electrical networks. Among these set-functions, submodular functions play an important role, similar to convex functions on vector spaces. In this tutorial, the theory of submodular functions is presented, in a self-contained way, with all results shown from first principles. A good knowledge of convex analysis is assumed.


Graphical Models as Block-Tree Graphs

arXiv.org Machine Learning

We introduce block-tree graphs as a framework for deriving efficient algorithms on graphical models. We define a block-tree graph as a tree-structured graph in which each node is a cluster of nodes and the clusters are disjoint. This differs from junction-trees, where two clusters connected by an edge always have at least one common node. When compared to junction-trees, we show that constructing block-tree graphs is faster, and finding optimal block-tree graphs has a much smaller search space. Applying our block-tree graph framework to graphical models, we show that, for some graphs, e.g., grid graphs, using block-tree graphs for inference is computationally more efficient than using junction-trees. For graphical models with boundary conditions, the block-tree graph framework transforms the boundary value problem into an initial value problem. For Gaussian graphical models, the block-tree graph framework leads to a linear state-space representation. Since exact inference in graphical models can be computationally intractable, we propose to use spanning block-trees to derive approximate inference algorithms. Experimental results show improved performance when using spanning block-trees rather than spanning trees for approximate estimation over Gaussian graphical models.
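To make the idea of disjoint clusters arranged in a tree concrete, the sketch below groups the nodes of a small grid graph into breadth-first layers. This is not the construction algorithm of the paper; it only illustrates one simple way to obtain disjoint clusters, relying on the fact that breadth-first search only produces edges within a layer or between adjacent layers, so the layers form a chain (a special case of a tree). The helper names `bfs_layers` and `grid_graph` are hypothetical, and the graph is assumed to be connected.

```python
from collections import deque

def bfs_layers(adjacency, root):
    """Group nodes by breadth-first distance from root (graph assumed connected)."""
    depth = {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adjacency[u]:
            if v not in depth:
                depth[v] = depth[u] + 1
                queue.append(v)
    layers = [[] for _ in range(max(depth.values()) + 1)]
    for node, d in depth.items():
        layers[d].append(node)
    return [sorted(layer) for layer in layers]

def grid_graph(n):
    """Adjacency lists of an n-by-n grid graph (hypothetical helper for the example)."""
    adjacency = {(i, j): [] for i in range(n) for j in range(n)}
    for (i, j) in adjacency:
        for di, dj in ((1, 0), (0, 1), (-1, 0), (0, -1)):
            if (i + di, j + dj) in adjacency:
                adjacency[(i, j)].append((i + di, j + dj))
    return adjacency

# Each layer is one cluster; clusters are disjoint and only adjacent layers share edges.
layers = bfs_layers(grid_graph(3), (0, 0))
```

On the 3-by-3 grid rooted at a corner, the clusters are the anti-diagonals of the grid, and no node appears in more than one cluster.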


Brain covariance selection: better individual functional connectivity models using population prior

arXiv.org Machine Learning

Spontaneous brain activity, as observed in functional neuroimaging, has been shown to display reproducible structure that expresses brain architecture and carries markers of brain pathologies. An important view of modern neuroscience is that such large-scale structure of coherent activity reflects modularity properties of brain connectivity graphs. However, to date, there has been no demonstration that the limited and noisy data available in spontaneous activity observations could be used to learn full-brain probabilistic models that generalize to new data. Learning such models entails two main challenges: i) modeling full brain connectivity is a difficult estimation problem that faces the curse of dimensionality and ii) variability between subjects, coupled with the variability of functional signals between experimental runs, makes the use of multiple datasets challenging. We describe subject-level brain functional connectivity structure as a multivariate Gaussian process and introduce a new strategy to estimate it from group data, by imposing a common structure on the graphical model in the population. We show that individual models learned from functional Magnetic Resonance Imaging (fMRI) data using this population prior generalize better to unseen data than models based on alternative regularization schemes. To our knowledge, this is the first report of a cross-validated model of spontaneous brain activity. Finally, we use the estimated graphical model to explore the large-scale characteristics of functional architecture and show for the first time that known cognitive networks appear as integrated communities of the functional connectivity graph.


Structured sparsity-inducing norms through submodular functions

arXiv.org Machine Learning

Sparse methods for supervised learning aim at finding good linear predictors from as few variables as possible, i.e., with small cardinality of their supports. This combinatorial selection problem is often turned into a convex optimization problem by replacing the cardinality function by its convex envelope (tightest convex lower bound), in this case the L1-norm. In this paper, we investigate more general set-functions than the cardinality, that may incorporate prior knowledge or structural constraints which are common in many applications: namely, we show that for nondecreasing submodular set-functions, the corresponding convex envelope can be obtained from its Lovász extension, a common tool in submodular analysis. This defines a family of polyhedral norms, for which we provide generic algorithmic tools (subgradients and proximal operators) and theoretical results (conditions for support recovery or high-dimensional inference). By selecting specific submodular functions, we can give a new interpretation to known norms, such as those based on rank-statistics or grouped norms with potentially overlapping groups; we also define new norms, in particular ones that can be used as non-factorial priors for supervised learning.
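As a small illustration of the link between set-functions and norms described above, the following sketch evaluates the Lovász extension with the standard greedy (sorting) formula and applies it to the absolute values of a vector. It assumes a set-function `F` that takes a frozenset of indices and satisfies `F(frozenset()) == 0`; the names `lovasz_extension` and `induced_norm` are hypothetical and not taken from the paper.

```python
import numpy as np

def lovasz_extension(F, w):
    """Greedy formula: sort entries in decreasing order and weight each entry
    by the marginal gain of adding its index to the growing prefix set."""
    w = np.asarray(w, dtype=float)
    order = np.argsort(-w)  # indices by decreasing value
    value, prefix, previous = 0.0, set(), 0.0
    for j in order:
        prefix.add(int(j))
        current = F(frozenset(prefix))
        value += w[j] * (current - previous)
        previous = current
    return value

def induced_norm(F, w):
    """Lovász extension evaluated at the absolute values of w."""
    return lovasz_extension(F, np.abs(w))

cardinality = lambda A: float(len(A))      # F(A) = |A|
nonempty = lambda A: 1.0 if A else 0.0     # F(A) = min(|A|, 1)

w = np.array([0.5, -2.0, 1.5])
l1_value = induced_norm(cardinality, w)    # coincides with the L1-norm of w
linf_value = induced_norm(nonempty, w)     # coincides with the max-norm of w
```

With the cardinality function every marginal gain equals one, which recovers the L1-norm mentioned in the abstract; with `min(|A|, 1)` only the largest entry has a nonzero gain, which gives the max-norm.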


Balanced Reduction of Nonlinear Control Systems in Reproducing Kernel Hilbert Space

arXiv.org Machine Learning

We introduce a novel data-driven order reduction method for nonlinear control systems, drawing on recent progress in machine learning and statistical dimensionality reduction. The method rests on the assumption that the nonlinear system behaves linearly when lifted into a high (or infinite) dimensional feature space where balanced truncation may be carried out implicitly. This leads to a nonlinear reduction map which can be combined with a representation of the system belonging to a reproducing kernel Hilbert space to give a closed, reduced order dynamical system which captures the essential input-output characteristics of the original model. Empirical simulations illustrating the approach are also provided.


Stability of Density-Based Clustering

arXiv.org Machine Learning

High density clusters can be characterized by the connected components of a level set $L(\lambda) = \{x:\ p(x)>\lambda\}$ of the underlying probability density function $p$ generating the data, at some appropriate level $\lambda\geq 0$. The complete hierarchical clustering can be characterized by a cluster tree ${\cal T}= \bigcup_{\lambda} L(\lambda)$. In this paper, we study the behavior of a density level set estimate $\widehat L(\lambda)$ and cluster tree estimate $\widehat{\cal{T}}$ based on a kernel density estimator with kernel bandwidth $h$. We define two notions of instability to measure the variability of $\widehat L(\lambda)$ and $\widehat{\cal{T}}$ as a function of $h$, and investigate the theoretical properties of these instability measures.
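The dependence of the estimated level set on the bandwidth can be seen in a one-dimensional toy example. The sketch below is not the instability measure studied in the paper; it only counts the connected components of the estimated level set $\widehat L(\lambda)$ on a grid, using a Gaussian kernel density estimator. The data, the grid, the level and the function names are all assumptions made for illustration.

```python
import numpy as np

def kde(grid, data, h):
    """Gaussian kernel density estimate with bandwidth h, evaluated on grid."""
    z = (grid[:, None] - data[None, :]) / h
    return np.exp(-0.5 * z**2).sum(axis=1) / (len(data) * h * np.sqrt(2 * np.pi))

def count_level_set_components(data, h, lam, grid):
    """Number of maximal runs of grid points where the estimated density exceeds lam."""
    mask = kde(grid, data, h) > lam
    starts = mask[1:] & ~mask[:-1]  # a run starts where the mask switches from False to True
    return int(mask[0]) + int(starts.sum())

# Two well-separated groups of points (deterministic toy data).
data = np.concatenate([np.linspace(-1, 1, 20), np.linspace(9, 11, 20)])
grid = np.linspace(-5, 15, 401)

n_small_h = count_level_set_components(data, h=0.5, lam=0.02, grid=grid)
n_large_h = count_level_set_components(data, h=8.0, lam=0.02, grid=grid)
```

For this toy data the small bandwidth keeps the two groups as separate components of the level set, while the large bandwidth merges them into one.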


Exact block-wise optimization in group lasso and sparse group lasso for linear regression

arXiv.org Machine Learning

The group lasso is a penalized regression method, used in regression problems where the covariates are partitioned into groups to promote sparsity at the group level. Existing methods for finding the group lasso estimator either use gradient projection methods to update the entire coefficient vector simultaneously at each step, or update one group of coefficients at a time using an inexact line search to approximate the optimal value for the group of coefficients when all other groups' coefficients are fixed. We present a new method of computation for the group lasso in the linear regression case, the Single Line Search (SLS) algorithm, which operates by computing the exact optimal value for each group (when all other coefficients are fixed) with one univariate line search. We perform simulations demonstrating that the SLS algorithm is often more efficient than existing computational methods. We also extend the SLS algorithm to the sparse group lasso problem via the Signed Single Line Search (SSLS) algorithm, and give theoretical results to support both algorithms.
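For context on what a block-wise update computes, the sketch below shows the group soft-thresholding operator, which is the proximal operator of the Euclidean norm. It gives the exact update for one group only in the special case where that group's columns of the design matrix are orthonormal; it is not the SLS algorithm of the paper, which addresses the general case with a univariate line search. The function name and the threshold parameter `t` are hypothetical.

```python
import numpy as np

def group_soft_threshold(z, t):
    """Shrink the whole vector z toward zero by t in Euclidean norm,
    setting it exactly to zero when its norm is at most t."""
    z = np.asarray(z, dtype=float)
    norm = np.linalg.norm(z)
    if norm <= t:
        return np.zeros_like(z)
    return (1.0 - t / norm) * z

kept = group_soft_threshold([3.0, 4.0], t=1.0)     # norm 5 shrinks to norm 4
dropped = group_soft_threshold([0.3, 0.4], t=1.0)  # norm 0.5 is below the threshold
```

The operator either keeps the direction of the group and shortens it, or removes the group entirely, which is how the penalty produces sparsity at the group level.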