
Random Projections for $k$-means Clustering

arXiv.org Artificial Intelligence

This paper discusses dimensionality reduction for $k$-means clustering. We prove that any set of $n$ points in $d$ dimensions (rows of a matrix $A \in \mathbb{R}^{n \times d}$) can be projected into $t = \Omega(k / \varepsilon^2)$ dimensions, for any $\varepsilon \in (0,1/3)$, in $O(n d \lceil \varepsilon^{-2} k / \log(d) \rceil)$ time, such that with constant probability the optimal $k$-partition of the point set is preserved within a factor of $2+\varepsilon$. The projection is done by post-multiplying $A$ with a $d \times t$ random matrix $R$ whose entries are $+1/\sqrt{t}$ or $-1/\sqrt{t}$ with equal probability. A numerical implementation of our technique and experiments on a large dataset of face images verify the speed and accuracy of our theoretical results.
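A minimal NumPy sketch of the projection step described above. The target dimension $t = \lceil k/\varepsilon^2 \rceil$ hides the constant inside the $\Omega(\cdot)$ bound, so treat it as an illustrative choice rather than the paper's exact setting.

```python
import numpy as np

def random_sign_projection(A, k, eps, seed=None):
    """Project the rows of A (n x d) down to t ~ k / eps^2 dimensions."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    t = int(np.ceil(k / eps**2))                  # illustrative choice of t
    # R has i.i.d. entries +1/sqrt(t) or -1/sqrt(t) with equal probability.
    R = rng.choice([-1.0, 1.0], size=(d, t)) / np.sqrt(t)
    return A @ R                                  # n x t sketch; cluster this

# Usage: run k-means on the sketch instead of the raw data, e.g.
# B = random_sign_projection(A, k=10, eps=0.2)
```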


Should one compute the Temporal Difference fix point or minimize the Bellman Residual? The unified oblique projection view

arXiv.org Artificial Intelligence

We investigate projection methods for evaluating a linear approximation of the value function of a policy in a Markov Decision Process. We consider two popular approaches: the one-step Temporal Difference fixed-point computation (TD(0)) and Bellman Residual (BR) minimization. We describe examples where each method outperforms the other. We highlight a simple relation between the objective functions they minimize, and show that while BR enjoys a performance guarantee, TD(0) does not in general. We then propose a unified view in terms of oblique projections of the Bellman equation, which substantially simplifies and extends the characterization of (Schoknecht, 2002) and the recent analysis of (Yu & Bertsekas, 2008). Finally, we describe simulations suggesting that although the TD(0) solution is usually slightly better than the BR solution, its inherent numerical instability makes it perform very poorly in some cases, and thus worse on average.
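A minimal sketch contrasting the two solutions for a small, fully known MDP with linear features. The uniform state weighting and the variable names (P, r, gamma, Phi) are illustrative assumptions; the paper's analysis covers more general weightings.

```python
import numpy as np

def td0_and_br_weights(P, r, gamma, Phi):
    """P: (S, S) transition matrix of the policy; r: (S,) rewards;
    Phi: (S, k) feature matrix. Returns (w_td, w_br)."""
    M = Phi - gamma * P @ Phi                       # (I - gamma P) on features
    # TD(0) fixed point: Phi^T (Phi w - r - gamma P Phi w) = 0.
    # Phi.T @ M can be ill-conditioned or singular: this is the numerical
    # instability of TD(0) mentioned above.
    w_td = np.linalg.solve(Phi.T @ M, Phi.T @ r)
    # BR minimization: ordinary least squares on || M w - r ||^2.
    w_br, *_ = np.linalg.lstsq(M, r, rcond=None)
    return w_td, w_br
```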


Supervised Random Walks: Predicting and Recommending Links in Social Networks

arXiv.org Artificial Intelligence

Predicting the occurrence of links is a fundamental problem in networks. In the link prediction problem we are given a snapshot of a network and would like to infer which interactions among existing members are likely to occur in the near future, or which existing interactions we are missing. Although this problem has been extensively studied, the challenge of effectively combining information from the network structure with rich node and edge attribute data remains largely open. We develop an algorithm based on Supervised Random Walks that naturally combines information from the network structure with node- and edge-level attributes. We achieve this by using these attributes to guide a random walk on the graph. We formulate a supervised learning task where the goal is to learn a function that assigns strengths to edges in the network such that a random walker is more likely to visit the nodes to which new links will be created in the future. We develop an efficient training algorithm to learn the edge strength estimation function directly. Our experiments on the Facebook social graph and on large collaboration networks show that our approach outperforms state-of-the-art unsupervised approaches as well as approaches based on feature extraction.
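A hedged sketch of the scoring side of this idea (not the training loop): edge strengths come from a logistic function of edge features, and a random walk with restart from the source node ranks candidate targets. The adjacency/feature layout, the logistic strength function, and the restart probability are illustrative assumptions.

```python
import numpy as np

def rank_by_supervised_walk(Adj, F, w, source, alpha=0.15, iters=100):
    """Adj: (n, n) 0/1 adjacency; F: (n, n, m) edge features; w: (m,) weights."""
    strength = Adj / (1.0 + np.exp(-(F @ w)))     # a_uv = Adj_uv * sigmoid(x_uv . w)
    row = strength.sum(axis=1, keepdims=True)
    # Row-normalize where possible; edge-less rows keep zero transition mass.
    trans = np.divide(strength, row, out=np.zeros_like(strength), where=row > 0)
    p = np.zeros(Adj.shape[0]); p[source] = 1.0
    restart = p.copy()
    for _ in range(iters):                        # random walk with restart
        p = (1.0 - alpha) * trans.T @ p + alpha * restart
    p[source] = 0.0                               # never recommend the source itself
    return np.argsort(-p)                         # candidates ranked by visit probability
```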


Artificial Hormone Reaction Networks: Towards Higher Evolvability in Evolutionary Multi-Modular Robotics

arXiv.org Artificial Intelligence

The semi-automatic or automatic synthesis of robot controller software is both desirable and challenging. The synthesis of rather simple behaviors, such as collision avoidance, by artificial evolution has been demonstrated many times. However, the difficulty of this synthesis increases sharply with the complexity of the task the robot should perform. We try to tackle this problem of complexity with Artificial Homeostatic Hormone Systems (AHHS), which provide both intrinsic homeostatic processes and (transient) intrinsic variant behavior. Using AHHS minimizes the need for pre-defined controller topologies or information about the field of application. We investigate how the basic design of the controller and the size of the hormone network affect the overall performance of the artificial evolution (i.e., evolvability). This is done by comparing two variants of AHHS that show different effects when mutated. We evolve a controller for a robot built from five autonomous, cooperating modules. The desired behavior is a gait that achieves fast locomotion using the modules' main hinges.
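A rough sketch of one hormone-update step in the spirit of AHHS controllers: hormone concentrations are produced, decay, and diffuse between neighbouring modules in a chain of five. The rate constants, the chain topology, and the reflecting boundaries are illustrative assumptions, not the authors' exact dynamics.

```python
import numpy as np

def hormone_step(H, production, decay=0.1, diffusion=0.2):
    """H: (modules, hormones) concentrations; production: same shape,
    driven elsewhere by sensor rules. Returns the next concentrations."""
    left = np.roll(H, 1, axis=0);  left[0] = H[0]     # reflect at the first module
    right = np.roll(H, -1, axis=0); right[-1] = H[-1] # reflect at the last module
    neighbour_avg = 0.5 * (left + right)
    H = H + production - decay * H + diffusion * (neighbour_avg - H)
    return np.clip(H, 0.0, None)                      # concentrations stay >= 0

# Actuators (e.g., the main hinges) would then read selected hormone
# channels of their own module to set motor commands each control step.
```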


An Introduction to Conditional Random Fields

arXiv.org Machine Learning

Often we wish to predict a large number of variables that depend on each other as well as on other observed variables. Structured prediction methods are essentially a combination of classification and graphical modeling: they combine the ability of graphical models to compactly model multivariate data with the ability of classification methods to perform prediction using large sets of input features. This tutorial describes conditional random fields (CRFs), a popular probabilistic method for structured prediction. CRFs have seen wide application in natural language processing, computer vision, and bioinformatics. We describe methods for inference and parameter estimation in CRFs, including practical issues in implementing large-scale CRFs. We do not assume previous knowledge of graphical modeling, so this tutorial is intended to be useful to practitioners in a wide variety of fields.
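A minimal sketch of the core inference primitive in a linear-chain CRF: the forward recursion that computes the log partition function from per-position label scores and a transition score matrix. The score shapes are assumptions for illustration.

```python
import numpy as np
from scipy.special import logsumexp

def crf_log_partition(unary, trans):
    """unary: (T, K) per-position label scores; trans: (K, K) transition
    scores. Returns log Z, the normalizer over all K^T label sequences."""
    alpha = unary[0]                               # log-potentials at t = 0
    for t in range(1, unary.shape[0]):
        # alpha_t[j] = logsumexp_i(alpha_{t-1}[i] + trans[i, j]) + unary_t[j]
        alpha = logsumexp(alpha[:, None] + trans, axis=0) + unary[t]
    return logsumexp(alpha)

# The same recursion with max in place of logsumexp gives Viterbi decoding;
# differentiating log Z yields the marginals needed for parameter estimation.
```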


PADDLE: Proximal Algorithm for Dual Dictionaries LEarning

arXiv.org Machine Learning

Recently, considerable research effort has been devoted to the design of methods for learning overcomplete dictionaries for sparse coding from data. However, learned dictionaries require solving an optimization problem in order to code new data. To overcome this drawback, we propose an algorithm aimed at learning both a dictionary and its dual: a linear mapping that performs the coding directly. By leveraging proximal methods, our algorithm jointly minimizes the reconstruction error of the dictionary and the coding error of its dual; the sparsity of the representation is induced by an $\ell_1$-based penalty on its coefficients. Results obtained on synthetic data and real images show that the algorithm is capable of recovering the expected dictionaries. Furthermore, on a benchmark dataset, we show that image features obtained from the dual matrix yield state-of-the-art classification performance while being much less computationally intensive.
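A minimal sketch of two ingredients the abstract names: the $\ell_1$ proximal step (entrywise soft thresholding) used for the sparse codes, and a joint objective coupling a dictionary D with its dual C. The variable names and the weighting eta are assumptions; this is not the authors' full alternating scheme.

```python
import numpy as np

def soft_threshold(U, tau):
    """Proximal operator of tau * ||.||_1: entrywise shrinkage toward zero."""
    return np.sign(U) * np.maximum(np.abs(U) - tau, 0.0)

def paddle_objective(X, D, C, U, lam, eta):
    """X: (d, n) data; D: (d, p) dictionary; C: (p, d) dual map; U: (p, n) codes."""
    recon = np.linalg.norm(X - D @ U) ** 2      # reconstruction error of D
    coding = np.linalg.norm(U - C @ X) ** 2     # coding error of the dual C
    return recon + eta * coding + lam * np.abs(U).sum()

# At test time the dual makes coding a single matrix product, U_new = C @ X_new,
# instead of an optimization problem per sample.
```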


Optimizing real-time RDF data streams

arXiv.org Artificial Intelligence

The Resource Description Framework (RDF) provides a common data model for integrating "real-time" social and sensor data streams with the Web and with each other. While numerous protocols and data formats exist for exchanging dynamic RDF data, or RDF updates, these options should be examined carefully in order to enable a Semantic Web equivalent of the high-throughput, low-latency streams of typical Web 2.0, multimedia, and gaming applications. This paper contains a brief survey of RDF update formats and a high-level discussion of both TCP- and UDP-based transport protocols for updates. Its main contribution is the experimental evaluation of a UDP-based architecture that serves as a real-world example of a high-performance RDF streaming application in an Internet-scale distributed environment.
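A minimal sketch of the kind of UDP transport discussed above: each datagram carries one N-Triples statement of an RDF update. The host, port, and sample triple are placeholders; a real deployment would add batching, sequencing, and loss handling.

```python
import socket

def send_rdf_update(triple_line, host="127.0.0.1", port=9999):
    """Fire-and-forget one N-Triples line over UDP: low latency,
    but no delivery or ordering guarantee."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(triple_line.encode("utf-8"), (host, port))
    finally:
        sock.close()

# send_rdf_update('<http://example.org/s> <http://example.org/p> "42" .')
```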


The Inverse Task of the Reflexive Game Theory: Theoretical Matters, Practical Applications and Relationship with Other Issues

arXiv.org Artificial Intelligence

Reflexive Game Theory (RGT) was recently proposed by Vladimir Lefebvre to model the behavior of individuals in groups. The goal of this study is to introduce the Inverse task. We consider methods of solution together with practical applications. We present a brief overview of RGT for easy understanding of the problem. We also develop a schematic representation of the RGT inference algorithms to create the basis for software and hardware solutions of RGT tasks. We propose a unified hierarchy of schemas to represent humans and robots. This hierarchy is considered a unified framework for solving the entire spectrum of RGT tasks. We conclude by illustrating how this framework can be applied to the modeling of mixed groups of humans and robots. Altogether, this provides an exhaustive solution of the Inverse task and clearly illustrates its role and relationship with other issues considered in RGT.


Learning Planar Ising Models

arXiv.org Artificial Intelligence

Inference and learning of graphical models are both well-studied problems in statistics and machine learning that have found many applications in science and engineering. However, exact inference is intractable in general graphical models, which suggests the problem of seeking the best approximation to a collection of random variables within some tractable family of graphical models. In this paper, we focus our attention on the class of planar Ising models, for which inference is tractable using techniques of statistical physics [Kac and Ward; Kasteleyn]. Based on these techniques and recent methods for planarity testing and planar embedding [Chrobak and Payne], we propose a simple greedy algorithm for learning the best planar Ising model to approximate an arbitrary collection of binary random variables (possibly from sample data). Given the set of all pairwise correlations among variables, we select a planar graph and an optimal planar Ising model defined on this graph to best approximate that set of correlations. We demonstrate our method in simulations and in the application of modeling Senate voting records.
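A hedged sketch of a greedy planar selection in the spirit described above: scan candidate edges in decreasing |correlation| and keep an edge only if the graph stays planar (planarity test via networkx). The paper scores edges by model fit rather than raw correlation, so this ordering is a simplifying proxy.

```python
import itertools
import networkx as nx
import numpy as np

def greedy_planar_graph(corr):
    """corr: (n, n) pairwise correlations; returns a planar graph on n nodes."""
    n = corr.shape[0]
    G = nx.Graph()
    G.add_nodes_from(range(n))
    edges = sorted(itertools.combinations(range(n), 2),
                   key=lambda e: -abs(corr[e]))   # strongest correlations first
    for u, v in edges:
        G.add_edge(u, v)
        is_planar, _ = nx.check_planarity(G)
        if not is_planar:
            G.remove_edge(u, v)                   # this edge would break planarity
    return G
```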


Characterization of differentially expressed genes using high-dimensional co-expression networks

arXiv.org Machine Learning

We present a technique to characterize differentially expressed genes in terms of their position in a high-dimensional co-expression network. The framework of Gaussian graphical models is used to construct representations of the co-expression network in such a way that redundancy and the propagation of spurious information along the network are avoided. The proposed inference procedure is based on minimizing the Bayesian Information Criterion (BIC) over the class of decomposable graphical models. This class of models can represent complex relationships and has suitable properties that allow effective inference in problems with a high degree of complexity (e.g., several thousand genes) and a small number of observations (e.g., 10-100), as typically occurs in high-throughput gene expression studies. Taking advantage of the internal structure of decomposable graphical models, we construct a compact representation of the co-expression network that allows us to identify the regions with a high concentration of differentially expressed genes. We argue that differentially expressed genes located in highly interconnected regions of the co-expression network are less informative than those located in less interconnected regions. Based on this idea, we propose a measure of uncertainty that resembles the notion of relative entropy. Our methods are illustrated with three publicly available microarray data sets (the largest involving more than 50,000 genes and 64 patients) and a short simulation study.
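A minimal sketch of the selection criterion named above: the BIC of a Gaussian graphical model, given the sample covariance from n observations and an estimated precision matrix whose zero pattern encodes the graph. Fitting the precision matrix on a decomposable graph (e.g., by iterative proportional scaling) is assumed done elsewhere.

```python
import numpy as np

def ggm_bic(S, K, n):
    """S: (p, p) sample covariance; K: (p, p) estimated precision matrix;
    n: number of observations. Lower BIC is better."""
    p = S.shape[0]
    _, logdet = np.linalg.slogdet(K)
    # Gaussian profile log-likelihood, up to an additive constant in (n, p).
    loglik = 0.5 * n * (logdet - np.trace(S @ K))
    # Free parameters: one per node plus one per edge in K's zero pattern.
    n_params = p + np.count_nonzero(np.triu(K, k=1))
    return -2.0 * loglik + n_params * np.log(n)
```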