Goto

Collaborating Authors

 Industry


Learning annotated hierarchies from relational data

Neural Information Processing Systems

The objects in many real-world domains can be organized into hierarchies, where each internal node picks out a category of objects. Given a collection of features and relations defined over a set of objects, an annotated hierarchy includes a specification of the categories that are most useful for describing each individual feature and relation. We define a generative model for annotated hierarchies and the features and relations that they describe, and develop a Markov chain Monte Carlo scheme for learning annotated hierarchies. We show that our model discovers interpretable structure in several real-world data sets.


Computation of Similarity Measures for Sequential Data using Generalized Suffix Trees

Neural Information Processing Systems

We propose a generic algorithm for computation of similarity measures for sequential data. The algorithm uses generalized suffix trees for efficient calculation of various kernel, distance and non-metric similarity functions. Its worst-case run-time is linear in the length of sequences and independent of the underlying embedding language, which can cover words, k-grams or all contained subsequences. Experiments with network intrusion detection, DNA analysis and text processing applications demonstrate the utility of distances and similarity coefficients for sequences as alternatives to classical kernel functions.


Natural Actor-Critic for Road Traffic Optimisation

Neural Information Processing Systems

Current road-traffic optimisation practice around the world is a combination of hand tuned policies with a small degree of automatic adaption. Even state-ofthe-art research controllers need good models of the road traffic, which cannot be obtained directly from existing sensors. We use a policy-gradient reinforcement learning approach to directly optimise the traffic signals, mapping currently deployed sensor observations to control signals. Our trained controllers are (theoretically) compatible with the traffic system used in Sydney and many other cities around the world. We apply two policy-gradient methods: (1) the recent natural actor-critic algorithm, and (2) a vanilla policy-gradient algorithm for comparison. Along the way we extend natural-actor critic approaches to work for distributed and online infinite-horizon problems.


Unsupervised Regression with Applications to Nonlinear System Identification

Neural Information Processing Systems

We derive a cost functional for estimating the relationship between highdimensional observations and the low-dimensional process that generated them with no input-output examples. Limiting our search to invertible observation functions confers numerous benefits, including a compact representation and no suboptimal local minima. Our approximation algorithms for optimizing this cost functional are fast and give diagnostic bounds on the quality of their solution. Our method can be viewed as a manifold learning algorithm that utilizes a prior on the low-dimensional manifold coordinates. The benefits of taking advantage of such priors in manifold learning and searching for the inverse observation functions in system identification are demonstrated empirically by learning to track moving targets from raw measurements in a sensor network setting and in an RFID tracking experiment.


Parameter Expanded Variational Bayesian Methods

Neural Information Processing Systems

Bayesian inference has become increasingly important in statistical machine learning. Exact Bayesian calculations are often not feasible in practice, however. A number of approximate Bayesian methods have been proposed to make such calculations practical, among them the variational Bayesian (VB) approach. The VB approach, while useful, can nevertheless suffer from slow convergence to the approximate solution. To address this problem, we propose Parameter-eXpanded Variational Bayesian (PX-VB) methods to speed up VB. The new algorithm is inspired by parameter-expanded expectation maximization (PX-EM) and parameterexpanded data augmentation (PX-DA). Similar to PX-EM and -DA, PX-VB expands a model with auxiliary variables to reduce the coupling between variables in the original model. We analyze the convergence rates of VB and PX-VB and demonstrate the superior convergence rates of PX-VB in variational probit regression and automatic relevance determination.


Game Theoretic Algorithms for Protein-DNA binding

Neural Information Processing Systems

We develop and analyze game-theoretic algorithms for predicting coordinate binding of multiple DNA binding regulators. The allocation of proteins to local neighborhoods and to sites is carried out with resource constraints while explicating competing and coordinate binding relations among proteins with affinity to the site or region. The focus of this paper is on mathematical foundations of the approach. We also briefly demonstrate the approach in the context of the λ-phage switch.


Blind source separation for over-determined delayed mixtures

Neural Information Processing Systems

Blind source separation, i.e. the extraction of unknown sources from a set of given signals, is relevant for many applications. A special case of this problem is dimension reduction, where the goal is to approximate a given set of signals by superpositions of a minimal number of sources. Since in this case the signals outnumber the sources the problem is over-determined. Most popular approaches for addressing this problem are based on purely linear mixing models. However, many applications like the modeling of acoustic signals, EMG signals, or movement trajectories, require temporal shift-invariance of the extracted components. This case has only rarely been treated in the computational literature, and specifically for the case of dimension reduction almost no algorithms have been proposed. We present a new algorithm for the solution of this problem, which is based on a timefrequency transformation (Wigner-Ville distribution) of the generative model. We show that this algorithm outperforms classical source separation algorithms for linear mixtures, and also a related method for mixtures with delays. In addition, applying the new algorithm to trajectories of human gaits, we demonstrate that it is suitable for the extraction of spatiotemporal components that are easier to interpret than components extracted with other classical algorithms.


Temporal dynamics of information content carried by neurons in the primary visual cortex

Neural Information Processing Systems

We use multi-electrode recordings from cat primary visual cortex and investigate whether a simple linear classifier can extract information about the presented stimuli. We find that information is extractable and that it even lasts for several hundred milliseconds after the stimulus has been removed. In a fast sequence of stimulus presentation, information about both new and old stimuli is present simultaneously and nonlinear relations between these stimuli can be extracted. These results suggest nonlinear properties of cortical representations. The important implications of these properties for the nonlinear brain theory are discussed.


Modeling Dyadic Data with Binary Latent Factors

Neural Information Processing Systems

We introduce binary matrix factorization, a novel model for unsupervised matrix decomposition. The decomposition is learned by fitting a nonparametric Bayesian probabilistic model with binary latent variables to a matrix of dyadic data. Unlike bi-clustering models, which assign each row or column to a single cluster based on a categorical hidden feature, our binary feature model reflects the prior belief that items and attributes can be associated with more than one latent cluster at a time. We provide simple learning and inference rules for this new model and show how to extend it to an infinite model in which the number of features is not a priori fixed but is allowed to grow with the size of the data.


Isotonic Conditional Random Fields and Local Sentiment Flow

Neural Information Processing Systems

We examine the problem of predicting local sentiment flow in documents, and its application to several areas of text analysis. Formally, the problem is stated as predicting an ordinal sequence based on a sequence of word sets. In the spirit of isotonic regression, we develop a variant of conditional random fields that is wellsuited to handle this problem. Using the Möbius transform, we express the model as a simple convex optimization problem. Experiments demonstrate the model and its applications to sentiment prediction, style analysis, and text summarization.