Asia
Nonparametric Transforms of Graph Kernels for Semi-Supervised Learning
Zhu, Jerry, Kandola, Jaz, Ghahramani, Zoubin, Lafferty, John D.
We present an algorithm based on convex optimization for constructing kernels for semi-supervised learning. The kernel matrices are derived from the spectral decomposition of graph Laplacians, and combine labeled and unlabeled data in a systematic fashion. Unlike previous work using diffusion kernels and Gaussian random field kernels, a nonparametric kernel approach is presented that incorporates order constraints during optimization. This results in flexible kernels and avoids the need to choose among different parametric forms. Our approach relies on a quadratically constrained quadratic program (QCQP), and is computationally feasible for large datasets. We evaluate the kernels on real datasets using support vector machines, with encouraging results.
Experts in a Markov Decision Process
Even-dar, Eyal, Kakade, Sham M., Mansour, Yishay
We consider an MDP setting in which the reward function is allowed to change during each time step of play (possibly in an adversarial manner), yet the dynamics remain fixed. Similar to the experts setting, we address the question of how well can an agent do when compared to the reward achieved under the best stationary policy over time. We provide efficient algorithms, which have regret bounds with no dependence on the size of state space. Instead, these bounds depend only on a certain horizon time of the process and logarithmically on the number of actions. We also show that in the case that the dynamics change over time, the problem becomes computationally hard.
Self-Tuning Spectral Clustering
Zelnik-manor, Lihi, Perona, Pietro
We study a number of open issues in spectral clustering: (i) Selecting the appropriate scale of analysis, (ii) Handling multi-scale data, (iii) Clustering withirregular background clutter, and, (iv) Finding automatically the number of groups. We first propose that a'local' scale should be used to compute the affinity between each pair of points. This local scaling leads to better clustering especially when the data includes multiple scales and when the clusters are placed within a cluttered background. We further suggest exploiting the structure of the eigenvectors to infer automatically the number of groups. This leads to a new algorithm in which the final randomly initialized k-means stage is eliminated.
Generalization Error Bounds for Collaborative Prediction with Low-Rank Matrices
Srebro, Nathan, Alon, Noga, Jaakkola, Tommi S.
We prove generalization error bounds for predicting entries in a partially observed matrix by fitting the observed entries with a low-rank matrix. In justifying the analysis approach we take to obtain the bounds, we present an example of a class of functions of finite pseudodimension such that the sums of functions from this class have unbounded pseudodimension.
Using the Equivalent Kernel to Understand Gaussian Process Regression
Sollich, Peter, Williams, Christopher
The equivalent kernel [1] is a way of understanding how Gaussian process regressionworks for large sample sizes based on a continuum limit. In this paper we show (1) how to approximate the equivalent kernel of the widely-used squared exponential (or Gaussian) kernel and related kernels, and(2) how analysis using the equivalent kernel helps to understand the learning curves for Gaussian processes.
Semi-supervised Learning via Gaussian Processes
Lawrence, Neil D., Jordan, Michael I.
We present a probabilistic approach to learning a Gaussian Process classifier in the presence of unlabeled data. Our approach involves a "null category noise model" (NCNM) inspired by ordered categorical noisemodels. The noise model reflects an assumption that the data density is lower between the class-conditional densities. We illustrate our approach on a toy problem and present comparative resultsfor the semi-supervised classification of handwritten digits.
Economic Properties of Social Networks
Kakade, Sham M., Kearns, Michael, Ortiz, Luis E., Pemantle, Robin, Suri, Siddharth
We examine the marriage of recent probabilistic generative models for social networks with classical frameworks from mathematical economics. Weare particularly interested in how the statistical structure of such networks influences global economic quantities such as price variation. Ourfindings are a mixture of formal analysis, simulation, and experiments on an international trade data set from the United Nations.
The Cerebellum Chip: an Analog VLSI Implementation of a Cerebellar Model of Classical Conditioning
Hofstoetter, Constanze, Gil, Manuel, Eng, Kynan, Indiveri, Giacomo, Mintz, Matti, Kramer, Jörg, Verschure, Paul F.
We present a biophysically constrained cerebellar model of classical conditioning, implemented using a neuromorphic analog VLSI (aVLSI) chip. Like its biological counterpart, our cerebellar model is able to control adaptive behavior by predicting the precise timing of events. Here we describe the functionality of the chip and present its learning performance, as evaluated in simulated conditioning experiments at the circuit level and in behavioral experiments using a mobile robot. We show that this aVLSI model supports the acquisition and extinction of adaptively timed conditioned responses under real-world conditions with ultra-low power consumption.
Modeling Conversational Dynamics as a Mixed-Memory Markov Process
Choudhury, Tanzeem, Basu, Sumit
In this work, we quantitatively investigate the ways in which a given person influences the joint turn-taking behavior in a conversation. After collecting an auditory database of social interactions among a group of twenty-three people via wearable sensors (66 hours of data each over two weeks), we apply speech and conversation detection methods to the auditory streams. These methods automatically locate the conversations, determine their participants, and mark which participant was speaking when. We then model the joint turn-taking behavior as a Mixed-Memory Markov Model [1] that combines the statistics of the individual subjects' self-transitions and the partners' cross-transitions. The mixture parameters in this model describe how much each person's individual behavior contributes to the joint turn-taking behavior of the pair.