Value Function Approximation with Diffusion Wavelets and Laplacian Eigenfunctions
Mahadevan, Sridhar, Maggioni, Mauro
We investigate the problem of automatically constructing efficient representations or basis functions for approximating value functions based on analyzing the structure and topology of the state space. In particular, two novel approaches to value function approximation are explored based on automatically constructing basis functions on state spaces that can be represented as graphs or manifolds: one approach uses the eigenfunctions of the Laplacian, in effect performing a global Fourier analysis on the graph; the second approach is based on diffusion wavelets, which generalize classical wavelets to graphs using multiscale dilations induced by powers of a diffusion operator or random walk on the graph. Together, these approaches form the foundation of a new generation of methods for solving large Markov decision processes, in which the underlying representation and policies are simultaneously learned.
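As a rough illustration of the first approach, the sketch below builds Laplacian eigenvector basis functions on a small chain-shaped state space and fits a toy value function by least squares. The chain graph, the combinatorial Laplacian L = D - W, the discount factor, and the target values are all assumptions chosen for illustration; they are not taken from the paper, which may use a different Laplacian or fitting procedure.

```python
import numpy as np

def chain_adjacency(n):
    """Adjacency matrix of an undirected chain graph with n states (illustrative state space)."""
    w = np.zeros((n, n))
    idx = np.arange(n - 1)
    w[idx, idx + 1] = 1.0
    w[idx + 1, idx] = 1.0
    return w

def laplacian_basis(adjacency, k):
    """Return the k smallest eigenvalues and eigenvectors of the combinatorial Laplacian L = D - W."""
    degree = np.diag(adjacency.sum(axis=1))
    laplacian = degree - adjacency
    # eigh returns eigenvalues in ascending order, so the first columns are the smoothest functions
    eigenvalues, eigenvectors = np.linalg.eigh(laplacian)
    return eigenvalues[:k], eigenvectors[:, :k]

n_states = 50
W = chain_adjacency(n_states)
eigenvalues, basis = laplacian_basis(W, k=10)

# Hypothetical target: discounted value of reaching a goal at the last state of the chain.
gamma = 0.95
target = gamma ** (n_states - 1 - np.arange(n_states))

# Least-squares projection of the target onto the span of the basis functions.
weights, *_ = np.linalg.lstsq(basis, target, rcond=None)
approximation = basis @ weights
print("max abs error with 10 basis functions:", np.max(np.abs(approximation - target)))
```

Because the eigenvectors are ordered by eigenvalue, increasing `k` adds progressively higher-frequency functions, which is the sense in which the construction resembles a Fourier analysis on the graph.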
Convergence and Consistency of Regularized Boosting Algorithms with Stationary B-Mixing Observations
Lozano, Aurelie C., Kulkarni, Sanjeev R., Schapire, Robert E.
We study the statistical convergence and consistency of regularized Boosting methods, where the samples are not independent and identically distributed (i.i.d.) but come from empirical processes of stationary β-mixing sequences. Utilizing a technique that constructs a sequence of independent blocks close in distribution to the original samples, we prove the consistency of the composite classifiers resulting from a regularization achieved by restricting the 1-norm of the base classifiers' weights. When compared to the i.i.d.
Efficient Estimation of OOMs
Jaeger, Herbert, Zhao, Mingjie, Kolling, Andreas
A standard method to obtain stochastic models for symbolic time series is to train state-emitting hidden Markov models (SE-HMMs) with the Baum-Welch algorithm. Based on observable operator models (OOMs), in the last few months a number of novel learning algorithms for similar purposes have been developed: (1,2) two versions of an "efficiency sharpening" (ES) algorithm, which iteratively improves the statistical efficiency of a sequence of OOM estimators, (3) a constrained gradient descent ML estimator for transition-emitting HMMs (TE-HMMs). We give an overview of these algorithms and compare them with SE-HMM/EM learning on synthetic and real-life data.
An Application of Markov Random Fields to Range Sensing
Diebel, James, Thrun, Sebastian
This paper describes a highly successful application of MRFs to the problem of generating high-resolution range images. A new generation of range sensors combines the capture of low-resolution range images with the acquisition of registered high-resolution camera images. The MRF in this paper exploits the fact that discontinuities in range and coloring tend to co-align. This enables it to generate high-resolution, low-noise range images by integrating regular camera images into the range data. We show that by using such an MRF, we can substantially improve over existing range imaging technology.
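The idea that range and color discontinuities co-align can be sketched on a one-dimensional toy signal. The example below minimizes a quadratic energy with a data term on the sparsely measured pixels and a smoothness term whose weights shrink where the camera intensity changes. The energy form, the exponential weighting, the parameter names `smoothness` and `edge_sensitivity`, and the direct linear solve are all assumptions made for this sketch, not the paper's stated formulation or optimizer.

```python
import numpy as np

def mrf_upsample(measured_depth, measured_mask, intensity,
                 smoothness=1.0, edge_sensitivity=20.0):
    """Minimize sum_measured (y_i - z_i)^2 + smoothness * sum_i w_i (y_i - y_{i+1})^2,
    with w_i = exp(-edge_sensitivity * (I_i - I_{i+1})^2), by solving the normal equations."""
    n = intensity.size
    # Neighbour weights: close to 1 in flat image regions, close to 0 across intensity edges.
    w = np.exp(-edge_sensitivity * np.diff(intensity) ** 2)
    # Weighted graph Laplacian of the pixel chain.
    lap = np.zeros((n, n))
    i = np.arange(n - 1)
    lap[i, i] += w
    lap[i + 1, i + 1] += w
    lap[i, i + 1] -= w
    lap[i + 1, i] -= w
    mask = measured_mask.astype(float)
    # Setting the gradient of the energy to zero gives (K + smoothness * L) y = K z.
    return np.linalg.solve(np.diag(mask) + smoothness * lap, mask * measured_depth)

n = 40
intensity = np.where(np.arange(n) < 20, 0.0, 1.0)   # high-resolution camera signal with one edge
true_depth = np.where(np.arange(n) < 20, 1.0, 3.0)  # depth edge at the same location
measured_mask = np.arange(n) % 4 == 0               # low-resolution range samples, every 4th pixel
measured_depth = np.where(measured_mask, true_depth, 0.0)

guided = mrf_upsample(measured_depth, measured_mask, intensity)
uniform = mrf_upsample(measured_depth, measured_mask, intensity, edge_sensitivity=0.0)
print("max error, image-guided weights:", np.max(np.abs(guided - true_depth)))
print("max error, uniform weights:     ", np.max(np.abs(uniform - true_depth)))
```

With `edge_sensitivity=0.0` every neighbour pair is smoothed equally and the depth edge is blurred between range samples; the image-guided weights let the reconstruction stay sharp at the intensity edge.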
Efficient estimation of hidden state dynamics from spike trains
Danoczy, Marton G., Hahnloser, Richard H. R.
Neurons can have rapidly changing spike train statistics dictated by the underlying network excitability or behavioural state of an animal. To estimate the time course of such state dynamics from single- or multiple-neuron recordings, we have developed an algorithm that maximizes the likelihood of observed spike trains by optimizing the state lifetimes and the state-conditional interspike-interval (ISI) distributions. Our nonparametric algorithm is free of time-binning and spike-counting problems and has the computational complexity of a Mixed-state Markov Model operating on a state sequence of length equal to the total number of recorded spikes. As an example, we fit a two-state model to paired recordings of premotor neurons in the sleeping songbird. We find that the two state-conditional ISI functions are highly similar to the ones measured during waking and singing, respectively.
On Local Rewards and Scaling Distributed Reinforcement Learning
We consider the scaling of the number of examples necessary to achieve good performance in distributed, cooperative, multi-agent reinforcement learning, as a function of the number of agents n. We prove a worst-case lower bound showing that algorithms that rely solely on a global reward signal to learn policies confront a fundamental limit: They require a number of real-world examples that scales roughly linearly in the number of agents. For settings of interest with a very large number of agents, this is impractical. We demonstrate, however, that there is a class of algorithms that, by taking advantage of local reward signals in large distributed Markov Decision Processes, are able to ensure good performance with a number of samples that scales as O(log n). This makes them applicable even in settings with a very large number of agents n.
Maximum Margin Semi-Supervised Learning for Structured Variables
Altun, Y., McAllester, D., Belkin, M.
Many real-world classification problems involve the prediction of multiple interdependent variables forming some structural dependency. Recent progress in machine learning has mainly focused on supervised classification of such structured variables. In this paper, we investigate structured classification in a semi-supervised setting. We present a discriminative approach that utilizes the intrinsic geometry of input patterns revealed by unlabeled data points and we derive a maximum-margin formulation of semi-supervised learning for structured variables.