Learning Graphical Models
Fast Rates to Bayes for Kernel Machines
Steinwart, Ingo, Scovel, Clint
We establish learning rates to the Bayes risk for support vector machines (SVMs) with hinge loss. In particular, for SVMs with Gaussian RBF kernels we propose a geometric condition for distributions which can be used to determine approximation properties of these kernels. Finally, we compare our methods with a recent paper of G. Blanchard et al..
Dynamic Bayesian Networks for Brain-Computer Interfaces
Shenoy, Pradeep, Rao, Rajesh P.
We describe an approach to building brain-computer interfaces (BCI) based on graphical models for probabilistic inference and learning. We show how a dynamic Bayesian network (DBN) can be used to infer probability distributions over brain-and body-states during planning and execution of actions. The DBN is learned directly from observed data and allows measured signals such as EEG and EMG to be interpreted in terms of internal states such as intent to move, preparatory activity, and movement execution. Unlike traditional classification-based approaches to BCI, the proposed approach (1) allows continuous tracking and prediction ofinternal states over time, and (2) generates control signals based on an entire probability distribution over states rather than binary yes/no decisions. We present preliminary results of brain-and body-state estimation usingsimultaneous EEG and EMG signals recorded during a self-paced left/right hand movement task.
Resolving Perceptual Aliasing In The Presence Of Noisy Sensors
Agents learning to act in a partially observable domain may need to overcome the problem of perceptual aliasing - i.e., different states that appear similar but require different responses. This problem is exacerbated whenthe agent's sensors are noisy, i.e., sensors may produce different observationsin the same state. We show that many well-known reinforcement learning methods designed to deal with perceptual aliasing, suchas Utile Suffix Memory, finite size history windows, eligibility traces, and memory bits, do not handle noisy sensors well. We suggest a new algorithm, Noisy Utile Suffix Memory (NUSM), based on USM, that uses a weighted classification of observed trajectories. We compare NUSM to the above methods and show it to be more robust to noise.
Coarticulation in Markov Decision Processes
Rohanimanesh, Khashayar, Platt, Robert, Mahadevan, Sridhar, Grupen, Roderic
We investigate an approach for simultaneously committing to multiple activities,each modeled as a temporally extended action in a semi-Markov decision process (SMDP). For each activity we define aset of admissible solutions consisting of the redundant set of optimal policies, and those policies that ascend the optimal statevalue functionassociated with them. A plan is then generated by merging them in such a way that the solutions to the subordinate activities are realized in the set of admissible solutions satisfying the superior activities.
Conditional Random Fields for Object Recognition
Quattoni, Ariadna, Collins, Michael, Darrell, Trevor
We present a discriminative part-based approach for the recognition of object classes from unsegmented cluttered scenes. Objects are modeled as flexible constellations of parts conditioned on local observations found by an interest operator. For each object class the probability of a given assignment of parts to local features is modeled by a Conditional Random Field(CRF). We propose an extension of the CRF framework that incorporates hidden variables and combines class conditional CRFs into a unified framework for part-based object recognition. The parameters of the CRF are estimated in a maximum likelihood framework and recognition proceedsby finding the most likely class under our model. The main advantage of the proposed CRF framework is that it allows us to relax the assumption of conditional independence of the observed data (i.e.
VDCBPI: an Approximate Scalable Algorithm for Large POMDPs
Poupart, Pascal, Boutilier, Craig
Existing algorithms for discrete partially observable Markov decision processes can at best solve problems of a few thousand states due to two important sources of intractability: the curse of dimensionality and the policy space complexity. This paper describes a new algorithm (VDCBPI) that mitigates both sources of intractability by combining the V alue Directed Compression (VDC) technique [13] with Bounded Policy Iteration (BPI) [14]. The scalability of VDCBPI is demonstrated on synthetic network management problems with up to 33 million states. 1 Introduction Partially observable Markov decision processes (POMDPs) provide a natural and expressive framework for decision making, but their use in practice has been limited by the lack of scalable solution algorithms. T wo important sources of intractability plague discrete model-based POMDPs: high dimensionality of belief space, and the complexity of policy or value function (VF) space. Classic solution algorithms [4, 10, 7], for example, compute value functions represented by exponentially many value vectors, each of exponential size. As a result, they can only solve POMDPs with on the order of 100 states.
Approximately Efficient Online Mechanism Design
Parkes, David C., Yanovsky, Dimah, Singh, Satinder P.
Online mechanism design (OMD) addresses the problem of sequential decision making in a stochastic environment with multiple self-interested agents. The goal in OMD is to make value-maximizing decisions despite this self-interest. In previous work we presented a Markov decision process (MDP)-basedapproach to OMD in large-scale problem domains. In practice the underlying MDP needed to solve OMD is too large and hence the mechanism must consider approximations. This raises the possibility thatagents may be able to exploit the approximation for selfish gain. We adopt sparse-sampling-based MDP algorithms to implement ɛ- efficient policies, and retain truth-revelation as an approximate Bayesian-Nash equilibrium. Our approach is empirically illustrated in the context of the dynamic allocation of WiFi connectivity to users in a coffeehouse.
Conditional Models of Identity Uncertainty with Application to Noun Coreference
McCallum, Andrew, Wellner, Ben
Coreference analysis, also known as record linkage or identity uncertainty, isa difficult and important problem in natural language processing, databases, citation matching and many other tasks. This paper introduces severaldiscriminative, conditional-probability models for coreference analysis,all examples of undirected graphical models. Unlike many historical approaches to coreference, the models presented here are relational--they do not assume that pairwise coreference decisions should be made independently from each other. Unlike other relational models of coreference that are generative, the conditional model here can incorporate a great variety of features of the input without having to be concerned about their dependencies--paralleling the advantages of conditional randomfields over hidden Markov models.