Autonomous Helicopter Flight via Reinforcement Learning

Neural Information Processing Systems

Autonomous helicopter flight represents a challenging control problem, with complex, noisy dynamics. In this paper, we describe a successful application of reinforcement learning to autonomous helicopter flight.


An MDP-Based Approach to Online Mechanism Design

Neural Information Processing Systems

Online mechanism design (MD) considers the problem of providing incentives to implement desired system-wide outcomes in systems with self-interested agents that arrive and depart dynamically. Agents can choose to misrepresent their arrival and departure times, in addition to information about their value for different outcomes. We consider the problem of maximizing the total long-term value of the system despite the self-interest of agents. The online MD problem induces a Markov Decision Process (MDP), which when solved can be used to implement optimal policies in a truth-revealing Bayesian-Nash equilibrium.


Approximate Planning in POMDPs with Macro-Actions

Neural Information Processing Systems

Recent research has demonstrated that useful POMDP solutions do not require consideration of the entire belief space. We extend this idea with the notion of temporal abstraction. We present and explore a new reinforcement learning algorithm over grid-points in belief space, which uses macro-actions and Monte Carlo updates of the Q-values. We apply the algorithm to a large-scale robot navigation task and demonstrate that with temporal abstraction we can consider an even smaller part of the belief space, learn POMDP policies faster, and do information gathering more efficiently.


ARA*: Anytime A* with Provable Bounds on Sub-Optimality

Neural Information Processing Systems

In real-world planning problems, time for deliberation is often limited. Anytime planners are well suited for these problems: they find a feasible solution quickly and then continually work on improving it until time runs out. In this paper we propose an anytime heuristic search, ARA*, which tunes its performance bound based on available search time. It starts by finding a suboptimal solution quickly using a loose bound, then tightens the bound progressively as time allows. Given enough time it finds a provably optimal solution. While improving its bound, ARA* reuses previous search efforts and, as a result, is significantly more efficient than other anytime search methods. In addition to our theoretical analysis, we demonstrate the practical utility of ARA* with experiments on a simulated robot kinematic arm and a dynamic path planning problem for an outdoor rover.
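
The loose-then-tight bound schedule described above can be illustrated with a minimal sketch. This is not ARA* itself: it simply reruns weighted A* from scratch with a shrinking inflation factor and does not reuse earlier search effort, which is the part the abstract credits for ARA*'s efficiency. The grid representation, the function names `weighted_a_star` and `anytime_search`, and the schedule `(3.0, 2.0, 1.5, 1.0)` are assumptions made for illustration only.

```python
import heapq


def weighted_a_star(grid, start, goal, eps):
    """Weighted A* on a 4-connected grid (0 = free, 1 = blocked), unit step cost.

    The heuristic is inflated by eps >= 1. With the consistent Manhattan
    heuristic used here, the returned cost is at most eps times optimal.
    """
    def h(cell):
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    rows, cols = len(grid), len(grid[0])
    g = {start: 0}
    parent = {start: None}
    open_heap = [(eps * h(start), start)]
    closed = set()
    while open_heap:
        _, cur = heapq.heappop(open_heap)
        if cur in closed:
            continue  # stale heap entry
        if cur == goal:
            path = []
            while cur is not None:
                path.append(cur)
                cur = parent[cur]
            return path[::-1], g[goal]
        closed.add(cur)
        r, c = cur
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if not (0 <= nr < rows and 0 <= nc < cols):
                continue
            if grid[nr][nc] != 0 or nxt in closed:
                continue
            new_g = g[cur] + 1
            if new_g < g.get(nxt, float("inf")):
                g[nxt] = new_g
                parent[nxt] = cur
                heapq.heappush(open_heap, (new_g + eps * h(nxt), nxt))
    return None, float("inf")


def anytime_search(grid, start, goal, schedule=(3.0, 2.0, 1.5, 1.0)):
    """Yield (eps, best_cost, best_path) as the bound is tightened.

    Each round restarts the search (an assumption of this sketch); the best
    solution found so far is kept, so the reported cost never increases.
    """
    best_path, best_cost = None, float("inf")
    for eps in schedule:
        path, cost = weighted_a_star(grid, start, goal, eps)
        if cost < best_cost:
            best_path, best_cost = path, cost
        yield eps, best_cost, best_path
```

A caller with a deadline would consume the generator until time runs out and keep the last yielded path; the final round with `eps = 1.0` is plain A* and therefore optimal for this heuristic.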


Subject-Independent Magnetoencephalographic Source Localization by a Multilayer Perceptron

Neural Information Processing Systems

We describe a system that localizes a single dipole to reasonable accuracy from noisy magnetoencephalographic (MEG) measurements in real time. At its core is a multilayer perceptron (MLP) trained to map sensor signals and head position to dipole location. Including head position overcomes the previous need to retrain the MLP for each subject and session. The training dataset was generated by mapping randomly chosen dipoles and head positions through an analytic model and adding noise from real MEG recordings. After training, a localization took 0.7 ms with an average error of 0.90 cm. A few iterations of a Levenberg-Marquardt routine using the MLP's output as its initial guess took 15 ms and improved the accuracy to 0.53 cm, only slightly above the statistical limits on accuracy imposed by the noise. We applied these methods to localize single dipole sources from MEG components isolated by blind source separation and compared the estimated locations to those generated by standard manually-assisted commercial software.


Increase Information Transfer Rates in BCI by CSP Extension to Multi-class

Neural Information Processing Systems

Brain-Computer Interfaces (BCI) are an interesting emerging technology that is driven by the motivation to develop an effective communication interface translating human intentions into a control signal for devices like computers or neuroprostheses. If this can be done bypassing the usual human output pathways like peripheral nerves and muscles, it can ultimately become a valuable tool for paralyzed patients.


Impact of an Energy Normalization Transform on the Performance of the LF-ASD Brain Computer Interface

Neural Information Processing Systems

This paper presents an energy normalization transform as a method to reduce system errors in the LF-ASD brain-computer interface. The energy normalization transform has two major benefits to the system performance. First, it can increase class separation between the active and idle EEG data.



Training fMRI Classifiers to Detect Cognitive States across Multiple Human Subjects

Neural Information Processing Systems

We consider learning to classify cognitive states of human subjects, based on their brain activity observed via functional Magnetic Resonance Imaging (fMRI). This problem is important because such classifiers constitute "virtual sensors" of hidden cognitive states, which may be useful in cognitive science research and clinical applications. In recent work, Mitchell, et al. [6,7,9] have demonstrated the feasibility of training such classifiers for individual human subjects (e.g., to distinguish whether the subject is reading an ambiguous or unambiguous sentence, or whether they are reading a noun or a verb). Here we extend that line of research, exploring how to train classifiers that can be applied across multiple human subjects, including subjects who were not involved in training the classifier. We describe the design of several machine learning approaches to training multiple-subject classifiers, and report experimental results demonstrating the success of these methods in learning cross-subject classifiers for two different fMRI data sets.