Stable Fitted Reinforcement Learning
We describe the reinforcement learning problem, motivate algorithms which seek an approximation to the Q function, and present new convergence results for two such algorithms. 1 INTRODUCTION AND BACKGROUND Imagine an agent acting in some environment. At time t, the environment is in some state X_t chosen from a finite set of states. The agent perceives X_t, and is allowed to choose an action a_t from some finite set of actions. Meanwhile, the agent experiences a real-valued cost C_t, chosen from a distribution which also depends only on X_t and a_t and which has finite mean and variance. Such an environment is called a Markov decision process, or MDP.
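To make the Q function concrete, the following is a minimal sketch of exact tabular Q iteration on a finite MDP with costs. It is not one of the fitted algorithms the paper analyzes: the two-state MDP, its transition probabilities and costs, the discount factor `GAMMA`, and the names `q_iteration` and `greedy_policy` are all hypothetical and chosen only for illustration.

```python
# Hypothetical two-state, two-action MDP; every number here is illustrative.
# P[s][a] lists (next_state, probability); C[s][a] is the expected cost.
P = {
    0: {0: [(0, 1.0)], 1: [(1, 1.0)]},
    1: {0: [(0, 0.5), (1, 0.5)], 1: [(1, 1.0)]},
}
C = {
    0: {0: 1.0, 1: 2.0},
    1: {0: 0.5, 1: 0.0},
}
GAMMA = 0.9  # discount factor (an assumption; the abstract does not give one)


def q_iteration(P, C, gamma, sweeps=500):
    """Repeat the backup Q(s,a) <- C(s,a) + gamma * E[min_a' Q(s',a')]."""
    q = {s: {a: 0.0 for a in P[s]} for s in P}
    for _ in range(sweeps):
        # Costs are minimized, so the value of a state is its smallest Q value.
        v = {s: min(q[s].values()) for s in q}
        q = {
            s: {
                a: C[s][a] + gamma * sum(p * v[s2] for s2, p in P[s][a])
                for a in P[s]
            }
            for s in P
        }
    return q


def greedy_policy(q):
    """Pick, in each state, the action with the lowest estimated cost-to-go."""
    return {s: min(q[s], key=q[s].get) for s in q}


q = q_iteration(P, C, GAMMA)
policy = greedy_policy(q)
```

A fitted method would replace the exact table `q` with a function approximator trained on these backed-up targets; the table is used here only because it keeps the backup itself easy to read.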
Competence Acquisition in an Autonomous Mobile Robot using Hardware Neural Techniques
Jackson, Geoffrey B., Murray, Alan F.
In this paper we examine the practical use of hardware neural networks in an autonomous mobile robot. We have developed a hardware neural system based around a custom VLSI chip, EPSILON III, designed specifically for embedded hardware neural applications. We present here a demonstration application of an autonomous mobile robot that highlights the flexibility of this system.
Improving Elevator Performance Using Reinforcement Learning
Crites, Robert H., Barto, Andrew G.
This paper describes the application of reinforcement learning (RL) to the difficult real world problem of elevator dispatching. The elevator domain poses a combination of challenges not seen in most RL research to date. Elevator systems operate in continuous state spaces and in continuous time as discrete event dynamic systems. Their states are not fully observable and they are nonstationary due to changing passenger arrival rates. In addition, we use a team of RL agents, each of which is responsible for controlling one elevator car. The team receives a global reinforcement signal which appears noisy to each agent due to the effects of the actions of the other agents, the random nature of the arrivals and the incomplete observation of the state.
Neural Control for Nonlinear Dynamic Systems
Yu, Ssu-Hsin, Annaswamy, Anuradha M.
A neural network based approach is presented for controlling two distinct types of nonlinear systems. The first corresponds to nonlinear systems with parametric uncertainties where the parameters occur nonlinearly. The second corresponds to systems for which stabilizing control structures cannot be determined. The proposed neural controllers are shown to result in closed-loop system stability under certain conditions.
Learning Fine Motion by Markov Mixtures of Experts
Meila, Marina, Jordan, Michael I.
Eng. and Computer Sci., Massachusetts Inst. of Technology, Cambridge, MA 02139, mmp@ai.mit.edu. Abstract: Compliant control is a standard method for performing fine manipulation tasks, like grasping and assembly, but it requires estimation of the state of contact (s.o.c.) between the robot arm and the objects involved. Here we present a method to learn a model of the movement from measured data. The method requires little or no prior knowledge and the resulting model explicitly estimates the s.o.c. The current s.o.c. is viewed as the hidden state variable of a discrete HMM. The control dependent transition probabilities between states are modeled as parametrized functions of the measurement.
Parallel Optimization of Motion Controllers via Policy Iteration
Jr., Jefferson A. Coelho, Sitaraman, R., Grupen, Roderic A.
This paper describes a policy iteration algorithm for optimizing the performance of a harmonic function-based controller with respect to a user-defined index. Value functions are represented as potential distributions over the problem domain, while control policies are represented as gradient fields over the same domain. All intermediate policies are intrinsically safe, i.e. collisions are not promoted during the adaptation process. The algorithm admits an efficient implementation on parallel SIMD architectures. One potential application - travel distance minimization - illustrates its usefulness.
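As background for the harmonic function-based controller mentioned above, here is a minimal sketch of the underlying idea only, not of the paper's policy iteration algorithm: a potential is relaxed toward a discrete harmonic function on a grid (obstacles held at 1, the goal at 0), and a path is obtained by descending the potential. The grid layout, the Gauss-Seidel relaxation, the 4-neighbour stencil, and the names `harmonic_potential` and `descend` are assumptions made for this example.

```python
# Hypothetical occupancy grid: '#' obstacle, '.' free, 'S' start, 'G' goal.
# The outer border must be obstacles so every free cell has four neighbours.
GRID = [
    "#######",
    "#S....#",
    "#.###.#",
    "#....G#",
    "#######",
]


def harmonic_potential(grid, sweeps=500):
    """Gauss-Seidel relaxation: each free cell becomes the mean of its neighbours."""
    rows, cols = len(grid), len(grid[0])
    # Boundary conditions: goal fixed at 0, obstacles fixed at 1.
    phi = [[0.0 if grid[r][c] == "G" else 1.0 for c in range(cols)]
           for r in range(rows)]
    for _ in range(sweeps):
        for r in range(1, rows - 1):
            for c in range(1, cols - 1):
                if grid[r][c] in ".S":
                    phi[r][c] = 0.25 * (phi[r - 1][c] + phi[r + 1][c]
                                        + phi[r][c - 1] + phi[r][c + 1])
    return phi


def descend(grid, phi, start, max_steps=100):
    """Follow the steepest decrease of the potential from start toward the goal."""
    path = [start]
    r, c = start
    for _ in range(max_steps):
        if grid[r][c] == "G":
            break
        free = [(i, j) for i, j in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                if grid[i][j] != "#"]
        if not free:
            break
        nr, nc = min(free, key=lambda p: phi[p[0]][p[1]])
        if phi[nr][nc] >= phi[r][c]:
            break  # no strictly lower neighbour; stop rather than loop
        r, c = nr, nc
        path.append((r, c))
    return path


phi = harmonic_potential(GRID)
path = descend(GRID, phi, start=(1, 1))
```

Because obstacles sit at the maximum of the potential, the descent never steps into one, which is one way to read the abstract's remark that such policies are intrinsically safe. Each cell update depends only on its four neighbours, so the relaxation maps naturally onto a SIMD array.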
A Dynamical Systems Approach for a Learnable Autonomous Robot
This paper discusses how a robot can learn goal-directed navigation tasks using local sensory inputs. The emphasis is that such learning tasks could be formulated as an embedding problem of dynamical systems: desired trajectories in a task space should be embedded into an adequate sensory-based internal state space so that a unique mapping from the internal state space to the motor command could be established. The paper shows that a recurrent neural network suffices in self-organizing such an adequate internal state space from the temporal sensory input.
Experiments with Neural Networks for Real Time Implementation of Control
Campbell, Peter K., Dale, Michael, Ferrá, Herman L., Kowalczyk, Adam
This paper describes a neural network based controller for allocating capacity in a telecommunications network. This system was proposed in order to overcome a "real time" response constraint. Two basic architectures are evaluated: 1) a feedforward network-heuristic; and 2) a feedforward network-recurrent network. These architectures are compared against a linear programming (LP) optimiser as a benchmark. This LP optimiser was also used as a teacher to label the data samples for the feedforward neural network training algorithm. It is found that the systems are able to provide a traffic throughput of 99% and 95%, respectively, of the throughput obtained by the linear programming solution. Once trained, the neural network based solutions are found in a fraction of the time required by the LP optimiser.
Stock Selection via Nonlinear Multi-Factor Models
This paper discusses the use of multilayer feedforward neural networks for predicting a stock's excess return based on its exposure to various technical and fundamental factors. To demonstrate the effectiveness of the approach, a hedged portfolio which consists of equally capitalized long and short positions is constructed and its historical returns are benchmarked against T-bill returns and the S&P500 index.
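The arithmetic behind a hedged portfolio with equally capitalized long and short positions can be sketched as follows. This is an illustration under stated assumptions, not the paper's procedure: stocks are ranked by a predicted excess return, the top `k` are held long and the bottom `k` short with equal weights within each leg, and the one-period return is reported per unit of long-leg capital. The tickers, numbers, and the name `long_short_return` are hypothetical.

```python
def long_short_return(predicted, realized, k):
    """One-period return of an equal-capital long/short portfolio.

    predicted: dict ticker -> model's predicted excess return (ranking signal)
    realized:  dict ticker -> realized return over the same period
    k:         number of stocks in each leg (assumed equal-weighted)
    """
    ranked = sorted(predicted, key=predicted.get, reverse=True)
    longs, shorts = ranked[:k], ranked[-k:]
    long_leg = sum(realized[t] for t in longs) / k
    short_leg = sum(realized[t] for t in shorts) / k
    # Equal capital on both sides: gains on the long leg minus losses
    # (or plus gains) from the stocks sold short.
    return long_leg - short_leg


# Hypothetical data for four stocks.
predicted = {"A": 0.03, "B": -0.01, "C": 0.02, "D": -0.04}
realized = {"A": 0.05, "B": 0.01, "C": -0.01, "D": -0.03}
spread = long_short_return(predicted, realized, k=2)
```

Because the two legs carry equal capital, a market-wide move that shifts every realized return by the same amount cancels out of `spread`, which is what makes the portfolio hedged in this simplified setting.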