Asia
An Improved Policy Iteration Algorithm for Partially Observable MDPs
A new policy iteration algorithm for partially observable Markov decision processes is presented that is simpler and more efficient than an earlier policy iteration algorithm of Sondik (1971,1978). The key simplification is representation of a policy as a finite-state controller. This representation makes policy evaluation straightforward. The paper's contribution is to show that the dynamic-programming update used in the policy improvement step can be interpreted as the transformation of a finite-state controller into an improved finite-state controller. The new algorithm consistently outperforms value iteration as an approach to solving infinite-horizon problems.
Nonparametric Model-Based Reinforcement Learning
This paper describes some of the interactions of model learning algorithms and planning algorithms we have found in exploring model-based reinforcement learning. The paper focuses on how local trajectory optimizers can be used effectively with learned nonparametric models. We find that trajectory planners that are fully consistent with the learned model often have difficulty finding reasonable plans in the early stages of learning. Trajectory planners that balance obeying the learned model with minimizing cost (or maximizing reward) often do better, even if the plan is not fully consistent with the learned model.
Generalized Prioritized Sweeping
Andre, David, Friedman, Nir, Parr, Ronald
Prioritized sweeping is a model-based reinforcement learning method that attempts to focus an agent's limited computational resources to achieve a good estimate of the value of environment states. To choose effectively where to spend a costly planning step, classic prioritized sweeping uses a simple heuristic to focus computation on the states that are likely to have the largest errors. In this paper, we introduce generalized prioritized sweeping, a principled method for generating such estimates in a representation-specific manner. This allows us to extend prioritized sweeping beyond an explicit, state-based representation to deal with compact representations that are necessary for dealing with large state spaces. We apply this method for generalized model approximators (such as Bayesian networks), and describe preliminary experiments that compare our approach with classical prioritized sweeping.
Modelling Seasonality and Trends in Daily Rainfall Data
This paper presents a new approach to the problem of modelling daily rainfall using neural networks. We first model the conditional distributions of rainfall amounts, in such a way that the model itself determines the order of the process, and the time-dependent shape and scale of the conditional distributions. After integrating over particular weather patterns, we are able to extract seasonal variations and long-term trends. 1 Introduction Analysis of rainfall data is important for many agricultural, ecological and engineering activities. Design of irrigation and drainage systems, for instance, needs to take account not only of mean expected rainfall, but also of rainfall volatility. Estimates of crop yields also depend on the distribution of rainfall during the growing season, as well as on the overall amount.
Enhancing Q-Learning for Optimal Asset Allocation
This paper enhances the Q-Iearning algorithm for optimal asset allocation proposed in (Neuneier, 1996 [6]). The new formulation simplifies the approach by using only one value-function for many assets and allows model-free policy-iteration. After testing the new algorithm on real data, the possibility of risk management within the framework of Markov decision problems is analyzed. The proposed methods allows the construction of a multi-period portfolio management system which takes into account transaction costs, the risk preferences of the investor, and several constraints on the allocation. 1 Introduction
A Generic Approach for Identification of Event Related Brain Potentials via a Competitive Neural Network Structure
Lange, Daniel H., Siegelmann, Hava T., Pratt, Hillel, Inbar, Gideon F.
We present a novel generic approach to the problem of Event Related Potential identification and classification, based on a competitive N eural Net architecture. The network weights converge to the embedded signal patterns, resulting in the formation of a matched filter bank. The network performance is analyzed via a simulation study, exploring identification robustness under low SNR conditions and compared to the expected performance from an information theoretic perspective. The classifier is applied to real event-related potential data recorded during a classic oddball type paradigm; for the first time, withinsession variable signal patterns are automatically identified, dismissing the strong and limiting requirement of a-priori stimulus-related selective grouping of the recorded data.
MELONET I: Neural Nets for Inventing Baroque-Style Chorale Variations
The investigation of neural information structures in music is a rather new, exciting research area bringing together different disciplines such as computer science, mathematics, musicology and cognitive science. One of its aims is to find out what determines the personal style of a composer. It has been shown that neural network models - better than other AI approaches - are able to learn and reproduce styledependent features from given examples, e.g., chorale harmonizations in the style of Johann Sebastian Bach (Hild et al., 1992). However when dealing with melodic sequences, e.g., folksong style melodies, all of these models have considerable difficulties to learn even simple structures. The reason is that they are unable to capture high-order structure such as harmonies, motifs and phrases simultaneously occurring at multiple time scales.
Multiresolution Tangent Distance for Affine-invariant Classification
Vasconcelos, Nuno, Lippman, Andrew
The ability to rely on similarity metrics invariant to image transformations is an important issue for image classification tasks such as face or character recognition. We analyze an invariant metric that has performed well for the latter - the tangent distance - and study its limitations when applied to regular images, showing that the most significant among these (convergence to local minima) can be drastically reduced by computing the distance in a multiresolution setting. This leads to the multi resolution tangent distance, which exhibits significantly higher invariance to image transformations, and can be easily combined with robust estimation procedures.
Self-similarity Properties of Natural Images
Turiel, Antonio, Mato, Germán, Parga, Néstor, Nadal, Jean-Pierre
Scale invariance is a fundamental property of ensembles of natural images [1]. Their non Gaussian properties [15, 16] are less well understood, but they indicate the existence of a rich statistical structure. In this work we present a detailed study of the marginal statistics of a variable related to the edges in the images. A numerical analysis shows that it exhibits extended self-similarity [3, 4, 5]. This is a scaling property stronger than self-similarity: all its moments can be expressed as a power of any given moment. More interesting, all the exponents can be predicted in terms of a multiplicative log-Poisson process. This is the very same model that was used very recently to predict the correct exponents of the structure functions of turbulent flows [6]. These results allow us to study the underlying multifractal singularities. In particular we find that the most singular structures are one-dimensional: the most singular manifold consists of sharp edges.
Recovering Perspective Pose with a Dual Step EM Algorithm
Cross, Andrew D. J., Hancock, Edwin R.
This paper describes a new approach to extracting 3D perspective structure from 2D point-sets. The novel feature is to unify the tasks of estimating transformation geometry and identifying pointcorrespondence matches. Unification is realised by constructing a mixture model over the bipartite graph representing the correspondence match and by effecting optimisation using the EM algorithm. According to our EM framework the probabilities of structural correspondence gate contributions to the expected likelihood function used to estimate maximum likelihood perspective pose parameters. This provides a means of rejecting structural outliers.