AITopics

We propose and analyze an algorithm that approximates solutions to the problem of optimal stopping in a discounted irreducible aperiodic Markov chain. The scheme involves the use of linear combinations of fixed basis functions to approximate a Q-function. The weights of the linear combination are incrementally updated through an iterative process similar to Q-learning, involving simulation of the underlying Markov chain. Due to space limitations, we only provide an overview of a proof of convergence (with probability 1) and bounds on the approximation error. This is the first theoretical result that establishes the soundness of a Q-learning-like algorithm when combined with arbitrary linear function approximators to solve a sequential decision problem.
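The kind of scheme the abstract describes can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' algorithm: the ring-shaped chain, the stopping payoff, the three polynomial basis functions, the step-size schedule, and all function names are hypothetical choices made for the example, and the temporal-difference update is one plausible reading of "similar to Q-learning".

```python
import random


def features(x, n_states):
    """Hypothetical fixed basis: constant, linear, and quadratic terms of the scaled state."""
    s = x / (n_states - 1)
    return [1.0, s, s * s]


def q_continue(weights, x, n_states):
    """Approximate value of continuing from state x as a linear combination of the basis."""
    return sum(w * f for w, f in zip(weights, features(x, n_states)))


def learn_weights(n_states=10, discount=0.9, n_steps=20000, seed=0):
    """Incrementally update the weights along one simulated trajectory."""
    rng = random.Random(seed)
    weights = [0.0, 0.0, 0.0]
    x = 0
    for t in range(n_steps):
        # Simulate one transition of a lazy random walk on a ring
        # (irreducible and aperiodic; an assumption of this sketch).
        x_next = (x + rng.choice((-1, 0, 1))) % n_states
        stop_reward_next = x_next / (n_states - 1)  # hypothetical stopping payoff
        # At the next state, take the better of stopping and continuing.
        target = discount * max(stop_reward_next, q_continue(weights, x_next, n_states))
        td_error = target - q_continue(weights, x, n_states)
        step_size = 0.1 / (1.0 + 0.001 * t)  # slowly decaying step size
        phi = features(x, n_states)
        weights = [w + step_size * td_error * f for w, f in zip(weights, phi)]
        x = x_next
    return weights
```

Calling `learn_weights()` returns three weights; stopping in state `x` would then be preferred whenever the stopping payoff is at least `q_continue(weights, x, n_states)`.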

algorithm, function approximator, value function, (15 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.15)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Asia > Middle East > Jordan (0.05)
North America > United States > Massachusetts > Middlesex County > Belmont (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.57)

Learning from Demonstration

Schaal, Stefan

By now it is widely accepted that learning a task from scratch, i.e., without any prior knowledge, is a daunting undertaking. Humans, however, rarely attempt to learn from scratch. They extract initial biases as well as strategies for approaching a learning problem from instructions and/or demonstrations of other humans. For learning control, this paper investigates how learning from demonstration can be applied in the context of reinforcement learning. We consider priming the Q-function, the value function, the policy, and the model of the task dynamics as possible areas where demonstrations can speed up learning. In general nonlinear learning problems, only model-based reinforcement learning shows significant speedup after a demonstration, while in the special case of linear quadratic regulator (LQR) problems, all methods profit from the demonstration. In an implementation of pole balancing on a complex anthropomorphic robot arm, we demonstrate that, when facing the complexities of real signal processing, model-based reinforcement learning offers the most robustness for LQR problems. Using the suggested methods, the robot learns pole balancing in just a single trial after a 30-second demonstration by the human instructor.

demonstration, learning, reinforcement, (15 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > Georgia > Fulton County > Atlanta (0.04)

Genre: Research Report (0.68)

Industry: Education > Focused Education > Special Education (0.44)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Efficient Nonlinear Control with Actor-Tutor Architecture

Doya, Kenji

A new reinforcement learning architecture for nonlinear control is proposed. A direct feedback controller, or the actor, is trained by a value-gradient based controller, or the tutor. This architecture enables both efficient use of the value function and simple computation for real-time implementation. Good performance was verified in multidimensional nonlinear control tasks using Gaussian softmax networks.

architecture, controller, value function, (14 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.05)
Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.05)
North America > United States > New York (0.04)

Industry: Health & Medicine (0.72)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.91)

Multi-effect Decompositions for Financial Data Modeling

Wu, Lizhong, Moody, John E.

High frequency foreign exchange data can be decomposed into three components: the inventory effect component, the surprise information (news) component and the regular information component. The presence of the inventory effect and news can make analysis of trends due to the diffusion of information (regular information component) difficult. We propose a neural-net-based, independent component analysis to separate high frequency foreign exchange data into these three components. Our empirical results show that our proposed multi-effect decomposition can reveal the intrinsic price behavior.

information component, multi-effect decomposition, (12 more...)

Country:

Europe > Switzerland > Zürich > Zürich (0.15)
North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.05)

Genre: Research Report (0.34)

Industry: Banking & Finance (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Interpolating Earth-science Data using RBF Networks and Mixtures of Experts

Wan, Ernest, Bone, Don

We present a mixture of experts (ME) approach to interpolate sparse, spatially correlated earth-science data. Kriging is an interpolation method which uses a global covariation model estimated from the data to take account of the spatial dependence in the data. Based on the close relationship between kriging and the radial basis function (RBF) network (Wan & Bone, 1996), we use a mixture of generalized RBF networks to partition the input space into statistically correlated regions and learn the local covariation model of the data in each region. Applying the ME approach to simulated and real-world data, we show that it is able to achieve good partitioning of the input space, learn the local covariation models and improve generalization.

covariation model, grbf network, local covariation model, (13 more...)

Country:

Asia > Middle East > Jordan (0.05)
Oceania > Australia > Australian Capital Territory > Canberra (0.05)
North America > United States > New York (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.50)

Contour Organisation with the EM Algorithm

Leite, José A. F., Hancock, Edwin R.

This paper describes how the early visual process of contour organisation can be realised using the EM algorithm. The underlying computational representation is based on fine spline coverings. According to our EM approach the adjustment of spline parameters draws on an iterative weighted least-squares fitting process. The expectation step of our EM procedure computes the likelihood of the data using a mixture model defined over the set of spline coverings. These splines are limited in their spatial extent using Gaussian windowing functions.
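The interplay of an expectation step over a mixture model and a weighted least-squares fit can be illustrated with a much simpler stand-in. The sketch below fits straight lines rather than splines, assumes equal mixing proportions and a fixed noise scale `sigma`, and omits the Gaussian windowing; all function names and parameters are hypothetical and not taken from the paper.

```python
import math


def weighted_line_fit(xs, ys, ws):
    """Closed-form weighted least-squares fit of y = slope * x + intercept."""
    sw = sum(ws)
    sx = sum(w * x for w, x in zip(ws, xs))
    sy = sum(w * y for w, y in zip(ws, ys))
    sxx = sum(w * x * x for w, x in zip(ws, xs))
    sxy = sum(w * x * y for w, x, y in zip(ws, xs, ys))
    denom = sw * sxx - sx * sx
    if abs(denom) < 1e-12:
        return None  # degenerate weights: caller keeps the previous parameters
    slope = (sw * sxy - sx * sy) / denom
    return slope, (sy - slope * sx) / sw


def responsibilities(xs, ys, lines, sigma):
    """E-step: posterior probability that each point belongs to each line."""
    resp = []
    for x, y in zip(xs, ys):
        sq = [(y - (a * x + b)) ** 2 for a, b in lines]
        best = min(sq)  # subtract the smallest squared residual to avoid underflow
        like = [math.exp(-(s - best) / (2.0 * sigma ** 2)) for s in sq]
        total = sum(like)
        resp.append([v / total for v in like])
    return resp


def fit_lines_em(xs, ys, init_lines, sigma=1.0, n_iters=20):
    """Alternate the E-step with a weighted least-squares M-step for each line."""
    lines = list(init_lines)
    for _ in range(n_iters):
        resp = responsibilities(xs, ys, lines, sigma)
        for k in range(len(lines)):
            fit = weighted_line_fit(xs, ys, [r[k] for r in resp])
            if fit is not None:
                lines[k] = fit
    return lines
```

Each iteration reweights every data point by how well each component currently explains it, then refits each component to its softly assigned points, which is the general pattern the abstract attributes to the spline parameters.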

algorithm, probability, spline, (16 more...)

Country: Asia > Middle East > Jordan (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.75)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)

Selective Integration: A Model for Disparity Estimation

Gray, Michael S., Pouget, Alexandre, Zemel, Richard S., Nowlan, Steven J., Sejnowski, Terrence J.

Local disparity information is often sparse and noisy, which creates two conflicting demands when estimating disparity in an image region: the need to spatially average to get an accurate estimate, and the problem of not averaging over discontinuities. We have developed a network model of disparity estimation based on disparity-selective neurons, such as those found in the early stages of processing in visual cortex. The model can accurately estimate multiple disparities in a region, which may be caused by transparency or occlusion, in real images and random-dot stereograms. The use of a selection mechanism to selectively integrate reliable local disparity estimates results in superior performance compared to standard back-propagation and cross-correlation approaches. In addition, the representations learned with this selection mechanism are consistent with recent neurophysiological results of von der Heydt, Zhou, Friedman, and Poggio [8] for cells in cortical visual area V2. Combining multi-scale biologically-plausible image processing with the power of the mixture-of-experts learning algorithm represents a promising approach that yields both high performance and new insights into visual system function.

disparity, disparity threshold, local disparity network, (12 more...)

Country:

North America > United States > California > San Diego County > San Diego (0.05)
North America > United States > New York (0.05)
Asia > Middle East > Jordan (0.05)
North America > United States > California > San Diego County > La Jolla (0.04)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.35)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)

Spatiotemporal Coupling and Scaling of Natural Images and Human Visual Sensitivities

Dong, Dawei W.

We study the spatiotemporal correlation in natural time-varying images and explore the hypothesis that the visual system is concerned with the optimal coding of visual representation through spatiotemporal decorrelation of the input signal. Based on the measured spatiotemporal power spectrum, the transform needed to decorrelate the input signal is derived analytically and then compared with the actual processing observed in psychophysical experiments.
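As a hedged illustration of what decorrelation means in the frequency domain, the standard whitening argument runs as follows; it is not taken from the paper, and the symbols are assumptions: R(f, w) is the input power spectrum over spatial frequency f and temporal frequency w, K(f, w) is the gain of a linear filter, and c is an arbitrary constant. A filter whose gain is inversely proportional to the square root of the input spectrum yields a flat output spectrum, i.e. a decorrelated signal.

```latex
R_{\text{out}}(f, w) = |K(f, w)|^{2} \, R(f, w),
\qquad
|K(f, w)| = \frac{c}{\sqrt{R(f, w)}}
\;\Longrightarrow\;
R_{\text{out}}(f, w) = c^{2}
```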

power spectrum, sensitivity, spectrum, (12 more...)

Country:

Asia > Myanmar > Tanintharyi Region > Dawei (0.05)
North America > United States > New York (0.04)
North America > United States > California > Los Angeles County > Pasadena (0.04)

Technology:

Information Technology > Artificial Intelligence > Vision (0.47)
Information Technology > Data Science (0.34)

Compositionality, MDL Priors, and Object Recognition

Bienenstock, Elie, Geman, Stuart, Potter, Daniel

Images are ambiguous at each of many levels of a contextual hierarchy. Nevertheless, the high-level interpretation of most scenes is unambiguous, as evidenced by the superior performance of humans. This observation argues for global vision models, such as deformable templates. Unfortunately, such models are computationally intractable for unconstrained problems. We propose a compositional model in which primitives are recursively composed, subject to syntactic restrictions, to form tree-structured objects and object groupings. Ambiguity is propagated up the hierarchy in the form of multiple interpretations, which are later resolved by a Bayesian, equivalently minimum-description-length, cost functional.

arrangement, binding energy, interpretation, (16 more...)

Country:

North America > United States > New York (0.05)
North America > United States > Rhode Island > Providence County > Providence (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.68)

Technology:

Information Technology > Artificial Intelligence > Vision (0.85)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)

Learning Temporally Persistent Hierarchical Representations

Becker, Suzanna

A biologically motivated model of cortical self-organization is proposed. Context is combined with bottom-up information via a maximum likelihood cost function. Clusters of one or more units are modulated by a common contextual gating signal; they thereby organize themselves into mutually supportive predictors of abstract contextual features. The model was tested in its ability to discover viewpoint-invariant classes on a set of real image sequences of centered, gradually rotating faces. It performed considerably better than supervised back-propagation at generalizing to novel views from a small number of training examples.

information, learning, sequence, (14 more...)

Country:

North America > United States > California > San Mateo County > San Mateo (0.05)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > Canada > Ontario > Hamilton (0.04)
Asia > Middle East > Jordan (0.04)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.78)