AITopics

Partially Observable Markov Decision Processes (pO "MOPs) constitute an important class of reinforcement learning problems which present unique theoretical and computational difficulties. In the absence of the Markov property, popular reinforcement learning algorithms such as Q-Iearning may no longer be effective, and memory-based methods which remove partial observability via state-estimation are notoriously expensive. An alternative approach is to seek a stochastic memoryless policy which for each observation of the environment prescribes a probability distribution over available actions that maximizes the average reward per timestep. A reinforcement learning algorithm which learns a locally optimal stochastic memoryless policy has been proposed by Jaakkola, Singh and Jordan, but not empirically verified. We present a variation of this algorithm, discuss its implementation, and demonstrate its viability using four test problems.

algorithm, artificial intelligence, machine learning, (14 more...)

Country:

North America > United States (0.28)
Asia > Middle East > Jordan (0.25)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Meir, Ron, Maiorov, Vitaly

On the Optimality of Incremental Neural Network Algorithms

We study the approximation of functions by two-layer feedforward neural networks, focusing on incremental algorithms which greedily add units, estimating single unit parameters at each stage. As opposed to standard algorithms for fixed architectures, the optimization at each stage is performed over a small number of parameters, mitigating many of the difficult numerical problems inherent in high-dimensional nonlinear optimization. We establish upper bounds on the error incurred by the algorithm, when approximating functions from the Sobolev class, thereby extending previous results which only provided rates of convergence for functions in certain convex hulls of functional spaces. By comparing our results to recently derived lower bounds, we show that the greedy algorithms are nearly optimal. Combined with estimation error results for greedy algorithms, a strong case can be made for this type of approach.

approximation, artificial intelligence, neural network, (16 more...)

Country: Asia > Middle East > Israel (0.15)

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Herschkowitz, Didier, Nadal, Jean-Pierre

Unsupervised and Supervised Clustering: The Mutual Information between Parameters and Observations

Recent works in parameter estimation and neural coding have demonstrated that optimal performance are related to the mutual information between parameters and data. We consider the mutual information in the case where the dependency in the parameter (a vector 8) of the conditional p.d.f. of each observation (a vector

artificial intelligence, machine learning, mutual information, (12 more...)

Country:

Asia (0.16)
North America > United States (0.14)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.36)

Oyama, Eimei, Tachi, Susumu

Coordinate Transformation Learning of Hand Position Feedback Controller by Using Change of Position Error Norm

"goal-directed"; i.e., there is no direct way

equation, health & medicine, neural network, (12 more...)

Country: Asia > Japan > Honshū > Kantō (0.14)

Industry: Health & Medicine (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.71)

Rae, H. C., Sollich, Peter, Coolen, Anthony C. C.

On-Line Learning with Restricted Training Sets: Exact Solution as Benchmark for General Theories

Calculation of Q(t) and R(t) using (4, 5, 7, 9) to execute the path average and the average over sets is relatively straightforward, albeit tedious. We find that -"Yt(l -"Yt)

educational setting, neural network, vc-dimension, (15 more...)

Country: Asia (0.14)

Genre: Instructional Material > Online (0.50)

Industry: Education > Educational Setting > Online (0.86)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.96)

Bartlett, Peter L., Maiorov, Vitaly, Meir, Ron

Almost Linear VC Dimension Bounds for Piecewise Polynomial Networks

We compute upper and lower bounds on the VC dimension of feedforward networks of units with piecewise polynomial activation functions. We show that if the number of layers is fixed, then the VC dimension grows as W log W, where W is the number of parameters in the network. The VC dimension is an important measure of the complexity of a class of binaryvalued functions, since it characterizes the amount of data required for learning in the PAC setting (see [BEHW89, Vap82]). In this paper, we establish upper and lower bounds on the VC dimension of a specific class of multi-layered feedforward neural networks. Let F be the class of binary-valued functions computed by a feed forward neural network with W weights and k computational (non-input) units, each with a piecewise polynomial activation function.

artificial intelligence, computation unit, neural network, (13 more...)

Country:

Asia > Middle East > Israel (0.15)
Oceania > Australia (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Suematsu, Nobuo, Hayashi, Akira

A Reinforcement Learning Algorithm in Partially Observable Environments Using Short-Term Memory

We have proved that the model learned by BLHT converges to the optimal model in given hypothesis space, 1{, which provides the most accurate predictions of percepts and rewards, given short-term memory. We believe this fact provides a solid basis for BLHT, and BLHT can be compared favorably with other methods using short-term memory.

artificial intelligence, reinforcement learning, short-term memory, (16 more...)

Country: Asia > Japan (0.15)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Zhang, Liqing, Cichocki, Andrzej

Blind Separation of Filtered Sources Using State-Space Approach

In this paper we present a novel approach to multichannel blind separation/generalized deconvolution, assuming that both mixing and demixing models are described by stable linear state-space systems. Based on the minimization of Kullback-Leibler Divergence, we develop a novel learning algorithm to train the matrices in the output equation. To estimate the state of the demixing model, we introduce a new concept, called hidden innovation, to numerically implement the Kalman filter. Computer simulations are given to show the validity and high effectiveness of the state-space approach. The blind source separation problem is to recover independent sources from sensor outputs without assuming any priori knowledge of the original signals besides certain statistic features.

artificial intelligence, machine learning, matrix, (18 more...)

Country:

Asia > China (0.14)
North America > United States > Massachusetts (0.14)
Europe > Poland (0.14)
Asia > Japan (0.14)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.87)

Gdalyahu, Yoram, Weinshall, Daphna, Werman, Michael

A Randomized Algorithm for Pairwise Clustering

We present a stochastic clustering algorithm based on pairwise similarity of datapoints. Our method extends existing deterministic methods, including agglomerative algorithms, min-cut graph algorithms, and connected components. Thus it provides a common framework for all these methods. Our graph-based method differs from existing stochastic methods which are based on analogy to physical systems. The stochastic nature of our method makes it more robust against noise, including accidental edges and small spurious clusters. We demonstrate the superiority of our algorithm using an example with 3 spiraling bands and a lot of noise. 1 Introduction Clustering algorithms can be divided into two categories: those that require a vectorial representation of the data, and those which use only pairwise representation. In the former case, every data item must be represented as a vector in a real normed space, while in the second case only pairwise relations of similarity or dissimilarity are used.

algorithm, artificial intelligence, machine learning, (16 more...)

Country: Asia > Middle East > Israel (0.14)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Neuneier, Ralph, Mihatsch, Oliver

Risk Sensitive Reinforcement Learning

A directed generative model for binary data using a small number of hidden continuous units is investigated. The relationships between the correlations of the underlying continuous Gaussian variables and the binary output variables are utilized to learn the appropriate weights of the network. The advantages of this approach are illustrated on a translationally invariant binary distribution and on handwritten digit images. Introduction Principal Components Analysis (PCA) is a widely used statistical technique for representing data with a large number of variables [1]. It is based upon the assumption that although the data is embedded in a high dimensional vector space, most of the variability in the data is captured by a much lower climensional manifold. In particular for PCA, this manifold is described by a linear hyperplane whose characteristic directions are given by the eigenvectors of the correlation matrix with the largest eigenvalues. The success of PCA and closely related techniques such as Factor Analysis (FA) and PCA mixtures clearly indicate that much real world data exhibit the low dimensional manifold structure assumed by these models [2, 3]. However, the linear manifold structure of PCA is not appropriate for data with binary valued variables.

algorithm, banking & finance, neural network, (18 more...)