A Reinforcement Learning Algorithm in Partially Observable Environments Using Short-Term Memory

Neural Information Processing Systems

Since BLHT learns a stochastic model based on Bayesian Learning, the overfitting problem is reasonably solved. Moreover, BLHT has an efficient implementation. This paper shows that the model learned by BLHT converges to one which provides the most accurate predictions of percepts and rewards, given short-term memory.

1 INTRODUCTION Research on the Reinforcement Learning (RL) problem for partially observable environments has been gaining attention recently. This is mainly because the assumption that perfect and complete perception of the state of the environment is available to the learning agent, which many previous RL algorithms require, is not valid for many realistic environments.
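To make the flavour of such a predictor concrete, here is a minimal sketch of Bayesian prediction of the next percept conditioned on a length-k short-term memory, using Dirichlet-smoothed counts. The data structures and class names are our own illustration, not BLHT's published algorithm or implementation.

```python
# A minimal sketch (not BLHT itself): predict the next percept from the
# last k percept-action pairs using Dirichlet-smoothed counts.
from collections import defaultdict

class MemoryBasedPredictor:
    def __init__(self, k, n_percepts, prior=1.0):
        self.k = k                        # short-term memory length
        self.n = n_percepts
        self.prior = prior                # symmetric Dirichlet pseudo-count
        self.counts = defaultdict(lambda: [0.0] * n_percepts)

    def update(self, history, action, percept):
        """history: tuple of the last k (percept, action) pairs."""
        self.counts[(history, action)][percept] += 1.0

    def predict(self, history, action):
        """Posterior predictive distribution over the next percept."""
        c = self.counts[(history, action)]
        total = sum(c) + self.n * self.prior
        return [(ci + self.prior) / total for ci in c]
```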


Visualizing Group Structure

Neural Information Processing Systems

Cluster analysis is a fundamental principle in exploratory data analysis, providing the user with a description of the group structure of given data. A key problem in this context is the interpretation and visualization of clustering solutions in high-dimensional or abstract data spaces. In particular, probabilistic descriptions of the group structure, essential to capture inter-cluster relationships, are hardly assessable by simple inspection of the probabilistic assignment variables. We present a novel approach to the visualization of group structure. It is based on a statistical model of the object assignments which have been observed or estimated by a probabilistic clustering procedure. The objects or data points are embedded in a low-dimensional Euclidean space by approximating the observed data statistics with a Gaussian mixture model. The algorithm provides a new approach to the visualization of the inherent structure for a broad variety of data types, e.g.
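As a toy sketch of the embedding idea: position objects in 2-D so that the posteriors of an isotropic Gaussian mixture in the embedding space approximate the observed assignment probabilities. The cross-entropy objective and gradient updates below are our own illustration, not the paper's exact algorithm.

```python
# Toy sketch: embed objects in 2-D so that Gaussian-mixture posteriors in
# the embedding match observed assignment probabilities q (from some
# probabilistic clustering procedure). Centres are kept fixed for brevity.
import numpy as np

def embed(q, iters=500, lr=0.1, seed=0):
    """q: (N, K) assignment probabilities; returns 2-D positions X."""
    N, K = q.shape
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(N, 2)) * 0.01        # object positions (learned)
    mu = rng.normal(size=(K, 2))              # mixture centres (fixed here)
    for _ in range(iters):
        d2 = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(-1)   # (N, K)
        logp = -0.5 * d2
        p = np.exp(logp - logp.max(1, keepdims=True))
        p /= p.sum(1, keepdims=True)          # model posteriors
        # gradient of the cross-entropy -sum_ik q_ik log p_ik w.r.t. X
        diff = X[:, None, :] - mu[None, :, :]                  # (N, K, 2)
        gX = ((q - p)[:, :, None] * diff).sum(1)
        X -= lr * gX
    return X, mu
```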



Tight Bounds for the VC-Dimension of Piecewise Polynomial Networks

Neural Information Processing Systems

O(ws(s log d + log(dqh/s))) and O(ws((h/s) log q + log(dqh/s))) are upper bounds for the VC-dimension of a set of neural networks of units with piecewise polynomial activation functions, where s is the depth of the network, h is the number of hidden units, w is the number of adjustable parameters, q is the maximum number of polynomial segments of the activation function, and d is the maximum degree of the polynomials; also Ω(ws log(dqh/s)) is a lower bound for the VC-dimension of such a network set, and these bounds are tight for the cases s = Θ(h) and s constant. For the special case q = 1, the VC-dimension is Θ(ws log d).

1 Introduction In spite of its importance, we had been unable to obtain VC-dimension values for practical types of networks, until fairly tight upper and lower bounds were obtained ([6], [8], [9], and [10]) for linear threshold element networks in which all elements perform a threshold function on a weighted sum of inputs. This is mainly because the differentiability of the functions is needed to perform backpropagation or other learning algorithms. Unfortunately, explicit bounds obtained so far for the VC-dimension of sigmoidal networks exhibit large gaps (O(w²h²) ([3]); Ω(w log h) for bounded depth and Ω(wh) for unbounded depth) and are hard to improve. For the piecewise linear case, Maass obtained the result that the VC-dimension is O(w² log q), where q is the number of linear pieces of the function ([5]).
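Restated in display form, the bounds quoted above read as follows. Note that the '+' operators inside the upper bounds are our reconstruction (they were lost in extraction); with them, the upper bounds do match the stated lower bound for s = Θ(h) and for constant s.

```latex
% VC-dimension of depth-s networks with piecewise polynomial units
% (w parameters, h hidden units, q segments, polynomial degree d).
\begin{align*}
\mathrm{VCdim} &= O\!\Big(ws\big(s\log d + \log(dqh/s)\big)\Big),\\
\mathrm{VCdim} &= O\!\Big(ws\big((h/s)\log q + \log(dqh/s)\big)\Big),\\
\mathrm{VCdim} &= \Omega\big(ws\log(dqh/s)\big),\\
\mathrm{VCdim} &= \Theta(ws\log d) \quad \text{for } q = 1.
\end{align*}
```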



Bayesian PCA

Neural Information Processing Systems

The technique of principal component analysis (PCA) has recently been expressed as the maximum likelihood solution for a generative latent variable model. In this paper we use this probabilistic reformulation as the basis for a Bayesian treatment of PCA. Our key result is that the effective dimensionality of the latent space (equivalent to the number of retained principal components) can be determined automatically as part of the Bayesian inference procedure. An important application of this framework is to mixtures of probabilistic PCA models, in which each component can determine its own effective complexity.

1 Introduction Principal component analysis (PCA) is a widely used technique for data analysis. Recently Tipping and Bishop (1997b) showed that a specific form of generative latent variable model has the property that its maximum likelihood solution extracts the principal subspace of the observed data set.
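A minimal sketch of the ARD mechanism in the spirit of this abstract: EM for probabilistic PCA with a per-column precision alpha_i on the loading matrix W, so that columns the data do not support are driven to zero and the surviving columns give the effective dimensionality. The specific update equations and thresholds here are our assumption, not quoted from the paper.

```python
# Sketch of ARD-style Bayesian PCA (assumed updates, for illustration):
# columns of W with very large alpha have been switched off by the prior.
import numpy as np

def bayesian_pca(X, q, iters=200):
    """X: (N, d) data; q: maximum latent dimensionality."""
    N, d = X.shape
    Xc = X - X.mean(axis=0)                   # centred data
    W = np.random.default_rng(0).normal(size=(d, q)) * 0.1
    sigma2, alpha = 1.0, np.ones(q)
    for _ in range(iters):
        # E-step: posterior moments of the latent variables z_n
        M = W.T @ W + sigma2 * np.eye(q)
        Minv = np.linalg.inv(M)
        Ez = Xc @ W @ Minv                    # (N, q) posterior means
        Ezz = N * sigma2 * Minv + Ez.T @ Ez   # summed second moments
        # M-step with ARD regularisation on the columns of W
        W = (Xc.T @ Ez) @ np.linalg.inv(Ezz + sigma2 * np.diag(alpha))
        sigma2 = (np.sum(Xc**2) - 2 * np.sum((Xc @ W) * Ez)
                  + np.trace(Ezz @ W.T @ W)) / (N * d)
        alpha = d / (np.sum(W**2, axis=0) + 1e-12)
    effective_dim = int(np.sum(alpha < 1e3))  # heuristic cut-off
    return W, sigma2, effective_dim
```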



Perceiving without Learning: From Spirals to Inside/Outside Relations

Neural Information Processing Systems

As a benchmark task, the spiral problem is well known in neural networks. Unlike previous work that emphasizes learning, we approach the problem from a generic perspective that does not involve learning. We point out that the spiral problem is intrinsically connected to the inside/outside problem. A generic solution to both problems is proposed based on oscillatory correlation using a time delay network. Our simulation results are qualitatively consistent with human performance, and we interpret human limitations in terms of synchrony and time delays, both biologically plausible. As a special case, our network without time delays can always distinguish these figures regardless of shape, position, size, and orientation.


Learning Multi-Class Dynamics

Neural Information Processing Systems

Standard algorithms (e.g. Yule-Walker) are available for learning Auto-Regressive process models of simple, directly observable, dynamical processes. When sensor noise means that dynamics are observed only approximately, learning can still be achieved via Expectation-Maximisation (EM) together with Kalman Filtering. However, this does not handle more complex dynamics involving multiple classes of motion.
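For the directly observable case, the Yule-Walker step is standard: estimate the autocovariances and solve a Toeplitz system for the AR coefficients. The implementation details below (biased autocovariance estimate, SciPy's Toeplitz solver) are our own choices.

```python
# Yule-Walker fit of an AR(p) model x_t = a_1 x_{t-1} + ... + a_p x_{t-p} + e_t
import numpy as np
from scipy.linalg import solve_toeplitz

def yule_walker(x, p):
    """Return AR(p) coefficients and innovation variance for series x."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = len(x)
    # biased autocovariance estimates r_0 .. r_p
    r = np.array([x[:n - k] @ x[k:] / n for k in range(p + 1)])
    a = solve_toeplitz(r[:-1], r[1:])         # solve R a = r
    noise_var = r[0] - a @ r[1:]
    return a, noise_var

# usage: recover the coefficients of a synthetic AR(2) process
rng = np.random.default_rng(1)
x = np.zeros(5000)
for t in range(2, 5000):
    x[t] = 0.75 * x[t - 1] - 0.5 * x[t - 2] + rng.normal()
print(yule_walker(x, 2))                      # approx ([0.75, -0.5], 1.0)
```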


Convergence of the Wake-Sleep Algorithm

Neural Information Processing Systems

The WS (Wake-Sleep) algorithm is a simple learning rule for models with hidden variables. It is shown that this algorithm can be applied to a factor analysis model, which is a linear version of the Helmholtz machine. But even for a factor analysis model, general convergence has not been proved theoretically.
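For reference, here is a minimal sketch of wake-sleep learning for a one-factor linear Helmholtz machine (generative model x = g·z + noise with z ~ N(0,1), recognition model z ≈ r·x), trained with delta rules. Learning rates, noise scales, and dimensions are assumptions for illustration only.

```python
# Wake-sleep sketch for the one-factor linear Helmholtz machine.
import numpy as np

rng = np.random.default_rng(0)
d = 5
g_true = rng.normal(size=d)               # ground-truth factor loadings
g = rng.normal(size=d) * 0.1              # generative weights (learned)
r = rng.normal(size=d) * 0.1              # recognition weights (learned)
lr = 0.01

for step in range(20000):
    # wake phase: recognise a real observation, fit the generator to it
    x = g_true * rng.normal() + 0.1 * rng.normal(size=d)
    z = r @ x + 0.1 * rng.normal()        # sample from recognition model
    g += lr * (x - g * z) * z             # delta rule on generative weights

    # sleep phase: dream from the generator, fit the recogniser to it
    z_dream = rng.normal()
    x_dream = g * z_dream + 0.1 * rng.normal(size=d)
    r += lr * (z_dream - r @ x_dream) * x_dream   # delta rule on recognition

# in this simple setting g typically aligns with +/- g_true, though (as the
# abstract notes) general convergence is not guaranteed
print(np.round(g, 2), np.round(g_true, 2))
```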