AITopics

1312.7167

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

Osband, Ian, Russo, Daniel, Van Roy, Benjamin

(More) Efficient Reinforcement Learning via Posterior Sampling

arXiv.org Machine Learning, Dec-26-2013

Most provably-efficient learning algorithms introduce optimism about poorly-understood states and actions to encourage exploration. We study an alternative approach for efficient exploration, posterior sampling for reinforcement learning (PSRL). This algorithm proceeds in repeated episodes of known duration. At the start of each episode, PSRL updates a prior distribution over Markov decision processes and takes one sample from this posterior. PSRL then follows the policy that is optimal for this sample during the episode. The algorithm is conceptually simple, computationally efficient and allows an agent to encode prior knowledge in a natural way. We establish an $\tilde{O}(\tau S \sqrt{AT})$ bound on the expected regret, where $T$ is time, $\tau$ is the episode length and $S$ and $A$ are the cardinalities of the state and action spaces. This bound is one of the first for an algorithm not based on optimism, and close to the state of the art for any reinforcement learning algorithm. We show through simulation that PSRL significantly outperforms existing algorithms with similar regret bounds.
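The episode loop described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the conjugate priors (Dirichlet over transitions, unit-variance Gaussian over mean rewards), the fixed start state 0, and the function names `solve_finite_horizon` and `psrl` are all assumptions made for the example.

```python
import numpy as np

def solve_finite_horizon(P, R, tau):
    """Backward induction on a sampled MDP.
    P: (S, A, S) transition probabilities, R: (S, A) mean rewards.
    Returns a time-dependent greedy policy of shape (tau, S)."""
    S = P.shape[0]
    V = np.zeros(S)
    policy = np.zeros((tau, S), dtype=int)
    for h in reversed(range(tau)):
        Q = R + P @ V                  # (S, A) action values at step h
        policy[h] = Q.argmax(axis=1)
        V = Q.max(axis=1)
    return policy

def psrl(true_P, true_R, tau, n_episodes, seed=0):
    """Posterior sampling over episodes of known length tau (hypothetical setup)."""
    rng = np.random.default_rng(seed)
    S, A, _ = true_P.shape
    # Assumed priors: Dirichlet(1, ..., 1) on each transition row,
    # N(0, 1) on each mean reward with unit-variance observation noise.
    counts = np.ones((S, A, S))
    r_sum = np.zeros((S, A))
    r_cnt = np.zeros((S, A))
    total_reward = 0.0
    for _ in range(n_episodes):
        # Take one MDP sample from the current posterior.
        P = np.array([[rng.dirichlet(counts[s, a]) for a in range(A)]
                      for s in range(S)])
        R = rng.normal(r_sum / (r_cnt + 1.0), 1.0 / np.sqrt(r_cnt + 1.0))
        # Follow the policy that is optimal for the sample for the whole episode.
        policy = solve_finite_horizon(P, R, tau)
        s = 0  # assumed fixed start state
        for h in range(tau):
            a = policy[h, s]
            r = true_R[s, a] + rng.normal()
            s_next = rng.choice(S, p=true_P[s, a])
            # Record the data; it only affects the next episode's sample.
            counts[s, a, s_next] += 1
            r_sum[s, a] += r
            r_cnt[s, a] += 1
            total_reward += r
            s = s_next
    return total_reward, counts
```

Exploration here comes only from the randomness of the posterior sample; no optimism bonus is added to the value estimates.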

artificial intelligence, machine learning, reinforcement learning, (17 more...)

1306.094

Country: North America > United States > California > Santa Clara County (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)

Mousavi, Mohammad, Glynn, Peter W.

Shape-constrained Estimation of Value Functions

We present a fully nonparametric method to estimate the value function, via simulation, in the context of expected infinite-horizon discounted rewards for Markov chains. Estimating such value functions plays an important role in approximate dynamic programming and applied probability in general. We incorporate "soft information" into the estimation algorithm, such as knowledge of convexity, monotonicity, or Lipschitz constants. In the presence of such information, a nonparametric estimator for the value function can be computed that is provably consistent as the simulated time horizon tends to infinity. As an application, we implement our method on price tolling agreement contracts in energy markets.

artificial intelligence, optimization problem, value function, (16 more...)

1312.7035

Country: North America > United States (0.46)

Genre: Research Report > New Finding (0.46)

Industry: Energy > Oil & Gas (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)

Chamroukhi, Faicel, Samé, Allou, Govaert, Gérard, Aknin, Patrice

A regression model with a hidden logistic process for feature extraction from time series

A new approach for feature extraction from time series is proposed in this paper. This approach consists of a specific regression model incorporating a discrete hidden logistic process. The model parameters are estimated by maximum likelihood using a dedicated Expectation Maximization (EM) algorithm. The parameters of the hidden logistic process, in the inner loop of the EM algorithm, are estimated using a multi-class Iterative Reweighted Least-Squares (IRLS) algorithm. A piecewise regression algorithm and its iterative variant have also been considered for comparison. An experimental study using simulated and real data shows that the proposed approach performs well.
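To illustrate how a hidden logistic process can switch between regression regimes, the sketch below computes the mean curve of such a model. The abstract does not give the parameterization, so this assumes one plausible form: regime probabilities are a softmax of linear functions of time, and each regime has its own polynomial. The names `logistic_proportions` and `mean_curve` and the argument layouts are hypothetical.

```python
import numpy as np

def logistic_proportions(t, w):
    """Regime probabilities at each time point.
    t: (n,) time stamps; w: (K, 2) rows of (intercept, slope), assumed form.
    Returns an (n, K) array whose rows sum to 1."""
    logits = w[:, 0] + np.outer(t, w[:, 1])        # (n, K)
    logits -= logits.max(axis=1, keepdims=True)    # numerical stability
    e = np.exp(logits)
    return e / e.sum(axis=1, keepdims=True)

def mean_curve(t, w, beta):
    """Expected response: regime polynomials weighted by regime probabilities.
    beta: (K, p + 1) polynomial coefficients, lowest degree first."""
    X = np.vander(t, beta.shape[1], increasing=True)  # (n, p + 1) design matrix
    pi = logistic_proportions(t, w)
    return (pi * (X @ beta.T)).sum(axis=1)
```

Under this assumed form, large slopes in `w` give nearly abrupt regime changes, while small slopes give smooth transitions between regimes.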

algorithm, artificial intelligence, machine learning, (15 more...)

1312.7001

Country: Europe (0.28)

Genre:

Research Report > New Finding (0.34)
Research Report > Experimental Study (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)

Chamroukhi, Faicel, Samé, Allou, Govaert, Gérard, Aknin, Patrice

A hidden process regression model for functional data description. Application to curve discrimination

A new approach for functional data description is proposed in this paper. It consists of a regression model with a discrete hidden logistic process which is adapted for modeling curves with abrupt or smooth regime changes. The model parameters are estimated in a maximum likelihood framework through a dedicated Expectation Maximization (EM) algorithm. From the proposed generative model, a curve discrimination rule is derived using the Maximum A Posteriori rule. The proposed model is evaluated using simulated curves and real-world curves acquired during railway switch operations, by performing comparisons with the piecewise regression approach in terms of curve modeling and classification.

artificial intelligence, machine learning, regression model, (14 more...)

doi: 10.1016/j.neucom.2009.12.023

1312.6968

Country: Europe > France (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.49)

Chamroukhi, Faicel, Samé, Allou, Aknin, Patrice, Govaert, Gérard

Model-based clustering with Hidden Markov Model regression for time series with regime changes

This paper introduces a novel model-based clustering approach for clustering time series which present changes in regime. It consists of a mixture of polynomial regressions governed by hidden Markov chains. The underlying hidden process for each cluster successively activates several polynomial regimes over time. The parameter estimation is performed by the maximum likelihood method through a dedicated Expectation-Maximization (EM) algorithm. The proposed approach is evaluated using simulated time series and real-world time series from a railway diagnosis application. Comparisons with existing approaches for time series clustering, including the standard EM for Gaussian mixtures, $K$-means clustering, the standard mixture of regression models and mixture of Hidden Markov Models, demonstrate the effectiveness of the proposed approach.

artificial intelligence, machine learning, time series, (14 more...)

doi: 10.1109/IJCNN.2011.6033590

1312.7024

Country: North America > United States (0.68)

Genre: Research Report (0.84)

Industry: Transportation > Ground > Rail (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Chamroukhi, Faicel, Glotin, Hervé

Mixture model-based functional discriminant analysis for curve classification

Statistical approaches for Functional Data Analysis concern the paradigm in which the individuals are functions or curves rather than finite-dimensional vectors. In this paper, we focus on the modeling and classification of functional data that are temporal curves presenting regime changes over time. More specifically, we propose a new mixture model-based discriminant analysis approach for functional data using a specific hidden process regression model. Our approach is designed both to handle complex-shaped classes of curves, where each class is composed of several sub-classes, and to deal with the regime changes within each homogeneous sub-class. The model explicitly integrates the heterogeneity of each class of curves via a mixture model formulation, and the regime changes within each sub-class through a hidden logistic process. The approach therefore allows flexible curve models to be fitted to each class of complex-shaped curves presenting regime changes through an unsupervised learning scheme, automatically summarizing each class into a finite number of homogeneous clusters, each of which is decomposed into several regimes. The model parameters are learned by maximizing the observed-data log-likelihood for each class using a dedicated expectation-maximization (EM) algorithm. Comparisons on simulated and real data with alternative approaches, including functional linear discriminant analysis and functional mixture discriminant analysis with polynomial regression mixtures and spline regression mixtures, show that the proposed approach gives better discrimination results and significantly improves the curve approximation.

artificial intelligence, bayesian inference, machine learning, (17 more...)

doi: 10.1109/IJCNN.2012.6252818

1312.7018

Country: North America > United States (0.46)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.69)

Chamroukhi, Faicel, Samé, Allou, Govaert, Gérard, Aknin, Patrice

Classification automatique de données temporelles en classes ordonnées

This paper proposes a method for segmenting temporal data into ordered classes. It is based on mixture models and a discrete latent process, which makes it possible to activate the classes successively. The classification can be performed by maximizing the likelihood via the EM algorithm or by simultaneously optimizing the model parameters and the partition via the CEM algorithm. These two algorithms can be seen as alternatives to Fisher's algorithm that improve on its computing time.

algorithme, artificial intelligence, machine learning, (16 more...)

1312.7011

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.36)

Chamroukhi, Faicel, Glotin, Hervé, Rabouy, Céline

Functional Mixture Discriminant Analysis with hidden process regression for curve classification

We present a new mixture model-based discriminant analysis approach for functional data using a specific hidden process regression model. The approach allows for fitting flexible curve-models to each class of complex-shaped curves presenting regime changes. The model parameters are learned by maximizing the observed-data log-likelihood for each class by using a dedicated expectation-maximization (EM) algorithm. Comparisons on simulated data with alternative approaches show that the proposed approach provides better results.

artificial intelligence, machine learning, regression model, (16 more...)

1312.7007

Genre: Research Report (0.50)

Industry: Health & Medicine (0.31)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.70)

Onanena, Raïssa, Chamroukhi, Faicel, Oukhellou, Latifa, Candusso, Denis, Aknin, Patrice, Hissel, Daniel

Supervised learning of a regression model based on latent process. Application to the estimation of fuel cell life time

This paper describes a pattern recognition approach that aims to estimate fuel cell duration time from electrochemical impedance spectroscopy measurements. It first extracts features from both the real and imaginary parts of the impedance spectrum. A parametric model is used for the real part, whereas a regression model with latent variables is used for the imaginary part. Then, a linear regression model using different subsets of the extracted features is used to estimate the fuel cell duration time. The performance of the proposed approach is evaluated on an experimental data set to show its feasibility. This could open interesting perspectives for predictive maintenance policies for fuel cells.

artificial intelligence, machine learning, regression model, (17 more...)

1312.7003

Country: Europe (0.47)

Genre: Research Report (0.50)

Industry:

Energy > Renewable > Hydrogen (1.00)
Energy > Energy Storage (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)