Markov Models
Vulcan: A Monte Carlo Algorithm for Large Chance Constrained MDPs with Risk Bounding Functions
Ayton, Benjamin J, Williams, Brian C
Chance Constrained Markov Decision Processes maximize reward subject to a bounded probability of failure, and have been frequently applied for planning with potentially dangerous outcomes or unknown environments. Solution algorithms have required strong heuristics or have been limited to relatively small problems with up to millions of states, because the optimal action to take from a given state depends on the probability of failure in the rest of the policy, leading to a coupled problem that is difficult to solve. In this paper we examine a generalization of a CCMDP that trades off probability of failure against reward through a functional relationship. We derive a constraint that can be applied to each state history in a policy individually, and which guarantees that the chance constraint will be satisfied. The approach decouples states in the CCMDP, so that large problems can be solved efficiently. We then introduce Vulcan, which uses our constraint in order to apply Monte Carlo Tree Search to CCMDPs. Vulcan can be applied to problems where it is unfeasible to generate the entire state space, and policies must be returned in an anytime manner. We show that Vulcan and its variants run tens to hundreds of times faster than linear programming methods, and over ten times faster than heuristic based methods, all without the need for a heuristic, and returning solutions with a mean suboptimality on the order of a few percent. Finally, we use Vulcan to solve for a chance constrained policy in a CCMDP with over $10^{13}$ states in 3 minutes.
Bounded Rational Decision-Making with Adaptive Neural Network Priors
Hihn, Heinke, Gottwald, Sebastian, Braun, Daniel A.
Bounded rationality investigates utility-optimizing decision-makers with limited information-processing power. In particular, information theoretic bounded rationality models formalize resource constraints abstractly in terms of relative Shannon information, namely the Kullback-Leibler Divergence between the agents' prior and posterior policy. Between prior and posterior lies an anytime deliberation process that can be instantiated by sample-based evaluations of the utility function through Markov Chain Monte Carlo (MCMC) optimization. The most simple model assumes a fixed prior and can relate abstract information-theoretic processing costs to the number of sample evaluations. However, more advanced models would also address the question of learning, that is how the prior is adapted over time such that generated prior proposals become more efficient. In this work we investigate generative neural networks as priors that are optimized concurrently with anytime sample-based decision-making processes such as MCMC. We evaluate this approach on toy examples.
Three-Stage Speaker Verification Architecture in Emotional Talking Environments
Shahin, Ismail, Nassif, Ali Bou
Speaker verification performance in neutral talking environment is usually high, while it is sharply decreased in emotional talking environments. This performance degradation in emotional environments is due to the problem of mismatch between training in neutral environment while testing in emotional environments. In this work, a three-stage speaker verification architecture has been proposed to enhance speaker verification performance in emotional environments. This architecture is comprised of three cascaded stages: gender identification stage followed by an emotion identification stage followed by a speaker verification stage. The proposed framework has been evaluated on two distinct and independent emotional speech datasets: in-house dataset and Emotional Prosody Speech and Transcripts dataset. Our results show that speaker verification based on both gender information and emotion information is superior to each of speaker verification based on gender information only, emotion information only, and neither gender information nor emotion information. The attained average speaker verification performance based on the proposed framework is very alike to that attained in subjective assessment by human listeners.
Prognostics Estimations with Dynamic States
Bao, Rong-Jing, Rong, Hai-Jun, Yang, Zhi-Xin, Chen, Badong
The health state assessment and remaining useful life (RUL) estimation play very important roles in prognostics and health management (PHM), owing to their abilities to reduce the maintenance and improve the safety of machines or equipment. However, they generally suffer from this problem of lacking prior knowledge to pre-define the exact failure thresholds for a machinery operating in a dynamic environment with a high level of uncertainty. In this case, dynamic thresholds depicted by the discrete states is a very attractive way to estimate the RUL of a dynamic machinery. Currently, there are only very few works considering the dynamic thresholds, and these studies adopted different algorithms to determine the discrete states and predict the continuous states separately, which largely increases the complexity of the learning process. In this paper, we propose a novel prognostics approach for RUL estimation of aero-engines with self-joint prediction of continuous and discrete states, wherein the prediction of continuous and discrete states are conducted simultaneously and dynamically within one learning framework.
Enhancing Stock Market Prediction with Extended Coupled Hidden Markov Model over Multi-Sourced Data
Zhang, Xi, Li, Yixuan, Wang, Senzhang, Fang, Binxing, Yu, Philip S.
Noname manuscript No. (will be inserted by the editor) Abstract Traditional stock market prediction methods commonly only utilize the historical trading data, ignoring the fact that stock market fluctuations can be impacted by various other information sources such as stock related events. Although some recent works propose event-driven prediction approaches by considering the event data, how to leverage the joint impacts of multiple data sources still remains an open research problem. In this work, we study how to explore multiple data sources to improve the performance of the stock prediction. We introduce an Extended Coupled Hidden Markov Model incorporating the news events with the historical trading data. To address the data sparsity issue of news events for each single stock, we further study the fluctuation correlations between the stocks and incorporate the correlations into the model to facilitate the prediction task. Keywords Stock prediction ยท Event extraction ยท Information fusion ยท Hidden Markov Model 1 Introduction The capability of predicting the stock price movement directions can offer enormous arbitrage profit opportunities and thus attract much attention from both academia and industry. Conventional quantitative trading prediction methods are mostly based on the historical trading data such as prices and volumes. According to the Efficient Market Hypothesis (EMH) [16], stock prices are the reflection of all known information. Key Laboratory of Trustworthy Distributed Computing and Service (Beijing University of Posts and Telecommunications), Ministry of Education, Beijing, China. As more and more investors obtain information from social media [49, 57], the indicators obtained from Web news articles and social networks can also have significant impacts on the stock prices, and thus such factors that can derive the stock price fluctuations must be considered. As such, there are growing research interests in exploring financial text documents such as news articles, financial standings to facilitate the stock prediction task.
r/MachineLearning - [P] Tabular implementations of 30 MDP and POMDP papers
One issue might be that many people have moved to ALE & OpenAI's Gym interface for API/environment implementations, and Python for implementation language. Your C library makes Python sound like a very second-class citizen, which is discouraging, and C is increasingly disfavored for its complexity & low-level nature. Just to get started with this, one has to learn the'Cassandra POMDP format', whatever that is, and then deal with C rather than Python. Are there that many people who want to solve MDPs in a tabular form whose preferred language is C and love defining their models in Cassandra POMDP format? You also don't have any impressive use-cases or demos of things which one can do easily in AIToolbox which can't be done elsewhere as easily, or as fast, or at all - what gives me any confidence that this is really mature and I won't simply invest days into learning it only to discover some severe limitation which makes it useless for me?
Bayesian Classifier for Route Prediction with Markov Chains
Epperlein, Jonathan P., Monteil, Julien, Liu, Mingming, Gu, Yingqi, Zhuk, Sergiy, Shorten, Robert
In the presented framework, known journey patterns are modelled as stochastic processes, emitting the road segments visited during the journey, and the ongoing journey is predicted by updating the posterior probability of each journey pattern given the road segments visited so far. In this contribution, we use Markov chains as models for the journey patterns, and consider the prediction as final, once one of the posterior probabilities crosses a predefined threshold. Despite the simplicity of both, examples run on a synthetic dataset demonstrate high accuracy of the made predictions.
Using Machine Learning to Assess Physician Competence: A... : Academic Medicine
Purpose: To identify the different machine learning (ML) techniques that have been applied to automate physician competence assessment and evaluate how these techniques can be used to assess different competence domains in several medical specialties. Method: In May 2017, MEDLINE, EMBASE, PsycINFO, Web of Science, ACM Digital Library, IEEE Xplore Digital Library, PROSPERO, and Cochrane Database of Systematic Reviews were searched for articles published from inception to April 30, 2017. Studies were included if they applied at least one ML technique to assess medical students', residents', fellows', or attending physicians' competence. Information on sample size, participants, study setting and design, medical specialty, ML techniques, competence domains, outcomes, and methodological quality was extracted. MERSQI was used to evaluate quality, and a qualitative narrative synthesis of the medical specialties, ML techniques, and competence domains was conducted.
Online ICA: Understanding Global Dynamics of Nonconvex Optimization via Diffusion Processes
Li, Chris Junchi, Wang, Zhaoran, Liu, Han
Solving statistical learning problems often involves nonconvex optimization. Despite the empirical success of nonconvex statistical optimization methods, their global dynamics, especially convergence to the desirable local minima, remain less well understood in theory. In this paper, we propose a new analytic paradigm based on diffusion processes to characterize the global dynamics of nonconvex statistical optimization. As a concrete example, we study stochastic gradient descent (SGD) for the tensor decomposition formulation of independent component analysis. In particular, we cast different phases of SGD into diffusion processes, i.e., solutions to stochastic differential equations. Initialized from an unstable equilibrium, the global dynamics of SGD transit over three consecutive phases: (i) an unstable Ornstein-Uhlenbeck process slowly departing from the initialization, (ii) the solution to an ordinary differential equation, which quickly evolves towards the desirable local minimum, and (iii) a stable Ornstein-Uhlenbeck process oscillating around the desirable local minimum. Our proof techniques are based upon Stroock and Varadhan's weak convergence of Markov chains to diffusion processes, which are of independent interest.
Diffusion Approximations for Online Principal Component Estimation and Global Convergence
Li, Chris Junchi, Wang, Mengdi, Liu, Han, Zhang, Tong
In this paper, we propose to adopt the diffusion approximation tools to study the dynamics of Oja's iteration which is an online stochastic gradient descent method for the principal component analysis. Oja's iteration maintains a running estimate of the true principal component from streaming data and enjoys less temporal and spatial complexities. We show that the Oja's iteration for the top eigenvector generates a continuous-state discrete-time Markov chain over the unit sphere. We characterize the Oja's iteration in three phases using diffusion approximation and weak convergence tools. Our three-phase analysis further provides a finite-sample error bound for the running estimate, which matches the minimax information lower bound for principal component analysis under the additional assumption of bounded samples.