AITopics | Markov Models

Collaborating Authors

Markov Models

News Overviews Instructional Materials AI-Alerts Classics

Particle Probability Hypothesis Density Filter based on Pairwise Markov Chains

arXiv.org Artificial IntelligenceNov-27-2018

Most multi-target tracking filters assume that one target and its observation follow a Hidden Markov Chain (HMC) model, but the implicit independence assumption of HMC model is invalid in many practical applications, and a Pairwise Markov Chain (PMC) model is more universally suitable than traditional HMC model. A particle probability hypothesis density filter based on PMC model (PF-PMC-PHD) is proposed for the nonlinear multi-target tracking system. Simulation results show the effectiveness of PF-PMC-PHD filter, and that the tracking performance of PF-PMC-PHD filter is superior to the particle PHD filter based on HMC model in a scenario where we kept the local physical properties of nonlinear and Gaussian HMC models while relaxing their independence assumption.

artificial intelligence, machine learning, pf-pmc-phd filter, (13 more...)

arXiv.org Artificial Intelligence

1811.12211

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.35)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Add feedback

Optimization of Information-Seeking Dialogue Strategy for Argumentation-Based Dialogue System

Katsumi, Hisao, Hiraoka, Takuya, Yoshino, Koichiro, Yamamoto, Kazeto, Motoura, Shota, Sadamasa, Kunihiko, Nakamura, Satoshi

arXiv.org Artificial IntelligenceNov-26-2018

Argumentation-based dialogue systems, which can handle and exchange arguments through dialogue, have been widely researched. It is required that these systems have sufficient supporting information to argue their claims rationally; however, the systems often do not have enough of such information in realistic situations. One way to fill in the gap is acquiring such missing information from dialogue partners (information-seeking dialogue). Existing information-seeking dialogue systems are based on handcrafted dialogue strategies that exhaustively examine missing information. However, the proposed strategies are not specialized in collecting information for constructing rational arguments. Moreover, the number of system's inquiry candidates grows in accordance with the size of the argument set that the system deal with. In this paper, we formalize the process of information-seeking dialogue as Markov decision processes (MDPs) and apply deep reinforcement learning (DRL) for automatically optimizing a dialogue strategy. By utilizing DRL, our dialogue strategy can successfully minimize objective functions, the number of turns it takes for our system to collect necessary information in a dialogue. We conducted dialogue experiments using two datasets from different domains of argumentative dialogue. Experimental results show that the proposed formalization based on MDP works well, and the policy optimized by DRL outperformed existing heuristic dialogue strategies.

machine learning, natural language, reinforcement learning, (18 more...)

arXiv.org Artificial Intelligence

1811.10728

Genre: Research Report > New Finding (0.48)

Industry: Law (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.66)

Add feedback

What Should I Learn First: Introducing LectureBank for NLP Education and Prerequisite Chain Learning

Li, Irene, Fabbri, Alexander R., Tung, Robert R., Radev, Dragomir R.

arXiv.org Machine LearningNov-26-2018

Recent years have witnessed the rising popularity of Natural Language Processing (NLP) and related fields such as Artificial Intelligence (AI) and Machine Learning (ML). Many online courses and resources are available even for those without a strong background in the field. Often the student is curious about a specific topic but does not quite know where to begin studying. To answer the question of "what should one learn first," we apply an embedding-based method to learn prerequisite relations for course concepts in the domain of NLP. We introduce LectureBank, a dataset containing 1,352 English lecture files collected from university courses which are each classified according to an existing taxonomy as well as 208 manually-labeled prerequisite relation topics, which is publicly available. The dataset will be useful for educational purposes such as lecture preparation and organization as well as applications such as reading list generation. Additionally, we experiment with neural graph-based networks and non-neural classifiers to learn these prerequisite relations from our dataset.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Machine Learning

1811.12181

Country: North America > United States > California (0.28)

Genre:

Instructional Material > Course Syllabus & Notes (1.00)
Research Report (0.82)

Industry:

Education > Educational Setting > Online (1.00)
Education > Educational Technology > Educational Software > Computer Based Training (0.89)

Technology:

Information Technology > Enterprise Applications > Human Resources > Learning Management (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
(4 more...)

Add feedback

A Policy Gradient Method with Variance Reduction for Uplift Modeling

Li, Chenchen, Yan, Xiang, Deng, Xiaotie, Qi, Yuan, Chu, Wei, Song, Le, Qiao, Junlong, He, Jianshan, Xiong, Junwu

arXiv.org Machine LearningNov-25-2018

Uplift modeling aims to directly model the incremental impact of a treatment on an individual response. It has been widely and successfully used in healthcare analytics and business operations, where one tries to measure the net effect of a new medicine on patients or to understand the impact of a marketing campaign on company revenue. In this work, we address the problem from a new angle and reformulate it as a Markov Decision Process (MDP). This new formulation allows us to handle the lack of explicit labels, to deal with any number of actions (in comparison to the normal two action uplift modeling), and to apply it to applications with responses of general types, which is a challenging task for previous methods. Furthermore, we also design an unbiased metric for more accurate offline evaluation of uplift effects, set up a better reward function for the policy gradient method to solve the problem and adopt some action-based baselines to reduce variance. We conducted extensive experiments on both a synthetic dataset and real-world scenarios, and showed that our method can achieve significant improvement over previous methods.

dataset, uplift modeling, uplift response, (15 more...)

arXiv.org Machine Learning

1811.10158

Country:

North America > Montserrat (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Genre:

Research Report > Experimental Study (0.93)
Research Report > New Finding (0.68)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science > Data Mining (0.94)
(2 more...)

Add feedback

Forecasting market states

Procacci, Pier Francesco, Aste, Tomaso

arXiv.org Machine LearningNov-25-2018

In common terminology, there are periods of'bull' market in which prices are more likely to rise and periods of'bear' market in which prices are more likely to fall. These different'states' of markets are commonly attributed in literature to unobservable, orlatent, regimes representing a set of macroeconomic, market and sentiment variables. Many time series models presented in literature tried to capture this phenomenon. Among the most popular methods, it is worth mentioning the TAR models (Tong 1978), trying to estimate'structural breaks' in the time series process, and the Markov Switching models (Hamilton 1989), where the change in regimes are parametrized by means of an unobserved state variable typically modelledas Markov chain. However, the application of TAR models in finance is frequently criticized since it cannot be established with certainty when a structural break has occurred in economic time series and the prior knowledge of major economic events could lead to bias in inference (Campbellet al. 1997). Markov switching models, on the other hand, are highly affected by the curse of dimensionality. In particular, for slightly more complex dynamics than the original proposal (Hamilton 1989), we need to rely on variational inference techniques or MCMC methods (Tsay 2005, Kim and Nelson 1999). This implies that, in a multivariate context and particularly if November 27, 2018 ForecastingMarketStates v2.1

artificial intelligence, machine learning, precision matrix, (14 more...)

arXiv.org Machine Learning

1807.05836

Country:

North America > United States (0.68)
Europe (0.46)

Genre: Research Report (1.00)

Industry:

Banking & Finance > Trading (1.00)
Banking & Finance > Economy (0.88)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)

Add feedback

Markov Chain Block Coordinate Descent

Sun, Tao, Sun, Yuejiao, Xu, Yangyang, Yin, Wotao

arXiv.org Machine LearningNov-21-2018

The method of block coordinate gradient descent (BCD) has been a powerful method for large-scale optimization. This paper considers the BCD method that successively updates a series of blocks selected according to a Markov chain. This kind of block selection is neither i.i.d. random nor cyclic. On the other hand, it is a natural choice for some applications in distributed optimization and Markov decision process, where i.i.d. random and cyclic selections are either infeasible or very expensive. By applying mixing-time properties of a Markov chain, we prove convergence of Markov chain BCD for minimizing Lipschitz differentiable functions, which can be nonconvex. When the functions are convex and strongly convex, we establish both sublinear and linear convergence rates, respectively. We also present a method of Markov chain inertial BCD. Finally, we discuss potential applications.

artificial intelligence, machine learning, markov chain, (15 more...)

arXiv.org Machine Learning

1811.0899

Country: North America > United States > California (0.28)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Add feedback

Black-Box Autoregressive Density Estimation for State-Space Models

Ryder, Tom, Golighty, Andrew, McGough, A. Stephen, Prangle, Dennis

arXiv.org Machine LearningNov-21-2018

State-space models (SSMs) provide a flexible framework for modelling time-series data. Consequently, SSMs are ubiquitously applied in areas such as engineering, econometrics and epidemiology. In this paper we provide a fast approach for approximate Bayesian inference in SSMs using the tools of deep learning and variational inference.

artificial intelligence, inference, machine learning, (12 more...)

arXiv.org Machine Learning

1811.08337

Country:

Europe (1.00)
North America > United States > New York > New York County > New York City (0.14)

Genre: Research Report (0.64)

Industry:

Health & Medicine > Epidemiology (0.67)
Health & Medicine > Therapeutic Area (0.50)
Transportation > Air (0.42)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)

Add feedback

Model-Based Reinforcement Learning in Contextual Decision Processes

Sun, Wen, Jiang, Nan, Krishnamurthy, Akshay, Agarwal, Alekh, Langford, John

arXiv.org Machine LearningNov-20-2018

We study the sample complexity of model-based reinforcement learning in general contextual decision processes. We design new algorithms for RL with an abstract model class and analyze their statistical properties. Our algorithms have sample complexity governed by a new structural parameter called the witness rank, which we show to be small in several settings of interest, including Factored MDPs and reactive POMDPs. We also show that the witness rank of a problem is never larger than the recently proposed Bellman rank parameter governing the sample complexity of the model-free algorithm OLIVE (Jiang et al., 2017), the only other provably sample efficient algorithm at this level of generality. Focusing on the special case of Factored MDPs, we prove an exponential lower bound for all model-free approaches, including OLIVE, which when combined with our algorithmic results demonstrates exponential separation between model-based and model-free RL in some rich-observation settings.

artificial intelligence, machine learning, reinforcement learning, (14 more...)

arXiv.org Machine Learning

1811.0854

Country: North America > United States (0.67)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.66)

Add feedback

Gen-Oja: A Simple and Efficient Algorithm for Streaming Generalized Eigenvector Computation

Bhatia, Kush, Pacchiano, Aldo, Flammarion, Nicolas, Bartlett, Peter L., Jordan, Michael I.

arXiv.org Machine LearningNov-20-2018

In this paper, we study the problems of principal Generalized Eigenvector computation and Canonical Correlation Analysis in the stochastic setting. We propose a simple and efficient algorithm, Gen-Oja, for these problems. We prove the global convergence of our algorithm, borrowing ideas from the theory of fast-mixing Markov chains and two-time-scale stochastic approximation, showing that it achieves the optimal rate of convergence. In the process, we develop tools for understanding stochastic processes with Markovian noise which might be of independent interest.

artificial intelligence, machine learning, rt 1, (17 more...)

arXiv.org Machine Learning

1811.08393

Genre:

Workflow (0.67)
Research Report (0.49)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)

Add feedback

Urban Driving with Multi-Objective Deep Reinforcement Learning

Li, Changjian, Czarnecki, Krzysztof

arXiv.org Artificial IntelligenceNov-20-2018

Autonomous driving is a challenging domain that entails multiple aspects: a vehicle should be able to drive to its destination as fast as possible while avoiding collision, obeying traffic rules and ensuring the comfort of passengers. In this paper, we present a deep learning variant of thresholded lexicographic Q-learning for the task of urban driving. Our multi-objective DQN agent learns to drive on multi-lane roads and intersections, yielding and changing lanes according to traffic rules. We also propose an extension for factored Markov Decision Processes to the DQN architecture that provides auxiliary features for the Q function. This is shown to significantly improve data efficiency. We then show that the learned policy is able to zero-shot transfer to a ring road without sacrificing performance. To our knowledge, this is the first reinforcement learning based autonomous driving agent in literature that can handle multi-lane intersections with traffic rules.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

arXiv.org Artificial Intelligence

1811.08586

Country:

North America > Canada (0.93)
North America > United States > New York (0.14)

Genre: Research Report (0.51)

Industry: Transportation > Ground > Road (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

Add feedback