AITopics | Markov Models

Markov Models

Age of Information-Aware Radio Resource Management in Vehicular Networks: A Proactive Deep Reinforcement Learning Perspective

Chen, Xianfu, Wu, Celimuge, Chen, Tao, Zhang, Honggang, Liu, Zhi, Zhang, Yan, Bennis, Mehdi

arXiv.org Artificial Intelligence, Aug-6-2019

In this paper, we investigate the problem of age of information (AoI)-aware radio resource management for expected long-term performance optimization in a Manhattan grid vehicle-to-vehicle network. With the observation of global network state at each scheduling slot, the roadside unit (RSU) allocates the frequency bands and schedules packet transmissions for all vehicle user equipment-pairs (VUE-pairs). We model the stochastic decision-making procedure as a discrete-time single-agent Markov decision process (MDP). The technical challenges in solving for the optimal control policy originate from the high spatial mobility and temporally varying traffic information arrivals of the VUE-pairs. To make the problem tractable, we first decompose the original MDP into a series of per-VUE-pair MDPs. Then we propose a proactive algorithm based on long short-term memory and deep reinforcement learning techniques to address the partial observability and the curse of high dimensionality in the local network state space faced by each VUE-pair. With the proposed algorithm, the RSU makes the optimal frequency band allocation and packet scheduling decision at each scheduling slot in a decentralized way, in accordance with the partial observations of the global network state at the VUE-pairs. Numerical experiments validate the theoretical analysis and demonstrate the significant performance improvements from the proposed algorithm.
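As a rough illustration of the quantity being optimized, the sketch below tracks age of information over scheduling slots for a single link. It uses the common textbook convention (age grows by one slot, and resets when a fresh packet is delivered); the function names, the `packet_age` parameter, and the delivery pattern are hypothetical and are not taken from the paper's system model.

```python
def step_aoi(age, delivered, packet_age=0):
    """Advance one scheduling slot and return the new age of information.

    Without a delivery, the receiver's information gets one slot older.
    With a delivery, the age resets to the age of the delivered packet
    (slots since it was generated) plus the slot that just elapsed.
    """
    return (packet_age if delivered else age) + 1


def simulate(deliveries, initial_age=1):
    """Return the age after each slot for a sequence of delivery outcomes."""
    ages, age = [], initial_age
    for delivered in deliveries:
        age = step_aoi(age, delivered)
        ages.append(age)
    return ages


# Hypothetical schedule: the link is served in the third and fifth slots.
trace = simulate([False, False, True, False, True])
average_age = sum(trace) / len(trace)
print(trace, average_age)
```

A scheduler that serves a link more often keeps its average age lower, which is the trade-off a resource allocation policy has to balance across links.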

artificial intelligence, machine learning, reinforcement learning, (18 more...)

arXiv.org Artificial Intelligence

1908.02047

Country:

Europe (1.00)
North America > United States > California (0.28)
Asia > Middle East > UAE (0.28)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.66)

Batch Recurrent Q-Learning for Backchannel Generation Towards Engaging Agents

Hussain, Nusrah, Erzin, Engin, Sezgin, T. Metin, Yemez, Yucel

arXiv.org Artificial Intelligence, Aug-6-2019

The ability of an agent to generate appropriate verbal and non-verbal backchannels during human-robot interaction greatly enhances the interaction experience. Backchannels are particularly important in applications like tutoring and counseling, which require constant attention and engagement of the user. We present a method for training a robot for backchannel generation during human-robot interaction within the reinforcement learning (RL) framework, with the goal of maintaining a high engagement level. Since online learning by interaction with a human is highly time-consuming and impractical, we take advantage of a recorded human-to-human dataset and approach our problem as a batch reinforcement learning problem. The dataset is treated as batch data acquired by some behavior policy. We perform experiments with laughs as a backchannel and train an agent with value-based techniques. In particular, we demonstrate the effectiveness of recurrent layers in the approximate value function for this problem, which boost performance in partially observable environments. With off-policy policy evaluation, it is shown that the RL agents are expected to produce more engagement than an agent trained by imitation learning.
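The batch setting described above can be sketched with plain tabular Q-learning that only ever replays a fixed log of transitions and never interacts with the environment. This is a deliberately simplified stand-in: the paper uses recurrent value networks, whereas the states, actions, rewards, and hyperparameters here are hypothetical and chosen only to make the idea concrete.

```python
from collections import defaultdict


def batch_q_learning(transitions, n_actions, gamma=0.9, lr=0.5, sweeps=200):
    """Tabular Q-learning over a fixed dataset of logged transitions.

    transitions: list of (state, action, reward, next_state, done) tuples
    collected by some behavior policy. No new data is gathered.
    """
    Q = defaultdict(lambda: [0.0] * n_actions)
    for _ in range(sweeps):
        for s, a, r, s2, done in transitions:
            # Off-policy target: bootstrap from the best next action.
            target = r if done else r + gamma * max(Q[s2])
            Q[s][a] += lr * (target - Q[s][a])
    return Q


# Hypothetical log. States: "low"/"high" engagement.
# Actions: 0 = stay silent, 1 = produce a laugh backchannel.
LOG = [
    ("low", 0, 0.0, "low", False),
    ("low", 1, 0.0, "high", False),
    ("high", 0, 1.0, "high", False),
    ("high", 1, 1.0, "low", False),
]

Q = batch_q_learning(LOG, n_actions=2)
greedy = {s: max(range(2), key=lambda a: Q[s][a]) for s in ("low", "high")}
print(greedy)
```

Because the update bootstraps from the maximizing action rather than the logged one, the learned greedy policy can differ from the behavior policy that produced the data, which is what makes learning from recordings possible.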

artificial intelligence, machine learning, reinforcement learning, (18 more...)

arXiv.org Artificial Intelligence

1908.02037

Country: Asia (0.28)

Genre: Research Report (0.50)

Industry: Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

Bayesian Incremental Inference Update by Re-using Calculations from Belief Space Planning: A New Paradigm

Farhi, Elad I., Indelman, Vadim

arXiv.org Artificial Intelligence, Aug-6-2019

Inference and decision making under uncertainty are key processes in every autonomous system and in numerous robotic problems. In recent years, the similarities between inference and decision making have triggered much work, from developing unified computational frameworks to pondering the duality between the two. In spite of these efforts, inference and control, as well as inference and belief space planning (BSP), are still treated as two separate processes. In this paper we propose a paradigm shift: a novel approach that deviates from conventional Bayesian inference and utilizes the similarities between inference and BSP. We make the key observation that inference can be efficiently updated using predictions made during the decision-making stage, even in light of inconsistent data association between the two. We developed a two-stage process that implements our novel approach and updates inference using calculations from the precursory planning phase. Using autonomous navigation in an unknown environment along with iSAM2 efficient methodologies as a test case, we benchmarked our novel approach against standard Bayesian inference, with both synthetic and real-world data (KITTI dataset). Results indicate that our approach not only improves running time by at least a factor of two while providing the same estimation accuracy, but also alleviates the computational burden of state dimensionality and loop closures.

artificial intelligence, bayesian inference, machine learning, (18 more...)

arXiv.org Artificial Intelligence

1908.02002

Country: Europe (0.68)

Genre: Research Report > Promising Solution (0.74)

Industry: Aerospace & Defense (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Dialogue Act Classification in Group Chats with DAG-LSTMs

İrsoy, Ozan, Gosangi, Rakesh, Zhang, Haimin, Wei, Mu-Hsin, Lund, Peter, Pappadopulo, Duccio, Fahy, Brendan, Nephytou, Neophytos, Ortiz, Camilo

arXiv.org Machine Learning, Aug-2-2019

Dialogue act (DA) classification has been studied for the past two decades and has several key applications such as workflow automation and conversation analytics. To address this problem, researchers have used various traditional machine learning models and, more recently, deep neural network models such as hierarchical convolutional neural networks (CNNs) and long short-term memory (LSTM) networks. In this paper, we introduce a new model architecture, directed-acyclic-graph LSTM (DAG-LSTM), for DA classification. A DAG-LSTM exploits the turn-taking structure naturally present in a multi-party conversation and encodes this relation in its model structure. Using the STAC corpus, we show that the proposed method performs roughly 0.8% better in accuracy and 1.2% better in macro-F1 score when compared to existing methods. The proposed method is generic and not limited to conversation applications.

artificial intelligence, machine learning, utterance, (15 more...)

arXiv.org Machine Learning

1908.01821

Country: Europe > France (0.16)

Genre:

Overview (0.67)
Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Risk Management via Anomaly Circumvent: Mnemonic Deep Learning for Midterm Stock Prediction

Li, Xinyi, Li, Yinchuan, Liu, Xiao-Yang, Wang, Christina Dan

arXiv.org Machine Learning, Aug-2-2019

Midterm stock price prediction is crucial for value investments in the stock market. However, most deep learning models are essentially short-term and applying them to midterm predictions encounters large cumulative errors because they cannot avoid anomalies. In this paper, we propose a novel deep neural network Mid-LSTM for midterm stock prediction, which incorporates the market trend as hidden states. First, based on the autoregressive moving average model (ARMA), a midterm ARMA is formulated by taking into consideration both hidden states and the capital asset pricing model. Then, a midterm LSTM-based deep neural network is designed, which consists of three components: LSTM, hidden Markov model and linear regression networks. The proposed Mid-LSTM can avoid anomalies to reduce large prediction errors, and has good explanatory effects on the factors affecting stock prices. Extensive experiments on S&P 500 stocks show that (i) the proposed Mid-LSTM achieves 2-4% improvement in prediction accuracy, and (ii) in portfolio allocation investment, we achieve up to 120.16% annual return and 2.99 average Sharpe ratio.

artificial intelligence, machine learning, stock price, (15 more...)

arXiv.org Machine Learning

1908.01112

Country: North America > United States > Alaska (0.16)

Genre: Research Report (1.00)

Industry: Banking & Finance > Trading (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Health-Informed Policy Gradients for Multi-Agent Reinforcement Learning

Allen, Ross E., Bear, Javona White, Gupta, Jayesh K., Kochenderfer, Mykel J.

arXiv.org Artificial Intelligence, Aug-2-2019

This paper proposes a definition of system health in the context of multiple agents optimizing a joint reward function. We use this definition as a credit assignment term in a policy gradient algorithm to distinguish the contributions of individual agents to the global reward. The health-informed credit assignment is then extended to a multi-agent variant of the proximal policy optimization algorithm and demonstrated on simple particle environments that have elements of system health, risk-taking, semi-expendable agents, and partial observability. We show significant improvement in learning performance compared to policy gradient methods that do not perform multi-agent credit assignment.

agent, artificial intelligence, machine learning, (15 more...)

arXiv.org Artificial Intelligence

1908.01022

Country: North America > United States (0.28)

Genre: Research Report (0.50)

Industry:

Health & Medicine > Consumer Health (0.67)
Government (0.47)
Leisure & Entertainment > Games (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.48)

Learning to design from humans: Imitating human designers through deep learning

Raina, Ayush, McComb, Christopher, Cagan, Jonathan

arXiv.org Artificial Intelligence, Aug-2-2019

Humans as designers have quite versatile problem-solving strategies. Computer agents, on the other hand, can access large-scale computational resources to solve certain design problems. Hence, if agents can learn from human behavior, a synergetic human-agent problem-solving team can be created. This paper presents an approach to extract human design strategies and implicit rules purely from historical human data, and to use them for design generation. A two-step framework that learns to imitate human design strategies from observation is proposed and implemented. This framework makes use of deep learning constructs to learn to generate designs without any explicit information about objective and performance metrics. The framework is designed to interact with the problem through a visual interface, as humans did when solving the problem. It is trained to imitate a set of human designers by observing their design state sequences without inducing problem-specific modelling bias or extra information about the problem. Furthermore, an end-to-end agent is developed that uses this deep learning framework as its core, in conjunction with image processing, to map pixels to design moves as a mechanism to generate designs. Finally, the designs generated by a computational team of these agents are compared to actual human data for teams solving a truss design problem. Results demonstrate that these agents are able to create feasible and efficient truss designs without guidance, showing that this methodology allows agents to learn effective design strategies.

artificial intelligence, machine learning, operation, (18 more...)

arXiv.org Artificial Intelligence

doi: 10.1115/1.4044256

1907.11813

Country: North America > United States > Pennsylvania (0.46)

Genre: Research Report > New Finding (0.48)

Industry:

Education (0.67)
Leisure & Entertainment > Games > Computer Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.67)

Optimality and Approximation with Policy Gradient Methods in Markov Decision Processes

Agarwal, Alekh, Kakade, Sham M., Lee, Jason D., Mahajan, Gaurav

arXiv.org Machine Learning, Aug-1-2019

Policy gradient methods are among the most effective methods in challenging reinforcement learning problems with large state and/or action spaces. However, little is known about even their most basic theoretical convergence properties, including: if and how fast they converge to a globally optimal solution (say with a sufficiently rich policy class); how they cope with approximation error due to using a restricted class of parametric policies; or their finite sample behavior. Such characterizations are important not only to compare these methods to their approximate value function counterparts (where such issues are relatively well understood, at least in the worst case) but also to help with more principled approaches to algorithm design. This work provides provable characterizations of computational, approximation, and sample size issues with regard to policy gradient methods in the context of discounted Markov Decision Processes (MDPs). We focus on both: 1) "tabular" policy parameterizations, where the optimal policy is contained in the class and where we show global convergence to the optimal policy, and 2) restricted policy classes, which may not contain the optimal policy and where we provide agnostic learning results. One insight of this work is in formalizing the importance of how a favorable initial state distribution provides a means to circumvent worst-case exploration issues. Overall, these results place policy gradient methods on a solid theoretical footing, analogous to the global convergence guarantees of iterative value function based algorithms.
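The tabular softmax case can be made concrete with a small sketch of exact policy gradient ascent. For a softmax policy, the gradient of the start-state value with respect to each logit is d(s) · π(a|s) · A(s,a) / (1 − γ), where d is the discounted state visitation distribution and A the advantage. The two-state MDP, the step size, and the iteration counts below are illustrative assumptions, not taken from the paper, and the start distribution `MU` is given full support in the spirit of the abstract's remark about favorable initial distributions.

```python
import math

GAMMA = 0.9
# Hypothetical toy MDP with deterministic dynamics.
P = [[0, 1], [0, 1]]          # P[s][a] = next state
R = [[0.0, 0.0], [0.0, 1.0]]  # R[s][a] = reward
MU = [0.5, 0.5]               # start distribution covering every state


def softmax(row):
    m = max(row)
    e = [math.exp(x - m) for x in row]
    z = sum(e)
    return [x / z for x in e]


def evaluate(pi, iters=200):
    """Iterative policy evaluation; returns V and Q for policy pi."""
    V = [0.0, 0.0]
    for _ in range(iters):
        V = [sum(pi[s][a] * (R[s][a] + GAMMA * V[P[s][a]]) for a in range(2))
             for s in range(2)]
    Q = [[R[s][a] + GAMMA * V[P[s][a]] for a in range(2)] for s in range(2)]
    return V, Q


def visitation(pi, iters=200):
    """Discounted visitation d(s) = (1 - gamma) * sum_t gamma^t Pr(s_t = s)."""
    d, p, w = [0.0, 0.0], MU[:], 1.0 - GAMMA
    for _ in range(iters):
        d = [d[s] + w * p[s] for s in range(2)]
        nxt = [0.0, 0.0]
        for s in range(2):
            for a in range(2):
                nxt[P[s][a]] += p[s] * pi[s][a]
        p, w = nxt, w * GAMMA
    return d


def policy_gradient_ascent(steps=300, lr=1.0):
    theta = [[0.0, 0.0], [0.0, 0.0]]  # one logit per (state, action)
    for _ in range(steps):
        pi = [softmax(row) for row in theta]
        V, Q = evaluate(pi)
        d = visitation(pi)
        for s in range(2):
            for a in range(2):
                advantage = Q[s][a] - V[s]
                theta[s][a] += lr * d[s] * pi[s][a] * advantage / (1.0 - GAMMA)
    return [softmax(row) for row in theta]


pi_final = policy_gradient_ascent()
print(pi_final)
```

In this toy problem action 1 is optimal in both states, and the learned policy concentrates on it. Setting `MU` to put zero mass on a state would zero out that state's visitation-weighted gradient whenever the policy never reaches it, which hints at why the initial distribution matters.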

machine learning, parameterization, reinforcement learning, (18 more...)

arXiv.org Machine Learning

1908.00261

Country: North America > United States > California (0.67)

Genre: Research Report (1.00)

Industry: Education (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.84)

Reinforcement Learning for Personalized Dialogue Management

Hengst, Floris den, Hoogendoorn, Mark, van Harmelen, Frank, Bosman, Joost

arXiv.org Artificial Intelligence, Aug-1-2019

Language systems have been of great interest to the research community and have recently reached the mass market through various assistant platforms on the web. Reinforcement Learning methods that optimize dialogue policies have seen successes in past years and have recently been extended into methods that personalize the dialogue, i.e., that take the personal context of users into account. These works, however, are limited to personalization to a single user, with whom they require multiple interactions, and do not generalize the usage of context across users. This work introduces a problem where a generalized usage of context is relevant and proposes two Reinforcement Learning (RL)-based approaches to this problem. The first approach uses a single learner and extends the traditional POMDP formulation of dialogue state with features that describe the user context. The second approach segments users by context and then employs a learner per context. We compare these approaches in a benchmark of existing non-RL and RL-based methods in three established application domains and one novel application domain, financial product recommendation. We compare the influence of context and training experiences on performance and find that learning approaches generally outperform a handcrafted gold standard.

machine learning, natural language, reinforcement learning, (18 more...)

arXiv.org Artificial Intelligence

1908.00286

Genre:

Research Report (1.00)
Overview > Innovation (0.34)

Industry: Banking & Finance (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.69)

Learning When to Drive in Intersections by Combining Reinforcement Learning and Model Predictive Control

Tram, Tommy, Batkovic, Ivo, Ali, Mohammad, Sjöberg, Jonas

arXiv.org Artificial Intelligence, Jul-31-2019

In this paper, we propose a decision-making algorithm intended for automated vehicles that negotiate with other, possibly non-automated, vehicles in intersections. The decision algorithm is separated into two parts: a high-level decision module based on reinforcement learning, and a low-level planning module based on model predictive control. Traffic is simulated with numerous predefined driver behaviors and intentions, and the performance of the proposed decision algorithm was evaluated against another controller. The results show that the proposed decision algorithm yields shorter training episodes and an increased success rate compared to the other controller. Interaction between road users in intersections is a complex problem to solve, making it difficult to address using conventional rule-based systems. Many advancements aim to solve this problem by trying to imitate human drivers [1] or predicting what other drivers in traffic are planning to do [2]. In [3], the authors show that by modeling the decision process as a partially observable Markov decision process, the model can account for uncertainty in sensing the environment, and [4] showed some probabilistic guarantees when solving the problem using reinforcement learning (RL).

ground transportation, intersection, upstream oil & gas, (20 more...)

arXiv.org Artificial Intelligence

1908.00177

Country: Europe > Sweden (0.14)

Genre: Research Report > New Finding (0.86)

Industry:

Transportation (0.95)
Energy > Oil & Gas > Upstream (0.90)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
