Goto

Collaborating Authors

 Markov Models


Simultaneous Perception-Action Design via Invariant Finite Belief Sets

arXiv.org Artificial Intelligence

Although perception is an increasingly dominant portion of the overall computational cost for autonomous systems, only a fraction of the information perceived is likely to be relevant to the current task. To alleviate these perception costs, we develop a novel simultaneous perception-action design framework wherein an agent senses only the task-relevant information. This formulation differs from that of a partially observable Markov decision process, since the agent is free to synthesize not only its policy for action selection but also its belief-dependent observation function. The method enables the agent to balance its perception costs with those incurred by operating in its environment. To obtain a computationally tractable solution, we approximate the value function using a novel method of invariant finite belief sets, wherein the agent acts exclusively on a finite subset of the continuous belief space. We solve the approximate problem through value iteration in which a linear program is solved individually for each belief state in the set, in each iteration. Finally, we prove that the value functions, under an assumption on their structure, converge to their continuous state-space values as the sample density increases.


Event-Based Communication in Multi-Agent Distributed Q-Learning

arXiv.org Artificial Intelligence

We present in this work an approach to reduce the communication of information needed on a multi-agent learning system inspired by Event Triggered Control (ETC) techniques. We consider a baseline scenario of a distributed Q-learning problem on a Markov Decision Process (MDP). Following an event-based approach, N agents explore the MDP and communicate experiences to a central learner only when necessary, which performs updates of the actor Q functions. We analyse the convergence guarantees retained with respect to a regular Q-learning algorithm, and present experimental results showing that event-based communication results in a substantial reduction of data transmission rates in such distributed systems. Additionally, we discuss what effects (desired and undesired) these event-based approaches have on the learning processes studied, and how they can be applied to more complex multi-agent learning systems.


Accounting for Variations in Speech Emotion Recognition with Nonparametric Hierarchical Neural Network

arXiv.org Artificial Intelligence

In recent years, deep-learning-based speech emotion recognition models have outperformed classical machine learning models. Previously, neural network designs, such as Multitask Learning, have accounted for variations in emotional expressions due to demographic and contextual factors. However, existing models face a few constraints: 1) they rely on a clear definition of domains (e.g. gender, noise condition, etc.) and the availability of domain labels; 2) they often attempt to learn domain-invariant features while emotion expressions can be domain-specific. In the present study, we propose the Nonparametric Hierarchical Neural Network (NHNN), a lightweight hierarchical neural network model based on Bayesian nonparametric clustering. In comparison to Multitask Learning approaches, the proposed model does not require domain/task labels. In our experiments, the NHNN models generally outperform the models with similar levels of complexity and state-of-the-art models in within-corpus and cross-corpus tests. Through clustering analysis, we show that the NHNN models are able to learn group-specific features and bridge the performance gap between groups.


Deep Active Inference for Pixel-Based Discrete Control: Evaluation on the Car Racing Problem

arXiv.org Artificial Intelligence

Despite the potential of active inference for visual-based control, learning the model and the preferences (priors) while interacting with the environment is challenging. Here, we study the performance of a deep active inference (dAIF) agent on OpenAI's car racing benchmark, where there is no access to the car's state. The agent learns to encode the world's state from high-dimensional input through unsupervised representation learning. State inference and control are learned end-to-end by optimizing the expected free energy. Results show that our model achieves comparable performance to deep Q-learning. However, vanilla dAIF does not reach state-of-the-art performance compared to other world model approaches. Hence, we discuss the current model implementation's limitations and potential architectures to overcome them.


Risk-Averse Decision Making Under Uncertainty

arXiv.org Artificial Intelligence

A large class of decision making under uncertainty problems can be described via Markov decision processes (MDPs) or partially observable MDPs (POMDPs), with application to artificial intelligence and operations research, among others. Traditionally, policy synthesis techniques are proposed such that a total expected cost or reward is minimized or maximized. However, optimality in the total expected cost sense is only reasonable if system behavior in the large number of runs is of interest, which has limited the use of such policies in practical mission-critical scenarios, wherein large deviations from the expected behavior may lead to mission failure. In this paper, we consider the problem of designing policies for MDPs and POMDPs with objectives and constraints in terms of dynamic coherent risk measures, which we refer to as the constrained risk-averse problem. For MDPs, we reformulate the problem into a infsup problem via the Lagrangian framework and propose an optimization-based method to synthesize Markovian policies. For MDPs, we demonstrate that the formulated optimization problems are in the form of difference convex programs (DCPs) and can be solved by the disciplined convex-concave programming (DCCP) framework. We show that these results generalize linear programs for constrained MDPs with total discounted expected costs and constraints. For POMDPs, we show that, if the coherent risk measures can be defined as a Markov risk transition mapping, an infinite-dimensional optimization can be used to design Markovian belief-based policies. For stochastic finite-state controllers (FSCs), we show that the latter optimization simplifies to a (finite-dimensional) DCP and can be solved by the DCCP framework. We incorporate these DCPs in a policy iteration algorithm to design risk-averse FSCs for POMDPs.


A Survey of Deep Reinforcement Learning in Recommender Systems: A Systematic Review and Future Directions

arXiv.org Artificial Intelligence

In light of the emergence of deep reinforcement learning (DRL) in recommender systems research and several fruitful results in recent years, this survey aims to provide a timely and comprehensive overview of the recent trends of deep reinforcement learning in recommender systems. We start with the motivation of applying DRL in recommender systems. Then, we provide a taxonomy of current DRL-based recommender systems and a summary of existing methods. We discuss emerging topics and open issues, and provide our perspective on advancing the domain. This survey serves as introductory material for readers from academia and industry into the topic and identifies notable opportunities for further research.


A brief history of AI: how to prevent another winter (a critical review)

arXiv.org Artificial Intelligence

The field of artificial intelligence (AI), regarded as one of the most enigmatic areas of science, has witnessed exponential growth in the past decade including a remarkably wide array of applications, having already impacted our everyday lives. Advances in computing power and the design of sophisticated AI algorithms have enabled computers to outperform humans in a variety of tasks, especially in the areas of computer vision and speech recognition. Yet, AI's path has never been smooth, having essentially fallen apart twice in its lifetime ('winters' of AI), both after periods of popular success ('summers' of AI). We provide a brief rundown of AI's evolution over the course of decades, highlighting its crucial moments and major turning points from inception to the present. In doing so, we attempt to learn, anticipate the future, and discuss what steps may be taken to prevent another 'winter'.


Boltzmann Machine

#artificialintelligence

Training problems: Given a set of binary data vectors, the machine must learn to predict the output vectors with high probability. The first step is to determine which layer connection weights have the lowest cost function values, relative to all the other possible binary vectors. The Boltzmann technique accomplishes this by continuously updating its own weights as each feature is processed, instead of treating the weights as a fixed value.


Distributed Allocation and Scheduling of Tasks with Cross-Schedule Dependencies for Heterogeneous Multi-Robot Teams

arXiv.org Artificial Intelligence

To enable safe and efficient use of multi-robot systems in everyday life, a robust and fast method for coordinating their actions must be developed. In this paper, we present a distributed task allocation and scheduling algorithm for missions where the tasks of different robots are tightly coupled with temporal and precedence constraints. The approach is based on representing the problem as a variant of the vehicle routing problem, and the solution is found using a distributed metaheuristic algorithm based on evolutionary computation (CBM-pop). Such an approach allows a fast and near-optimal allocation and can therefore be used for online replanning in case of task changes. Simulation results show that the approach has better computational speed and scalability without loss of optimality compared to the state-of-the-art distributed methods. An application of the planning procedure to a practical use case of a greenhouse maintained by a multi-robot system is given.


External knowledge transfer deployment inside a simple double agent Viterbi algorithm

arXiv.org Artificial Intelligence

Extracting ingredients from a recipe text is a very common activity especially for data scientists and developers who want to study recipes or want to make statistical representations about nutritive values of cuisine recipes. Ingredients is not the only useful information we want to extract, the quantity used for each ingredient and how they are prepared are also interesting informations that we can extract by the same method presented in this work. Hidden Markov Models are the first idea that came in my mind because there are previous successful works that used this method for information extraction ((Freitag & McCallum, 2000),(Freitag & McCallum, 1999),(Seymore, McCallum, Rosenfeld, et al., 1999),(Bikel, Miller, Schwartz, & Weischedel, 1998),(Leek, 1997)), and also because modeling sequences of words where we have to estimate the hidden state is typically a hidden Markov procedure. In this work we are concentrating on the external knowledge part deployed in what we called a simple double agent Viterbi algorithm.