AITopics | Markov Models

Collaborating Authors

Markov Models

News Overviews Instructional Materials AI-Alerts Classics

Sampling Through the Lens of Sequential Decision Making

Dou, Jason Xiaotian, Pan, Alvin Qingkai, Bao, Runxue, Mao, Haiyi Harry, Luo, Lei, Mao, Zhi-Hong

arXiv.org Artificial IntelligenceDec-13-2022

Sampling is ubiquitous in machine learning methodologies. Due to the growth of large datasets and model complexity, we want to learn and adapt the sampling process while training a representation. Towards achieving this grand goal, a variety of sampling techniques have been proposed. However, most of them either use a fixed sampling scheme or adjust the sampling scheme based on simple heuristics. They cannot choose the best sample for model training in different stages. Inspired by "Think, Fast and Slow" (System 1 and System 2) in cognitive science, we propose a reward-guided sampling strategy called Adaptive Sample with Reward (ASR) to tackle this challenge. To the best of our knowledge, this is the first work utilizing reinforcement learning (RL) to address the sampling problem in representation learning. Our approach optimally adjusts the sampling process to achieve optimal performance. We explore geographical relationships among samples by distance-based sampling to maximize overall cumulative reward. We apply ASR to the long-standing sampling problems in similarity-based loss functions. Empirical results in information retrieval and clustering demonstrate ASR's superb performance across different datasets. We also discuss an engrossing phenomenon which we name as "ASR gravity well" in experiments.

machine learning, natural language, reinforcement learning, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.13140/RG.2.2.21905.92008

2208.08056

Country:

Asia > Middle East > Jordan (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > Washington > King County > Seattle (0.04)
(3 more...)

Genre: Research Report (0.50)

Industry: Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Add feedback

Categorical Tools for Natural Language Processing

de Felice, Giovanni

arXiv.org Artificial IntelligenceDec-13-2022

This thesis develops the translation between category theory and computational linguistics as a foundation for natural language processing. The three chapters deal with syntax, semantics and pragmatics. First, string diagrams provide a unified model of syntactic structures in formal grammars. Second, functors compute semantics by turning diagrams into logical, tensor, neural or quantum computation. Third, the resulting functorial models can be composed to form games where equilibria are the solutions of language processing tasks. This framework is implemented as part of DisCoPy, the Python library for computing with string diagrams. We describe the correspondence between categorical, linguistic and computational structures, and demonstrate their applications in compositional natural language processing.

logic & formal reasoning, machine learning, natural language, (25 more...)

arXiv.org Artificial Intelligence

2212.06636

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.13)
North America > United States > California > San Francisco County > San Francisco (0.13)
(33 more...)

Genre:

Overview (0.92)
Summary/Review (0.67)
Instructional Material > Course Syllabus & Notes (0.67)
Research Report (0.63)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
(5 more...)

Add feedback

Reinforcement Learning and Tree Search Methods for the Unit Commitment Problem

de Mars, Patrick

arXiv.org Artificial IntelligenceDec-12-2022

The unit commitment (UC) problem, which determines operating schedules of generation units to meet demand, is a fundamental task in power systems operation. Existing UC methods using mixed-integer programming are not well-suited to highly stochastic systems. Approaches which more rigorously account for uncertainty could yield large reductions in operating costs by reducing spinning reserve requirements; operating power stations at higher efficiencies; and integrating greater volumes of variable renewables. A promising approach to solving the UC problem is reinforcement learning (RL), a methodology for optimal decision-making which has been used to conquer long-standing grand challenges in artificial intelligence. This thesis explores the application of RL to the UC problem and addresses challenges including robustness under uncertainty; generalisability across multiple problem instances; and scaling to larger power systems than previously studied. To tackle these issues, we develop guided tree search, a novel methodology combining model-free RL and model-based planning. The UC problem is formalised as a Markov decision process and we develop an open-source environment based on real data from Great Britain's power system to train RL agents. In problems of up to 100 generators, guided tree search is shown to be competitive with deterministic UC methods, reducing operating costs by up to 1.4\%. An advantage of RL is that the framework can be easily extended to incorporate considerations important to power systems operators such as robustness to generator failure, wind curtailment or carbon prices. When generator outages are considered, guided tree search saves over 2\% in operating costs as compared with methods using conventional $N-x$ reserve criteria.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

arXiv.org Artificial Intelligence

2212.06001

Genre:

Research Report > Promising Solution (1.00)
Research Report > New Finding (1.00)
Overview (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.66)

Add feedback

Sequential Density Estimation via Nonlinear Continuous Weighted Finite Automata

Li, Tianyu, Mazoure, Bogdan, Rabusseau, Guillaume

arXiv.org Artificial IntelligenceDec-12-2022

Weighted finite automata (WFAs) have been widely applied in many fields. One of the classic problems for WFAs is probability distribution estimation over sequences of discrete symbols. Although WFAs have been extended to deal with continuous input data, namely continuous WFAs (CWFAs), it is still unclear how to approximate density functions over sequences of continuous random variables using WFA-based models, due to the limitation on the expressiveness of the model as well as the tractability of approximating density functions via CWFAs. In this paper, we propose a nonlinear extension to the CWFA model to first improve its expressiveness, we refer to it as the nonlinear continuous WFAs (NCWFAs). Then we leverage the so-called RNADE method, which is a well-known density estimator based on neural networks, and propose the RNADE-NCWFA model. The RNADE-NCWFA model computes a density function by design. We show that this model is strictly more expressive than the Gaussian HMM model, which CWFA cannot approximate. Empirically, we conduct a synthetic experiment using Gaussian HMM generated data. We focus on evaluating the model's ability to estimate densities for sequences of varying lengths (longer length than the training data). We observe that our model performs the best among the compared baseline methods.

artificial intelligence, machine learning, sequence, (13 more...)

arXiv.org Artificial Intelligence

2206.03923

Country:

North America > Canada > Quebec > Montreal (0.14)
Africa > Senegal > Kolda Region > Kolda (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)

Add feedback

A Review of Off-Policy Evaluation in Reinforcement Learning

Uehara, Masatoshi, Shi, Chengchun, Kallus, Nathan

arXiv.org Artificial IntelligenceDec-12-2022

Reinforcement learning (RL) is one of the most vibrant research frontiers in machine learning and has been recently applied to solve a number of challenging problems. In this paper, we primarily focus on off-policy evaluation (OPE), one of the most fundamental topics in RL. In recent years, a number of OPE methods have been developed in the statistics and computer science literature. We provide a discussion on the efficiency bound of OPE, some of the existing state-of-the-art OPE methods, their statistical properties and some other related research directions that are currently actively explored.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

arXiv.org Artificial Intelligence

2212.06355

Country:

North America > United States > New York > New York County > New York City (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > Washington > King County > Seattle (0.04)
(5 more...)

Genre:

Research Report > Experimental Study (1.00)
Research Report > Strength High (0.92)

Industry: Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)

Add feedback

Targeted Adversarial Attacks on Deep Reinforcement Learning Policies via Model Checking

Gross, Dennis, Simao, Thiago D., Jansen, Nils, Perez, Guillermo A.

arXiv.org Artificial IntelligenceDec-10-2022

Deep Reinforcement Learning (RL) agents are susceptible to adversarial noise in their observations that can mislead their policies and decrease their performance. However, an adversary may be interested not only in decreasing the reward, but also in modifying specific temporal logic properties of the policy. This paper presents a metric that measures the exact impact of adversarial attacks against such properties. We use this metric to craft optimal adversarial attacks. Furthermore, we introduce a model checking method that allows us to verify the robustness of RL policies against adversarial attacks. Our empirical analysis confirms (1) the quality of our metric to craft adversarial attacks against temporal logic properties, and (2) that we are able to concisely assess a system's robustness against attacks.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

arXiv.org Artificial Intelligence

2212.05337

Country:

Europe > Netherlands > Gelderland > Nijmegen (0.04)
Europe > Belgium > Flanders > Antwerp Province > Antwerp (0.04)

Genre: Research Report (0.50)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Add feedback

Neural Continuous-Time Markov Models

Reeves, Majerle, Bhat, Harish S.

arXiv.org Artificial IntelligenceDec-10-2022

Continuous-time Markov chains are used to model stochastic systems where transitions can occur at irregular times, e.g., birth-death processes, chemical reaction networks, population dynamics, and gene regulatory networks. We develop a method to learn a continuous-time Markov chain's transition rate functions from fully observed time series. In contrast with existing methods, our method allows for transition rates to depend nonlinearly on both state variables and external covariates. The Gillespie algorithm is used to generate trajectories of stochastic systems where propensity functions (reaction rates) are known. Our method can be viewed as the inverse: given trajectories of a stochastic reaction network, we generate estimates of the propensity functions. While previous methods used linear or log-linear methods to link transition rates to covariates, we use neural networks, increasing the capacity and potential accuracy of learned models. In the chemical context, this enables the method to learn propensity functions from non-mass-action kinetics. We test our method with synthetic data generated from a variety of systems with known transition rates. We show that our method learns these transition rates with considerably more accuracy than log-linear methods, in terms of mean absolute error between ground truth and predicted transition rates. We also demonstrate an application of our methods to open-loop control of a continuous-time Markov chain.

artificial intelligence, machine learning, propensity function, (14 more...)

arXiv.org Artificial Intelligence

2212.05378

Country:

North America > United States > California > Merced County > Merced (0.14)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Oregon > Multnomah County > Portland (0.04)

Genre: Research Report (0.64)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Add feedback

Reducing Collision Risk in Multi-Agent Path Planning: Application to Air traffic Management

Li, Sarah H. Q., Mittal, Avi, Garoche, Pierre-Loïc, Açıkmeşe, null, Behçet, null

arXiv.org Artificial IntelligenceDec-10-2022

To minimize collision risks in the multi-agent path planning problem with stochastic transition dynamics, we formulate a Markov decision process congestion game with a multi-linear congestion cost. Players within the game complete individual tasks while minimizing their own collision risks. We show that the set of Nash equilibria coincides with the first-order KKT points of a non-convex optimization problem. Our game is applied to a historical flight plan over France to reduce collision risks between commercial aircraft.

artificial intelligence, collision risk, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2212.04122

Country:

North America > United States > Washington > King County > Seattle (0.15)
Europe > France > Occitanie > Haute-Garonne > Toulouse (0.05)
Oceania > New Zealand > North Island > Auckland Region > Auckland (0.04)

Genre: Research Report (0.40)

Industry:

Transportation > Infrastructure & Services (1.00)
Transportation > Air (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.91)

Add feedback

Task-Directed Exploration in Continuous POMDPs for Robotic Manipulation of Articulated Objects

Curtis, Aidan, Kaelbling, Leslie, Jain, Siddarth

arXiv.org Artificial IntelligenceDec-8-2022

Representing and reasoning about uncertainty is crucial for autonomous agents acting in partially observable environments with noisy sensors. Partially observable Markov decision processes (POMDPs) serve as a general framework for representing problems in which uncertainty is an important factor. Online sample-based POMDP methods have emerged as efficient approaches to solving large POMDPs and have been shown to extend to continuous domains. However, these solutions struggle to find long-horizon plans in problems with significant uncertainty. Exploration heuristics can help guide planning, but many real-world settings contain significant task-irrelevant uncertainty that might distract from the task objective. In this paper, we propose STRUG, an online POMDP solver capable of handling domains that require long-horizon planning with significant task-relevant and task-irrelevant uncertainty. We demonstrate our solution on several temporally extended versions of toy POMDP problems as well as robotic manipulation of articulated objects using a neural perception frontend to construct a distribution of possible models. Our results show that STRUG outperforms the current sample-based online POMDP solvers on several tasks.

artificial intelligence, machine learning, pomdp solver, (15 more...)

arXiv.org Artificial Intelligence

2212.04554

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Arizona > Coconino County > Flagstaff (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Add feedback

Monocular Camera and Single-Beam Sonar-Based Underwater Collision-Free Navigation with Domain Randomization

Yang, Pengzhi, Liu, Haowen, Roznere, Monika, Li, Alberto Quattrini

arXiv.org Artificial IntelligenceDec-8-2022

Underwater navigation presents several challenges, including unstructured unknown environments, lack of reliable localization systems (e.g., GPS), and poor visibility. Furthermore, good-quality obstacle detection sensors for underwater robots are scant and costly; and many sensors like RGB-D cameras and LiDAR only work in-air. To enable reliable mapless underwater navigation despite these challenges, we propose a low-cost end-to-end navigation system, based on a monocular camera and a fixed single-beam echo-sounder, that efficiently navigates an underwater robot to waypoints while avoiding nearby obstacles. Our proposed method is based on Proximal Policy Optimization (PPO), which takes as input current relative goal information, estimated depth images, echo-sounder readings, and previous executed actions, and outputs 3D robot actions in a normalized scale. End-to-end training was done in simulation, where we adopted domain randomization (varying underwater conditions and visibility) to learn a robust policy against noise and changes in visibility conditions. The experiments in simulation and real-world demonstrated that our proposed method is successful and resilient in navigating a low-cost underwater robot in unknown underwater environments.

artificial intelligence, machine learning, robot, (17 more...)

arXiv.org Artificial Intelligence

2212.04373

Country:

Atlantic Ocean > Caribbean Sea (0.04)
South America > Uruguay > Maldonado > Maldonado (0.04)
North America > United States > New Hampshire > Grafton County > Hanover (0.04)
Asia > China > Sichuan Province (0.04)

Genre: Research Report (0.50)

Industry: Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Add feedback