AITopics | Markov Models

Collaborating Authors

Markov Models

News Overviews Instructional Materials AI-Alerts Classics

Fitting Sparse Markov Models to Categorical Time Series Using Regularization

Majumder, Tuhin, Lahiri, Soumendra, Martin, Donald

arXiv.org Machine LearningFeb-11-2022

The major problem of fitting a higher order Markov model is the exponentially growing number of parameters. The most popular approach is to use a Variable Length Markov Chain (VLMC), which determines relevant contexts (recent pasts) of variable orders and form a context tree. A more general approach is called Sparse Markov Model (SMM), where all possible histories of order $m$ form a partition so that the transition probability vectors are identical for the histories belonging to a particular group. We develop an elegant method of fitting SMM using convex clustering, which involves regularization. The regularization parameter is selected using BIC criterion. Theoretical results demonstrate the model selection consistency of our method for large sample size. Extensive simulation studies under different set-up have been presented to measure the performance of our method. We apply this method to classify genome sequences, obtained from individuals affected by different viruses.

probability, probability vector, sequence, (13 more...)

arXiv.org Machine Learning

2202.05485

Country:

North America > United States > North Carolina (0.04)
Europe > Sweden > Östergötland County > Linköping (0.04)

Genre: Research Report > New Finding (0.88)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Add feedback

Uncertainty Aware System Identification with Universal Policies

Semage, Buddhika Laknath, Karimpanal, Thommen George, Rana, Santu, Venkatesh, Svetha

arXiv.org Artificial IntelligenceFeb-11-2022

Sim2real transfer is primarily concerned with transferring policies trained in simulation to potentially noisy real world environments. A common problem associated with sim2real transfer is estimating the real-world environmental parameters to ground the simulated environment to. Although existing methods such as Domain Randomisation (DR) can produce robust policies by sampling from a distribution of parameters during training, there is no established method for identifying the parameters of the corresponding distribution for a given real-world setting. In this work, we propose Uncertainty-aware policy search (UncAPS), where we use Universal Policy Network (UPN) to store simulation-trained task-specific policies across the full range of environmental parameters and then subsequently employ robust Bayesian optimisation to craft robust policies for the given environment by combining relevant UPN policies in a DR like fashion. Such policy-driven grounding is expected to be more efficient as it estimates only task-relevant sets of parameters. Further, we also account for the estimation uncertainties in the search process to produce policies that are robust against both aleatoric and epistemic uncertainties. We empirically evaluate our approach in a range of noisy, continuous control environments, and show its improved performance compared to competing baselines.

international conference, policy search, simulation parameter, (14 more...)

arXiv.org Artificial Intelligence

2202.05844

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > Germany > Baden-Württemberg > Freiburg (0.04)
Oceania > Australia (0.04)
(3 more...)

Genre: Research Report (0.82)

Industry: Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Add feedback

Answer Set Planning: A Survey

Son, Tran Cao, Pontelli, Enrico, Balduccini, Marcello, Schaub, Torsten

arXiv.org Artificial IntelligenceFeb-11-2022

Answer Set Planning refers to the use of Answer Set Programming (ASP) to compute plans, i.e., solutions to planning problems, that transform a given state of the world to another state. The development of efficient and scalable answer set solvers has provided a significant boost to the development of ASP-based planning systems. This paper surveys the progress made during the last two and a half decades in the area of answer set planning, from its foundations to its use in challenging planning domains. The survey explores the advantages and disadvantages of answer set planning. It also discusses typical applications of answer set planning and presents a set of challenges for future research.

artificial intelligence, international conference, proceedings, (13 more...)

arXiv.org Artificial Intelligence

2202.05793

Country:

Europe > Germany > Brandenburg > Potsdam (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > New Mexico (0.04)
(5 more...)

Genre:

Research Report (1.00)
Overview (1.00)

Industry: Transportation (0.92)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
(4 more...)

Add feedback

Online Decision Transformer

Zheng, Qinqing, Zhang, Amy, Grover, Aditya

arXiv.org Artificial IntelligenceFeb-11-2022

Generative pretraining for sequence modeling has emerged as a unifying paradigm for machine learning in a number of domains and modalities, notably in language and vision (Radford et al., 2018; Chen et al., 2020; Brown et al., 2020; Lu et al., 2022). Recently, such a pretraining paradigm has been extended to offline reinforcement learning (RL) (Chen et al., 2021; Janner et al., 2021), wherein an agent is trained to autoregressively maximize the likelihood of trajectories in the offline dataset. During training, this paradigm essentially converts offline RL to a supervised learning problem (Schmidhuber, 2019; Srivastava et al., 2019; Emmons et al., 2021). However, these works present an incomplete picture as policies learned via offline RL are limited by the quality of the training dataset and need to be finetuned to the task of interest via online interactions. It remains an open question whether such supervised learning paradigm can be extended to online settings. Unlike language and perception, online finetuning for RL is fundamentally different from the pretraining phase as it involves data acquisition via exploration. The need for exploration renders traditional supervised learning objectives (e.g., mean squared error) for offline RL insufficient in the online setting. Moreover, it has been observed that for standard online algorithms, access to offline data can often have zero or even negative effect on the online performance (Nair et al., 2020). Hence, the overall pipeline for offline pretraining followed by online finetuning for RL policies needs a careful consideration of training objectives and protocols.

odt, online, trajectory, (13 more...)

arXiv.org Artificial Intelligence

2202.05607

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Indiana (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry: Education > Educational Setting > Online (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.75)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Add feedback

Active Privacy-Utility Trade-off Against Inference in Time-Series Data Sharing

Erdemir, Ecenaz, Dragotti, Pier Luigi, Gunduz, Deniz

arXiv.org Machine LearningFeb-11-2022

Internet of things (IoT) devices, such as smart meters, smart speakers and activity monitors, have become highly popular thanks to the services they offer. However, in addition to their many benefits, they raise privacy concerns since they share fine-grained time-series user data with untrusted third parties. In this work, we consider a user releasing her data containing personal information in return of a service from an honest-but-curious service provider (SP). We model user's personal information as two correlated random variables (r.v.'s), one of them, called the secret variable, is to be kept private, while the other, called the useful variable, is to be disclosed for utility. We consider active sequential data release, where at each time step the user chooses from among a finite set of release mechanisms, each revealing some information about the user's personal information, i.e., the true values of the r.v.'s, albeit with different statistics. The user manages data release in an online fashion such that the maximum amount of information is revealed about the latent useful variable as quickly as possible, while the confidence for the sensitive variable is kept below a predefined level. For privacy measure, we consider both the probability of correctly detecting the true value of the secret and the mutual information (MI) between the secret and the released data. We formulate both problems as partially observable Markov decision processes (POMDPs), and numerically solve them by advantage actor-critic (A2C) deep reinforcement learning (DRL). We evaluate the privacy-utility trade-off (PUT) of the proposed policies on both the synthetic data and smoking activity dataset, and show their validity by testing the activity detection accuracy of the SP modeled by a long short-term memory (LSTM) neural network.

accuracy, information, privacy, (15 more...)

arXiv.org Machine Learning

2202.05833

Country:

North America > United States (0.04)
Europe > United Kingdom (0.04)
Europe > Netherlands > South Holland > Delft (0.04)
(2 more...)

Genre: Research Report (0.50)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Add feedback

Long-Time Convergence and Propagation of Chaos for Nonlinear MCMC

Vuckovic, James

arXiv.org Machine LearningFeb-11-2022

In this paper, we study the long-time convergence and uniform strong propagation of chaos for a class of nonlinear Markov chains for Markov chain Monte Carlo (MCMC). Our technique is quite simple, making use of recent contraction estimates for linear Markov kernels and basic techniques from Markov theory and analysis. Moreover, the same proof strategy applies to both the long-time convergence and propagation of chaos. We also show, via some experiments, that these nonlinear MCMC techniques are viable for use in real-world high-dimensional inference such as Bayesian neural networks.

algorithm, markov chain, propagation, (16 more...)

arXiv.org Machine Learning

2202.05621

Country: North America > United States > New York (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.56)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Add feedback

FABRIC: A Framework for the Design and Evaluation of Collaborative Robots with Extended Human Adaptation

Görür, O. Can, Rosman, Benjamin, Sivrikaya, Fikret, Albayrak, Sahin

arXiv.org Artificial IntelligenceFeb-11-2022

A limitation for collaborative robots (cobots) is their lack of ability to adapt to human partners, who typically exhibit an immense diversity of behaviors. We present an autonomous framework as a cobot's real-time decision-making mechanism to anticipate a variety of human characteristics and behaviors, including human errors, toward a personalized collaboration. Our framework handles such behaviors in two levels: 1) short-term human behaviors are adapted through our novel Anticipatory Partially Observable Markov Decision Process (A-POMDP) models, covering a human's changing intent (motivation), availability, and capability; 2) long-term changing human characteristics are adapted by our novel Adaptive Bayesian Policy Selection (ABPS) mechanism that selects a short-term decision model, e.g., an A-POMDP, according to an estimate of a human's workplace characteristics, such as her expertise and collaboration preferences. To design and evaluate our framework over a diversity of human behaviors, we propose a pipeline where we first train and rigorously test the framework in simulation over novel human models. Then, we deploy and evaluate it on our novel physical experiment setup that induces cognitive load on humans to observe their dynamic behaviors, including their mistakes, and their changing characteristics such as their expertise. We conduct user studies and show that our framework effectively collaborates non-stop for hours and adapts to various changing human behaviors and characteristics in real-time. That increases the efficiency and naturalness of the collaboration with a higher perceived collaboration, positive teammate traits, and human trust. We believe that such an extended human adaptation is key to the long-term use of cobots.

artificial intelligence, machine learning, real time system, (21 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3585276

2104.01976

Country:

North America > United States (0.28)
Europe > Germany (0.28)

Genre:

Research Report > New Finding (1.00)
Questionnaire & Opinion Survey (1.00)
Overview (1.00)

Industry: Health & Medicine (0.92)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Architecture > Real Time Systems (1.00)

Add feedback

Learning Temporal Rules from Noisy Timeseries Data

Samel, Karan, Zhao, Zelin, Chen, Binghong, Li, Shuang, Subramanian, Dharmashankar, Essa, Irfan, Song, Le

arXiv.org Artificial IntelligenceFeb-10-2022

Events across a timeline are a common data representation, seen in different temporal modalities. Individual atomic events can occur in a certain temporal ordering to compose higher level composite events. Examples of a composite event are a patient's medical symptom or a baseball player hitting a home run, caused distinct temporal orderings of patient vitals and player movements respectively. Such salient composite events are provided as labels in temporal datasets and most works optimize models to predict these composite event labels directly. We focus on uncovering the underlying atomic events and their relations that lead to the composite events within a noisy temporal data setting. We propose Neural Temporal Logic Programming (Neural TLP) which first learns implicit temporal relations between atomic events and then lifts logic rules for composite events, given only the composite events labels for supervision. This is done through efficiently searching through the combinatorial space of all temporal logic rules in an end-to-end differentiable manner. We evaluate our method on video and healthcare datasets where it outperforms the baseline methods for rule discovery.

artificial intelligence, logic & formal reasoning, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2202.05403

Country: Asia > China > Guangdong Province > Shenzhen (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine (1.00)
Leisure & Entertainment > Sports > Baseball (0.68)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
(2 more...)

Add feedback

Learning in Restless Bandits under Exogenous Global Markov Process

Gafni, Tomer, Yemini, Michal, Cohen, Kobi

arXiv.org Artificial IntelligenceFeb-10-2022

We consider an extension to the restless multi-armed bandit (RMAB) problem with unknown arm dynamics, where an unknown exogenous global Markov process governs the rewards distribution of each arm. Under each global state, the rewards process of each arm evolves according to an unknown Markovian rule, which is non-identical among different arms. At each time, a player chooses an arm out of $N$ arms to play, and receives a random reward from a finite set of reward states. The arms are restless, that is, their local state evolves regardless of the player's actions. Motivated by recent studies on related RMAB settings, the regret is defined as the reward loss with respect to a player that knows the dynamics of the problem, and plays at each time $t$ the arm that maximizes the expected immediate value. The objective is to develop an arm-selection policy that minimizes the regret. To that end, we develop the Learning under Exogenous Markov Process (LEMP) algorithm. We analyze LEMP theoretically and establish a finite-sample bound on the regret. We show that LEMP achieves a logarithmic regret order with time. We further analyze LEMP numerically and present simulation results that support the theoretical findings and demonstrate that LEMP significantly outperforms alternative algorithms.

artificial intelligence, machine learning, probability, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/TSP.2022.3224790

2112.09484

Country:

North America > United States > New Jersey > Mercer County > Princeton (0.04)
North America > Canada (0.04)
Asia > Middle East > Israel > Southern District > Beer-Sheva (0.04)

Genre: Research Report (0.84)

Industry: Energy (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Add feedback

Migrating Techniques from Search-based Multi-Agent Path Finding Solvers to SAT-based Approach

Surynek, Pavel, Stern, Roni, Boyarski, Eli, Felner, Ariel

Journal of Artificial Intelligence ResearchFeb-10-2022

In the multi-agent path finding problem (MAPF) we are given a set of agents each with respective start and goal positions. The task is to find paths for all agents while avoiding collisions, aiming to minimize a given objective function. Many MAPF solvers were introduced in the past decade for optimizing two specific objective functions: sum-of-costs and makespan. Two prominent categories of solvers can be distinguished: search-based solvers and compilation-based solvers. Search-based solvers were developed and tested for the sum-of-costs objective, while the most prominent compilation-based solvers that are built around Boolean satisfiability (SAT) were designed for the makespan objective. Very little is known on the performance and relevance of solvers from the compilation-based approach on the sum-of-costs objective. In this paper, we start to close the gap between these cost functions in the compilation-based approach. Our main contribution is a new SAT-based MAPF solver called MDD-SAT, that is directly aimed to optimally solve the MAPF problem under the sum-of-costs objective function. Using both a lower bound on the sum-of-costs and an upper bound on the makespan, MDD-SAT is able to generate a reasonable number of Boolean variables in our SAT encoding. We then further improve the encoding by borrowing ideas from ICTS, a search-based solver. In addition, we show that concepts applicable in search-based solvers like ICTS and ICBS are applicable in the SAT-based approach as well. Specifically, we integrate independence detection, a generic technique for decomposing an MAPF instance into independent subproblems, into our SAT-based approach, and we design a relaxation of our optimal SAT-based solver that results in a bounded suboptimal SAT-based solver. Experimental evaluation on several domains shows that there are many scenarios where our SAT-based methods outperform state-of-the-art sum-of-costs search-based solvers, such as variants of the ICTS and ICBS algorithms.

agent, algorithm, solver, (14 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.1.13318

AI Access Foundation

13318

Journal of Artificial Intelligence Research

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Europe > Slovenia > Central Slovenia > Municipality of Komenda > Komenda (0.04)
(26 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.67)

Industry:

Energy (0.45)
Leisure & Entertainment > Games (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Add feedback