AITopics | Markov Models

Collaborating Authors

Markov Models

News Overviews Instructional Materials AI-Alerts Classics

Learning Enhanced Ensemble Filters

Bach, Eviatar, Baptista, Ricardo, Calvello, Edoardo, Chen, Bohan, Stuart, Andrew

arXiv.org Machine LearningApr-24-2025

The filtering distribution in hidden Markov models evolves according to the law of a mean-field model in state--observation space. The ensemble Kalman filter (EnKF) approximates this mean-field model with an ensemble of interacting particles, employing a Gaussian ansatz for the joint distribution of the state and observation at each observation time. These methods are robust, but the Gaussian ansatz limits accuracy. This shortcoming is addressed by approximating the mean-field evolution using a novel form of neural operator taking probability distributions as input: a Measure Neural Mapping (MNM). A MNM is used to design a novel approach to filtering, the MNM-enhanced ensemble filter (MNMEF), which is defined in both the mean-fieldlimit and for interacting ensemble particle approximations. The ensemble approach uses empirical measures as input to the MNM and is implemented using the set transformer, which is invariant to ensemble permutation and allows for different ensemble sizes. The derivation of methods from a mean-field formulation allows a single parameterization of the algorithm to be deployed at different ensemble sizes. In practice fine-tuning of a small number of parameters, for specific ensemble sizes, further enhances the accuracy of the scheme. The promise of the approach is demonstrated by its superior root-mean-square-error performance relative to leading methods in filtering the Lorenz 96 and Kuramoto-Sivashinsky models.

artificial intelligence, ensemble size, machine learning, (18 more...)

arXiv.org Machine Learning

2504.17836

Country:

North America > United States > California (0.04)
North America > United States > New York (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)

Industry:

Government > Regional Government (0.45)
Government > Military (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Learning Energy-Based Generative Models via Potential Flow: A Variational Principle Approach to Probability Density Homotopy Matching

Loo, Junn Yong, Adeline, Michelle, Lau, Julia Kaiwen, Leong, Fang Yu, Tew, Hwa Hui, Pal, Arghya, Baskaran, Vishnu Monn, Ting, Chee-Ming, Phan, Raphaël C. -W.

arXiv.org Artificial IntelligenceApr-24-2025

Energy-based models (EBMs) are a powerful class of probabilistic generative models due to their flexibility and interpretability. However, relationships between potential flows and explicit EBMs remain underexplored, while contrastive divergence training via implicit Markov chain Monte Carlo (MCMC) sampling is often unstable and expensive in high-dimensional settings. In this paper, we propose Variational Potential Flow Bayes (VPFB), a new energy-based generative framework that eliminates the need for implicit MCMC sampling and does not rely on auxiliary networks or cooperative training. VPFB learns an energy-parameterized potential flow by constructing a flow-driven density homotopy that is matched to the data distribution through a variational loss minimizing the Kullback-Leibler divergence between the flow-driven and marginal homotopies. This principled formulation enables robust and efficient generative modeling while preserving the interpretability of EBMs. Experimental results on image generation, interpolation, out-of-distribution detection, and compositional generation confirm the effectiveness of VPFB, showing that our method performs competitively with existing approaches in terms of sample quality and versatility across diverse generative modeling tasks. 1 1 Introduction

artificial intelligence, international conference, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2504.16262

Genre: Research Report > New Finding (0.67)

Industry:

Information Technology (0.67)
Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
(2 more...)

Add feedback

Natural Policy Gradient for Average Reward Non-Stationary RL

Jali, Neharika, Pathak, Eshika, Sharma, Pranay, Qu, Guannan, Joshi, Gauri

arXiv.org Machine LearningApr-23-2025

We consider the problem of non-stationary reinforcement learning (RL) in the infinite-horizon average-reward setting. We model it by a Markov Decision Process with time-varying rewards and transition probabilities, with a variation budget of $\Delta_T$. Existing non-stationary RL algorithms focus on model-based and model-free value-based methods. Policy-based methods despite their flexibility in practice are not theoretically well understood in non-stationary RL. We propose and analyze the first model-free policy-based algorithm, Non-Stationary Natural Actor-Critic (NS-NAC), a policy gradient method with a restart based exploration for change and a novel interpretation of learning rates as adapting factors. Further, we present a bandit-over-RL based parameter-free algorithm BORL-NS-NAC that does not require prior knowledge of the variation budget $\Delta_T$. We present a dynamic regret of $\tilde{\mathscr O}(|S|^{1/2}|A|^{1/2}\Delta_T^{1/6}T^{5/6})$ for both algorithms, where $T$ is the time horizon, and $|S|$, $|A|$ are the sizes of the state and action spaces. The regret analysis leverages a novel adaptation of the Lyapunov function analysis of NAC to dynamic environments and characterizes the effects of simultaneous updates in policy, value function estimate and changes in the environment.

machine learning, natural policy gradient, reinforcement learning, (14 more...)

arXiv.org Machine Learning

2504.16415

Country: North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)

Genre: Research Report (0.40)

Industry: Energy (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.35)

Add feedback

An Extended Horizon Tactical Decision-Making for Automated Driving Based on Monte Carlo Tree Search

Essalmi, Karim, Garrido, Fernando, Nashashibi, Fawzi

arXiv.org Artificial IntelligenceApr-23-2025

-- This paper introduces COR-MCTS (Conservation of Resources - Monte Carlo Tree Search), a novel tactical decision-making approach for automated driving focusing on maneuver planning over extended horizons. Traditional decision-making algorithms are often constrained by fixed planning horizons, typically up to 6 seconds for classical approaches and 3 seconds for learning-based methods limiting their adaptability in particular dynamic driving scenarios. However, planning must be done well in advance in environments such as highways, roundabouts, and exits to ensure safe and efficient maneuvers. T o address this challenge, we propose a hybrid method integrating Monte Carlo Tree Search (MCTS) with our prior utility-based framework, COR-MP (Conservation of Resources Model for Maneuver Planning). This combination enables long-term, real-time decision-making, significantly enhancing the ability to plan a sequence of maneuvers over extended horizons. Through simulations across diverse driving scenarios, we demonstrate that COR-MCTS effectively improves planning robustness and decision efficiency over extended horizons. The deployment of self-driving cars offers numerous benefits, such as improved transportation mobility, enhanced vehicle efficiency in terms of fuel consumption, and better traffic flow management [1], [2]. However, significant challenges remain before fully autonomous vehicles can be integrated into daily life.

artificial intelligence, deep learning, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2504.15869

Country: Europe > France (0.14)

Genre: Research Report (0.82)

Industry:

Transportation > Ground > Road (1.00)
Government > Military (1.00)
Automobiles & Trucks (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Symbolic Runtime Verification and Adaptive Decision-Making for Robot-Assisted Dressing

Rafiq, Yasmin, Vázquez, Gricel, Calinescu, Radu, Dogramadzi, Sanja, Hierons, Robert M

arXiv.org Artificial IntelligenceApr-23-2025

We present a control framework for robot-assisted dressing that augments low-level hazard response with runtime monitoring and formal verification. A parametric discrete-time Markov chain (pDTMC) models the dressing process, while Bayesian inference dynamically updates this pDTMC's transition probabilities based on sensory and user feedback. Safety constraints from hazard analysis are expressed in probabilistic computation tree logic, and symbolically verified using a probabilistic model checker. We evaluate reachability, cost, and reward trade-offs for garment-snag mitigation and escalation, enabling real-time adaptation. Our approach provides a formal yet lightweight foundation for safety-aware, explainable robotic assistance.

artificial intelligence, bayesian inference, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2504.15666

Genre: Research Report (0.64)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)

Add feedback

Accelerating Visual Reinforcement Learning with Separate Primitive Policy for Peg-in-Hole Tasks

Xu, Zichun, Wang, Zhaomin, Li, Yuntao, Zhuang, Lei, Zhao, Zhiyuan, Yang, Guocai, Zhao, Jingdong

arXiv.org Artificial IntelligenceApr-22-2025

For peg-in-hole tasks, humans rely on binocular visual perception to locate the peg above the hole surface and then proceed with insertion. This paper draws insights from this behavior to enable agents to learn efficient assembly strategies through visual reinforcement learning. Hence, we propose a Separate Primitive Policy (S2P) to simultaneously learn how to derive location and insertion actions. S2P is compatible with model-free reinforcement learning algorithms. Ten insertion tasks featuring different polygons are developed as benchmarks for evaluations. Simulation experiments show that S2P can boost the sample efficiency and success rate even with force constraints. Real-world experiments are also performed to verify the feasibility of S2P. Ablations are finally given to discuss the generalizability of S2P and some factors that affect its performance.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

arXiv.org Artificial Intelligence

2504.1482

Country: Asia > China (0.29)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Add feedback

Advanced posterior analyses of hidden Markov models: finite Markov chain imbedding and hybrid decoding

Bæk, Zenia Elise Damgaard, Macià, Moisès Coll, Skov, Laurits, Hobolth, Asger

arXiv.org Machine LearningApr-21-2025

Two major tasks in applications of hidden Markov models are to (i) com pute distributions of summary statistics of the hidden state sequence, and (ii) decode the hidden state sequence. We describe finite Markov chain imbedding (FMCI) and hybrid decoding to solve each of t hese two tasks. In the first part of our paper we use FMCI to compute posterior distributions o f summary statistics such as the number of visits to a hidden state, the total time spent in a hidden st ate, the dwell time in a hidden state, and the longest run length. We use simulations from the hidde n state sequence, conditional on the observed sequence, to establish the FMCI framework. In the second part of our paper we apply hybrid segmentation for improved decoding of a HMM. We demonstra te that hybrid decoding shows increased performance compared to Viterbi or Posterior decodin g (often also referred to as global or local decoding), and we introduce a novel procedure for choosing the tuning parameter in the hybrid procedure. Furthermore, we provide an alternative derivation of the hybrid loss function based on weighted geometric means. We demonstrate and apply FMCI and hyb rid decoding on various classical data sets, and supply accompanying code for reproducibility. Key words: Artemis analysis, decoding, finite Markov chain imbedding, hidden Mar kov model, hybrid decoding, pattern distributions.

artificial intelligence, machine learning, probability, (18 more...)

arXiv.org Machine Learning

2504.15156

Country:

Europe > Estonia > Tartu County > Tartu (0.04)
Europe > Denmark > Capital Region > Copenhagen (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Add feedback

Controlled Territory and Conflict Tracking (CONTACT): (Geo-)Mapping Occupied Territory from Open Source Intelligence

Mandal, Paul K., Leo, Cole, Hurley, Connor

arXiv.org Artificial IntelligenceApr-21-2025

Open-source intelligence provides a stream of unstructured textual data that can inform assessments of territorial control. We present CONTACT, a framework for territorial control prediction using large language models (LLMs) and minimal supervision. We evaluate two approaches: SetFit, an embedding-based few-shot classifier, and a prompt tuning method applied to BLOOMZ-560m, a multilingual generative LLM. Our model is trained on a small hand-labeled dataset of news articles covering ISIS activity in Syria and Iraq, using prompt-conditioned extraction of control-relevant signals such as military operations, casualties, and location references. We show that the BLOOMZ-based model outperforms the SetFit baseline, and that prompt-based supervision improves generalization in low-resource settings. CONTACT demonstrates that LLMs fine-tuned using few-shot methods can reduce annotation burdens and support structured inference from open-ended OSINT streams. Our code is available at https://github.com/PaulKMandal/CONTACT/.

large language model, machine learning, natural language, (13 more...)

arXiv.org Artificial Intelligence

2504.1373

Country:

Asia > Middle East > Syria (0.34)
Asia > Middle East > Iraq (0.34)
North America > United States > Texas (0.28)

Genre: Research Report (1.00)

Industry: Government > Military (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

MSTIM: A MindSpore-Based Model for Traffic Flow Prediction

Qin, Weiqi, Liu, Yuxin, Wu, Dongze, Qin, Zhenkai, Luo, Qining

arXiv.org Artificial IntelligenceApr-21-2025

Aiming at the problems of low accuracy and large error fluctuation of traditional traffic flow predictionmodels when dealing with multi-scale temporal features and dynamic change patterns. this paperproposes a multi-scale time series information modelling model MSTIM based on the Mindspore framework, which integrates long and short-term memory networks (LSTMs), convolutional neural networks (CNN), and the attention mechanism to improve the modelling accuracy and stability. The Metropolitan Interstate Traffic Volume (MITV) dataset was used for the experiments and compared and analysed with typical LSTM-attention models, CNN-attention models and LSTM-CNN models. The experimental results show that the MSTIM model achieves better results in the metrics of Mean Absolute Error (MAE), Mean Square Error (MSE), and Root Mean Square Error (RMSE), which significantly improves the accuracy and stability of the traffic volume prediction.

artificial intelligence, machine learning, prediction, (17 more...)

arXiv.org Artificial Intelligence

2504.13576

Country:

Asia (0.15)
North America (0.14)

Genre: Research Report > New Finding (0.89)

Industry:

Consumer Products & Services > Travel (0.82)
Transportation > Ground > Road (0.68)
Transportation > Infrastructure & Services (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Add feedback

RL-PINNs: Reinforcement Learning-Driven Adaptive Sampling for Efficient Training of PINNs

Song, Zhenao

arXiv.org Artificial IntelligenceApr-18-2025

Physics-Informed Neural Networks (PINNs) have emerged as a powerful framework for solving partial differential equations (PDEs). However, their performance heavily relies on the strategy used to select training points. Conventional adaptive sampling methods, such as residual-based refinement, often require multi-round sampling and repeated retraining of PINNs, leading to computational inefficiency due to redundant points and costly gradient computations-particularly in high-dimensional or high-order derivative scenarios. To address these limitations, we propose RL-PINNs, a reinforcement learning(RL)-driven adaptive sampling framework that enables efficient training with only a single round of sampling. Our approach formulates adaptive sampling as a Markov decision process, where an RL agent dynamically selects optimal training points by maximizing a long-term utility metric. Critically, we replace gradient-dependent residual metrics with a computationally efficient function variation as the reward signal, eliminating the overhead of derivative calculations. Furthermore, we employ a delayed reward mechanism to prioritize long-term training stability over short-term gains. Extensive experiments across diverse PDE benchmarks, including low-regular, nonlinear, high-dimensional, and high-order problems, demonstrate that RL-PINNs significantly outperforms existing residual-driven adaptive methods in accuracy. Notably, RL-PINNs achieve this with negligible sampling overhead, making them scalable to high-dimensional and high-order problems.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

arXiv.org Artificial Intelligence

2504.12949

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)

Add feedback