AITopics | Markov Models

Contrastive Active Inference

Neural Information Processing Systems, Aug-15-2025, 04:56:12 GMT

In contrast, reinforcement learning requires human-designed rewards to accomplish any desired outcome.

agent, inference, trajectory, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > Washington > King County > Seattle (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Europe > Italy > Sardinia (0.04)
(2 more...)

Genre: Research Report > New Finding (0.68)

Industry: Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
(3 more...)

Provably Efficient Reinforcement Learning with Linear Function Approximation under Adaptivity Constraints

Neural Information Processing Systems, Aug-15-2025, 03:12:18 GMT

Real-world reinforcement learning (RL) applications often come with possibly infinite state and action spaces, and in such situations classical RL algorithms developed for the tabular setting are no longer applicable. A popular approach to overcoming this issue is to apply function approximation techniques to the underlying structures of the Markov decision processes (MDPs).

algorithm, batch, rare policy switch model, (12 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.28)
North America > United States > Connecticut > New Haven County > New Haven (0.04)

Genre: Research Report > New Finding (0.68)

Industry:

Education (0.47)
Health & Medicine (0.35)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (0.63)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)

Bayesian Optimistic Optimization: Optimistic

Neural Information Processing Systems, Aug-15-2025, 02:35:22 GMT

In this paper, we consider RL in Markov decision processes (MDPs), where the agent observes the state of the environment at each timestep and makes decisions accordingly.

algorithm, neural information processing system, optimization, (11 more...)

Neural Information Processing Systems

Country:

North America > United States (0.14)
Asia > China > Jiangsu Province > Nanjing (0.04)
Europe > United Kingdom > England > Surrey > Guildford (0.04)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

A Compositional Atlas of Tractable Circuit Operations for Probabilistic Inference

Neural Information Processing Systems, Aug-15-2025, 01:38:49 GMT

Circuit representations are becoming the lingua franca to express and reason about tractable generative and discriminative models.

algorithm, decomposable circuit, query, (16 more...)

Neural Information Processing Systems

Country:

Europe > Middle East > Malta > Port Region > Southern Harbour District > Floriana (0.04)
Asia > Middle East > Jordan (0.04)
North America > United States > California (0.04)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Restless-UCB, an Efficient and Low-complexity Algorithm for Online Restless Bandits

Neural Information Processing Systems, Aug-15-2025, 01:02:11 GMT

As a result, our analysis technique can also be adopted to tighten the regret bounds of existing algorithms.

bandit problem, restless bandit problem, restless-ucb, (15 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Jordan (0.04)
North America > Canada (0.04)
Asia > China > Jiangsu Province > Nanjing (0.04)
Asia > China > Hong Kong (0.04)

Genre: Research Report (0.93)

Industry:

Education (0.46)
Telecommunications (0.46)
Health & Medicine > Pharmaceuticals & Biotechnology (0.43)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Communications > Networks (0.93)
Information Technology > Data Science > Data Mining > Big Data (0.30)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.30)

Restless-UCB, an Efficient and Low-complexity Algorithm for Online Restless Bandits

Neural Information Processing Systems, Aug-15-2025, 01:02:04 GMT

As a result, our analysis technique can also be adopted to tighten the regret bounds of existing algorithms.

bandit problem, restless bandit problem, restless-ucb, (15 more...)

Neural Information Processing Systems

Country:

North America > Canada (0.04)
Asia > China > Jiangsu Province > Nanjing (0.04)
Asia > China > Hong Kong (0.04)

Genre: Research Report (0.93)

Industry:

Education (0.46)
Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Communications (0.93)
Information Technology > Data Science > Data Mining > Big Data (0.32)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.32)

Non-Stationary Restless Multi-Armed Bandits with Provable Guarantee

Hung, Yu-Heng, Hsieh, Ping-Chun, Wang, Kai

arXiv.org Artificial Intelligence, Aug-15-2025

Online restless multi-armed bandits (RMABs) typically assume that each arm follows a stationary Markov Decision Process (MDP) with fixed state transitions and rewards. However, in real-world applications like healthcare and recommendation systems, these assumptions often break down because of non-stationary dynamics, posing significant challenges for traditional RMAB algorithms. In this work, we consider $N$-armed RMABs with non-stationary transitions constrained by a bounded variation budget $B$. The proposed algorithm integrates sliding-window reinforcement learning (RL) with an upper confidence bound (UCB) mechanism to learn the transition dynamics and their variations simultaneously. We further establish that the algorithm achieves an $\widetilde{\mathcal{O}}(N^2 B^{\frac{1}{4}} T^{\frac{3}{4}})$ regret bound under a relaxed definition of regret, providing the first foundational theoretical framework for non-stationary RMAB problems.
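The general idea of pairing a sliding window with a UCB-style confidence radius can be sketched as follows. This is an illustrative sketch only, not the algorithm from the paper: the class name, the binary next-state assumption, the `window` and `delta` parameters, and the Hoeffding-style radius are all assumptions made for the example.

```python
import math
from collections import deque
from typing import Tuple


class SlidingWindowTransitionEstimator:
    """Estimate P(next_state = 1 | state, action) for one arm from recent samples only.

    Hypothetical helper for illustration; assumes binary next states.
    """

    def __init__(self, window: int, delta: float = 0.05):
        self.delta = delta  # assumed confidence parameter
        # A bounded deque discards the oldest sample once `window` is exceeded,
        # so stale observations from an earlier regime stop influencing the estimate.
        self.history = deque(maxlen=window)

    def update(self, state: int, action: int, next_state: int) -> None:
        self.history.append((state, action, next_state))

    def estimate(self, state: int, action: int) -> Tuple[float, float]:
        """Return (empirical mean, confidence radius) for the given pair."""
        outcomes = [ns for s, a, ns in self.history if s == state and a == action]
        n = len(outcomes)
        if n == 0:
            return 0.5, 1.0  # no data in the window: maximal uncertainty
        mean = sum(outcomes) / n
        # Hoeffding-style radius; shrinks as more in-window samples arrive.
        radius = math.sqrt(math.log(2.0 / self.delta) / (2.0 * n))
        return mean, min(radius, 1.0)

    def optimistic(self, state: int, action: int) -> float:
        """Upper confidence bound on the transition probability, clipped to 1."""
        mean, radius = self.estimate(state, action)
        return min(1.0, mean + radius)
```

A shorter window reacts faster to changes in the dynamics but leaves fewer samples per state-action pair, which widens the confidence radius; that trade-off is what a variation budget is meant to control.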

data mining, machine learning, reinforcement learning, (20 more...)

arXiv.org Artificial Intelligence

2508.10804

Genre: Research Report (0.40)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.85)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.48)

Nonlocal Monte Carlo via Reinforcement Learning

Dobrynin, Dmitrii, Mohseni, Masoud, Strachan, John Paul

arXiv.org Artificial Intelligence, Aug-15-2025

Optimizing or sampling complex cost functions of combinatorial optimization problems is a longstanding challenge across disciplines and applications. When employing the family of conventional algorithms based on Markov Chain Monte Carlo (MCMC), such as simulated annealing or parallel tempering, one assumes homogeneous (equilibrium) temperature profiles across the input. This instance-independent approach was shown to be ineffective for the hardest benchmarks near a computational phase transition, where the so-called overlap gap property holds. In these regimes, conventional MCMC struggles to unfreeze rigid variables, escape suboptimal basins of attraction, and sample high-quality and diverse solutions. To mitigate these challenges, Nonequilibrium Nonlocal Monte Carlo (NMC) algorithms were proposed that leverage inhomogeneous temperature profiles, thereby accelerating exploration of the configuration space without compromising its exploitation. Here, we employ deep reinforcement learning (RL) to train the nonlocal transition policies of NMC, which were previously designed phenomenologically. We demonstrate that the resulting solver can be trained solely by observing energy changes during configuration space exploration as RL rewards and the local minimum energy landscape geometry as RL states. We further show that the trained policies improve upon standard MCMC-based and nonlocal simulated annealing on hard uniform random and scale-free random 4-SAT benchmarks in terms of residual energy, time-to-solution, and diversity of solutions.
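For reference, the conventional baseline mentioned above, simulated annealing with a single homogeneous temperature applied to every variable, can be sketched as follows. This is a generic textbook-style sketch, not the NMC or RL method from the paper; the clause encoding, the geometric cooling schedule, and all parameter names are assumptions chosen for the example.

```python
import math
import random


def energy(assignment, clauses):
    """Count unsatisfied clauses; each literal is (variable index, required value)."""
    return sum(
        not any(assignment[var] == val for var, val in clause)
        for clause in clauses
    )


def simulated_annealing(clauses, n_vars, steps=2000, t_start=2.0, t_end=0.05, seed=0):
    """Single-flip Metropolis dynamics with one temperature shared by all variables."""
    rng = random.Random(seed)
    x = [rng.random() < 0.5 for _ in range(n_vars)]
    e = energy(x, clauses)
    best_x, best_e = list(x), e
    for step in range(steps):
        # Geometric cooling from t_start to t_end (an assumed schedule).
        t = t_start * (t_end / t_start) ** (step / max(steps - 1, 1))
        i = rng.randrange(n_vars)
        x[i] = not x[i]  # propose a local move: flip one variable
        e_new = energy(x, clauses)
        # Metropolis rule: always accept downhill, accept uphill with prob exp(-dE/T).
        if e_new <= e or rng.random() < math.exp(-(e_new - e) / t):
            e = e_new
            if e < best_e:
                best_x, best_e = list(x), e
        else:
            x[i] = not x[i]  # reject: undo the flip
    return best_x, best_e
```

Because every variable sees the same temperature, rigid (frozen) variables are only flipped when the global temperature is high enough, which is the limitation that inhomogeneous temperature profiles aim to address.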

machine learning, optimization, reinforcement learning, (18 more...)

arXiv.org Artificial Intelligence

2508.1052

Country: North America > United States (1.00)

Genre: Research Report (1.00)

Industry:

Education (0.67)
Government > Regional Government > North America Government > United States Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(3 more...)

Learning State-Space Models of Dynamic Systems from Arbitrary Data using Joint Embedding Predictive Architectures

Ulmen, Jonas, Sundaram, Ganesh, Görges, Daniel

arXiv.org Artificial Intelligence, Aug-15-2025

With the advent of Joint Embedding Predictive Architectures (JEPAs), which appear to be more capable than reconstruction-based methods, this paper introduces a novel technique for creating world models using continuous-time dynamic systems from arbitrary observation data. The proposed method integrates sequence embeddings with neural ordinary differential equations (neural ODEs). It employs loss functions that enforce contractive embeddings and Lipschitz constants in state transitions to construct a well-organized latent state space. The approach's effectiveness is demonstrated through the generation of structured latent state-space models for a simple pendulum system using only image data. This opens up a new technique for developing more general control algorithms and estimation techniques with broad applications in robotics.

artificial intelligence, machine learning, sequence, (10 more...)

arXiv.org Artificial Intelligence

2508.10489

Country: Europe > Germany (0.28)

Genre: Research Report > Promising Solution (0.34)

Technology: