AITopics | Markov Models

Collaborating Authors

Markov Models

News Overviews Instructional Materials AI-Alerts Classics

Predictive Attractor Models

Neural Information Processing SystemsOct-10-2025, 03:58:56 GMT

The task of sequential memory is considered challenging for models operating under biological constraints (i.e., local synaptic computations) for many reasons, including catastrophic forgetting,

experiment, representation, sequence, (17 more...)

Neural Information Processing Systems

Country:

Africa > Mali (0.04)
North America > United States > Florida > Hillsborough County > Tampa (0.04)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)

Industry:

Health & Medicine > Therapeutic Area > Neurology (0.68)
Education (0.68)
Leisure & Entertainment (0.67)
Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(5 more...)

Add feedback

Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments

Neural Information Processing SystemsOct-10-2025, 03:51:45 GMT

While humans accomplish 72.4% of the tasks, the best

agent, arxiv preprint arxiv, pyautogui, (14 more...)

Neural Information Processing Systems

Country:

Asia > China > Hong Kong (0.04)
Asia > Japan > Honshū > Chūbu > Toyama Prefecture > Toyama (0.04)

Genre: Instructional Material > Course Syllabus & Notes (0.46)

Industry:

Law (1.00)
Education > Educational Setting > Online (0.93)
Information Technology > Software (0.69)
Education > Educational Technology (0.67)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Communications > Mobile (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
(11 more...)

Add feedback

Predicting Future Actions of Reinforcement Learning Agents

Neural Information Processing SystemsOct-10-2025, 03:41:28 GMT

Furthermore, using internal plans proves more robust to model quality compared to simulation-based approaches when predicting actions, while the results for event prediction are more mixed.

agent, prediction, world model, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Hampshire County > Amherst (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Industry:

Transportation (0.46)
Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Add feedback

Efficient Recurrent Off-Policy RL Requires a Context-Encoder-Specific Learning Rate Fan-Ming Luo 1,2 Zuolin Tu

Neural Information Processing SystemsOct-10-2025, 03:06:44 GMT

Recent progress has demonstrated that recurrent reinforcement learning (RL), which consists of a context encoder based on recurrent neural networks (RNNs) for unobservable state prediction and a multilayer perceptron (MLP) policy for decision making, can mitigate partial observability and serve as a robust baseline for POMDP tasks.

context encoder, international conference, resel, (15 more...)

Neural Information Processing Systems

Country:

Europe > Austria > Vienna (0.14)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Oceania > Australia > Queensland > Brisbane (0.04)
(17 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.93)

Industry: Information Technology (0.92)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.69)

Add feedback

Grounded Answers for Multi-agent Decision-making Problem through Generative World Model

Neural Information Processing SystemsOct-10-2025, 02:37:42 GMT

Figure 1: Complex decision problems that require a good understanding of the environment's

international conference, learning, reinforcement learning, (9 more...)

Neural Information Processing Systems

Country:

Asia > China > Shaanxi Province > Xi'an (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > Experimental Study (0.93)

Industry:

Information Technology (1.00)
Leisure & Entertainment > Games > Computer Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Add feedback

Gated Inference Network: Inference and Learning State-Space Models

Neural Information Processing SystemsOct-10-2025, 00:51:38 GMT

This paper advances temporal reasoning within dynamically changing high-dimensional noisy observations, focusing on a latent space that characterizes the nonlinear dynamics of objects in their environment.

experiment, gin, gin filter, (13 more...)

Neural Information Processing Systems

Country:

Asia > South Korea > Seoul > Seoul (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.92)

Add feedback

3ec4f1d5759d6d3d340a66638a52944b-Paper-Conference.pdf

Neural Information Processing SystemsOct-10-2025, 00:13:00 GMT

adversary, algorithm, training curve, (15 more...)

Neural Information Processing Systems

Country:

Europe > France > Occitanie > Haute-Garonne > Toulouse (0.04)
Europe > Portugal > Braga > Braga (0.04)
Europe > France > Île-de-France (0.04)

Genre: Research Report > Experimental Study (0.93)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Add feedback

Wavefunction Flows: Efficient Quantum Simulation of Continuous Flow Models

Layden, David, Sweke, Ryan, Havlíček, Vojtěch, Chowdhury, Anirban, Neklyudov, Kirill

arXiv.org Machine LearningOct-10-2025

Flow models are a cornerstone of modern machine learning. They are generative models that progressively transform probability distributions according to learned dynamics. Specifically, they learn a continuous-time Markov process that efficiently maps samples from a simple source distribution into samples from a complex target distribution. We show that these models are naturally related to the Schrödinger equation, for an unusual Hamiltonian on continuous variables. Moreover, we prove that the dynamics generated by this Hamiltonian can be efficiently simulated on a quantum computer. Together, these results give a quantum algorithm for preparing coherent encodings (a.k.a., qsamples) for a vast family of probability distributions--namely, those expressible by flow models--by reducing the task to an existing classical learning problem, plus Hamiltonian simulation. For statistical problems defined by flow models, such as mean estimation and property testing, this enables the use of quantum algorithms tailored to qsamples, which may offer advantages over classical algorithms based only on samples from a flow model. More broadly, these results reveal a close connection between state-of-the-art machine learning models, such as flow matching and diffusion models, and one of the main expected capabilities of quantum computers: simulating quantum dynamics.

algorithm, equation, flow model, (14 more...)

arXiv.org Machine Learning

2510.08462

Country:

Africa > South Africa (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > Canada > Quebec (0.04)
(5 more...)

Genre:

Research Report (0.63)
Overview (0.45)

Technology:

Information Technology > Hardware (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.92)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.48)

Add feedback

WebAgent-R1: Training Web Agents via End-to-End Multi-Turn Reinforcement Learning

Wei, Zhepei, Yao, Wenlin, Liu, Yao, Zhang, Weizhi, Lu, Qin, Qiu, Liang, Yu, Changlong, Xu, Puyang, Zhang, Chao, Yin, Bing, Yun, Hyokun, Li, Lihong

arXiv.org Artificial IntelligenceOct-10-2025

While reinforcement learning (RL) has demonstrated remarkable success in enhancing large language models (LLMs), it has primarily focused on single-turn tasks such as solving math problems. Training effective web agents for multi-turn interactions remains challenging due to the complexity of long-horizon decision-making across dynamic web interfaces. In this work, we present WebAgent-R1, a simple yet effective end-to-end multi-turn RL framework for training web agents. It learns directly from online interactions with web environments by asynchronously generating diverse trajectories, entirely guided by binary rewards depending on task success. Experiments on the WebArena-Lite benchmark demonstrate the effectiveness of WebAgent-R1, boosting the task success rate of Qwen-2.5-3B from 6.1% to 33.9% and Llama-3.1-8B from 8.5% to 44.8%, significantly outperforming existing state-of-the-art methods and strong proprietary models such as OpenAI o3. In-depth analyses reveal the effectiveness of the thinking-based prompting strategy and test-time scaling through increased interactions for web tasks. We further investigate different RL initialization policies by introducing two variants, namely WebAgent-R1-Zero and WebAgent-R1-CoT, which highlight the importance of the warm-up training stage (i.e., behavior cloning) and provide insights on incorporating long chain-of-thought (CoT) reasoning in web agents.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2505.16421

Genre: Research Report > Promising Solution (0.48)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Add feedback

DexNDM: Closing the Reality Gap for Dexterous In-Hand Rotation via Joint-Wise Neural Dynamics Model

Liu, Xueyi, Wang, He, Yi, Li

arXiv.org Artificial IntelligenceOct-10-2025

Achieving generalized in-hand object rotation remains a significant challenge in robotics, largely due to the difficulty of transferring policies from simulation to the real world. The complex, contact-rich dynamics of dexterous manipulation create a "reality gap" that has limited prior work to constrained scenarios involving simple geometries, limited object sizes and aspect ratios, constrained wrist poses, or customized hands. We address this sim-to-real challenge with a novel framework that enables a single policy, trained in simulation, to generalize to a wide variety of objects and conditions in the real world. The core of our method is a joint-wise dynamics model that learns to bridge the reality gap by effectively fitting limited amount of real-world collected data and then adapting the sim policy's actions accordingly. The model is highly data-efficient and generalizable across different whole-hand interaction distributions by factorizing dynamics across joints, compressing system-wide influences into low-dimensional variables, and learning each joint's evolution from its own dynamic profile, implicitly capturing these net effects. We pair this with a fully autonomous data collection strategy that gathers diverse, real-world interaction data with minimal human intervention. Our complete pipeline demonstrates unprecedented generality: a single policy successfully rotates challenging objects with complex shapes (e.g., animals), high aspect ratios (up to 5.33), and small sizes, all while handling diverse wrist orientations and rotation axes. Comprehensive real-world evaluations and a teleoperation application for complex tasks validate the effectiveness and robustness of our approach. Website: https://meowuu7.github.io/DexNDM/

artificial intelligence, machine learning, trajectory, (15 more...)

arXiv.org Artificial Intelligence

2510.08556

Genre: Research Report > New Finding (0.92)

Industry: Leisure & Entertainment (0.34)

Technology:

Information Technology > Artificial Intelligence > Robots > Manipulation (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.45)

Add feedback