AITopics | Markov Models

We study sequential decision-making problems in which each agent aims to maximize the expected total reward while satisfying a constraint on the expected total utility. We employ the natural policy gradient method to solve discounted infinite-horizon Constrained Markov Decision Processes (CMDPs). Specifically, we propose a new Natural Policy Gradient Primal-Dual (NPG-PD) method for CMDPs that updates the primal variable via natural policy gradient ascent and the dual variable via projected subgradient descent.
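
The primal-dual update described above can be illustrated on a tiny tabular problem. The following is a minimal sketch, not the authors' implementation. It assumes a softmax policy, for which the natural policy gradient step is commonly written as a multiplicative-weights update on the advantage of the Lagrangian reward r + λg. It also assumes exact policy evaluation by fixed-point iteration and a constraint of the form V_g(ρ) ≥ b. The names `evaluate`, `npg_pd`, `eta_pi`, `eta_lam`, `lam_max`, and the two-state toy CMDP are all hypothetical choices made for illustration.

```python
import math

def evaluate(P, reward, policy, gamma, sweeps=200):
    """Iterative policy evaluation on a tabular MDP; returns (V, Q)."""
    n_s, n_a = len(P), len(P[0])
    V = [0.0] * n_s
    for _ in range(sweeps):
        Q = [[reward[s][a] + gamma * sum(P[s][a][t] * V[t] for t in range(n_s))
              for a in range(n_a)] for s in range(n_s)]
        V = [sum(policy[s][a] * Q[s][a] for a in range(n_a)) for s in range(n_s)]
    return V, Q

def npg_pd(P, r, g, b, rho, gamma=0.9, eta_pi=0.1, eta_lam=0.1,
           lam_max=10.0, iters=200):
    """Sketch of a primal-dual loop for: max V_r(rho) s.t. V_g(rho) >= b."""
    n_s, n_a = len(P), len(P[0])
    policy = [[1.0 / n_a] * n_a for _ in range(n_s)]  # uniform start
    lam = 0.0                                         # dual variable
    for _ in range(iters):
        # Lagrangian reward combines reward and utility through lam.
        lagr = [[r[s][a] + lam * g[s][a] for a in range(n_a)] for s in range(n_s)]
        V_L, Q_L = evaluate(P, lagr, policy, gamma)
        V_g, _ = evaluate(P, g, policy, gamma)
        # Primal step (assumed softmax form): reweight by exp(step * advantage).
        new_policy = []
        for s in range(n_s):
            w = [policy[s][a] * math.exp(eta_pi / (1 - gamma) * (Q_L[s][a] - V_L[s]))
                 for a in range(n_a)]
            z = sum(w)
            new_policy.append([x / z for x in w])
        # Dual step: subgradient descent on lam, projected onto [0, lam_max].
        vg_rho = sum(rho[s] * V_g[s] for s in range(n_s))
        lam = min(lam_max, max(0.0, lam - eta_lam * (vg_rho - b)))
        policy = new_policy
    return policy, lam

# Hypothetical toy CMDP: action a moves deterministically to state a.
# Action 0 earns reward, action 1 earns utility.
P = [[[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]]]
r = [[1.0, 0.0], [1.0, 0.0]]
g = [[0.0, 1.0], [0.0, 1.0]]
rho = [1.0, 0.0]
policy, lam = npg_pd(P, r, g, b=5.0, rho=rho)
print(policy, lam)
```

When the threshold b is set to 0 in this toy problem, the constraint is never violated, the dual variable stays at zero, and the loop reduces to unconstrained natural policy gradient on the reward alone.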

artificial intelligence, machine learning, reinforcement learning

Neural Information Processing Systems

Country:

North America > United States > California (0.14)
North America > United States > Illinois (0.04)
Asia > Middle East > Jordan (0.04)

Industry:

Health & Medicine (0.93)
Government (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.61)

NAS-χ: Neural Adaptive Smoothing via Twisting

Neural Information Processing Systems | Feb-8-2026, 12:54:58 GMT

Work performed while at Stanford University.

artificial intelligence, machine learning, particle

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
Asia > Middle East > Jordan (0.04)
North America > United States > California > San Diego County > San Diego (0.04)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Finite-Time Analysis of Round-Robin Kullback-Leibler Upper Confidence Bounds for Optimal Adaptive Allocation with Multiple Plays and Markovian Rewards

Neural Information Processing Systems | Feb-8-2026, 12:46:35 GMT

For our analysis we devise several concentration results for Markov chains, including a maximal inequality for Markov chains, that may be of interest in their own right. As a byproduct of our analysis we also establish asymptotically optimal, finite-time guarantees for the case of multiple plays, and i.i.d.

artificial intelligence, data mining, machine learning

Country:

Europe > Hungary > Budapest > Budapest (0.04)
North America > United States > New Jersey > Hudson County > Hoboken (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.36)
Information Technology > Data Science > Data Mining > Big Data (0.31)

58c54802a9fb9526cd0923353a34a7ae-Paper.pdf

Neural Information Processing Systems | Feb-8-2026, 12:35:25 GMT

ctfp, stochastic process, wiener process

Country:

Asia > China > Beijing > Beijing (0.05)
North America > Canada (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)

Temporally Disentangled Representation Learning under Unknown Nonstationarity (Xiangchen Song)

Neural Information Processing Systems | Feb-8-2026, 10:57:07 GMT

However, in the nonstationary setting, existing work has only partially addressed the problem, either by utilizing observed auxiliary variables (e.g., class labels and/or domain indexes) as side information or by assuming simplified latent causal dynamics. Both constrain the method to a limited range of scenarios.

artificial intelligence, machine learning, transition matrix

Country: