AITopics | fmdp

Collaborating Authors

fmdp

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

5c936263f3428a40227908d5a3847c0b-Paper.pdf

Neural Information Processing SystemsApr-26-2026, 02:38:17 GMT

algorithm, artificial intelligence, machine learning, (15 more...)

Neural Information Processing Systems

Country:

North America (0.46)
Europe (0.28)
Asia > Middle East > Israel (0.14)

Genre: Workflow (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

d3b1fb02964aa64e257f9f26a31f72cf-Paper.pdf

Neural Information Processing SystemsFeb-19-2026, 07:24:22 GMT

algorithm, fmdp, mdp, (13 more...)

Neural Information Processing Systems

Country:

North America > United States > Michigan (0.04)
Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)
North America > Canada (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)

Add feedback

Towards Minimax Optimal Reinforcement Learning in Factored Markov Decision Processes Yi Tian

Neural Information Processing SystemsFeb-10-2026, 21:32:26 GMT

We study minimax optimal reinforcement learning in episodic factored Markov decision processes (FMDPs), which are MDPs with conditionally independent transition components.

factored structure, machine learning, reinforcement learning, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > Canada (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.85)

Add feedback

Oracle-EfficientRegretMinimizationinFactored MDPswithUnknownStructure

Neural Information Processing SystemsFeb-8-2026, 21:29:18 GMT

The state of an FMDP is composed ofdcomponents, calledfactors, and each component is determined by only motherfactors, called itsscope.

algorithm, artificial intelligence, machine learning, (17 more...)

Neural Information Processing Systems

Country:

North America > United States (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)
Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Towards Minimax Optimal Reinforcement Learning in Factored Markov Decision Processes

Neural Information Processing SystemsDec-24-2025, 19:52:11 GMT

We study minimax optimal reinforcement learning in episodic factored Markov decision processes (FMDPs), which are MDPs with conditionally independent transition components. Assuming the factorization is known, we propose two model-based algorithms. The first one achieves minimax optimal regret guarantees for a rich class of factored structures, while the second one enjoys better computational complexity with a slightly worse regret. A key new ingredient of our algorithms is the design of a bonus term to guide exploration. We complement our algorithms by presenting several structure dependent lower bounds on regret for FMDPs that reveal the difficulty hiding in the intricacy of the structures.

factored markov decision process, minimax optimal reinforcement learning, name change, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.33)

Add feedback

Reinforcement Learning in Factored MDPs: Oracle-Efficient Algorithms and Tighter Regret Bounds for the Non-Episodic Setting

Neural Information Processing SystemsDec-24-2025, 16:41:46 GMT

We study reinforcement learning in non-episodic factored Markov decision processes (FMDPs). We propose two near-optimal and oracle-efficient algorithms for FMDPs. Assuming oracle access to an FMDP planner, they enjoy a Bayesian and a frequentist regret bound respectively, both of which reduce to the near-optimal bound $O(DS\sqrt{AT})$ for standard non-factored MDPs. We propose a tighter connectivity measure, factored span, for FMDPs and prove a lower bound that depends on the factored span rather than the diameter $D$. In order to decrease the gap between lower and upper bounds, we propose an adaptation of the REGAL.C algorithm whose regret bound depends on the factored span. Our oracle-efficient algorithms outperform previously proposed near-optimal algorithms on computer network administration simulations.

algorithm and tighter regret bound, oracle-efficient algorithm, reinforcement learning, (7 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.32)

Add feedback

Oracle-Efficient Regret Minimization in Factored MDPs with Unknown Structure

Neural Information Processing SystemsDec-24-2025, 04:22:24 GMT

We study regret minimization in non-episodic factored Markov decision processes (FMDPs), where all existing algorithms make the strong assumption that the factored structure of the FMDP is known to the learner in advance. In this paper, we provide the first algorithm that learns the structure of the FMDP while minimizing the regret. Our algorithm is based on the optimism in face of uncertainty principle, combined with a simple statistical method for structure learning, and can be implemented efficiently given oracle-access to an FMDP planner. Moreover, we give a variant of our algorithm that remains efficient even when the oracle is limited to non-factored actions, which is the case with almost all existing approximate planners. Finally, we leverage our techniques to prove a novel lower bound for the known structure case, closing the gap to the regret bound of Chen et al. [2021].

factored mdp, name change, oracle-efficient regret minimization, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.61)

Add feedback

Towards Minimax Optimal Reinforcement Learning in Factored Markov Decision Processes

Neural Information Processing SystemsAug-17-2025, 01:18:58 GMT

We study minimax optimal reinforcement learning in episodic factored Markov decision processes (FMDPs), which are MDPs with conditionally independent transition components.

algorithm, factored structure, fmdp, (13 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > Canada (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.85)

Add feedback

e61eaa38aed621dd776d0e67cfeee366-AuthorFeedback.pdf

Neural Information Processing SystemsAug-17-2025, 01:18:47 GMT

exploration, factorization, transition component, (14 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (0.31)

Add feedback

A Proof of Theorem 1 and 2

Neural Information Processing SystemsAug-16-2025, 14:51:50 GMT

Before proving the theorems, we first introduce the confidence set for both transition probability and reward functions. We follow the standard regret analysis framework by Jaksch et al. (2010). Using Hoeffding's inequality, the regret caused by (6) can be upper bounded by For PSRL, since we use fixed episodes, we follow the techniques from Osband et al. (2013) For DORL, we need to prove optimism, i.e, We follow the proof in Agrawal and Jia (2017). We fix some s S and let x = ( s,π (s)) X .

artificial intelligence, mdp, probability, (17 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (0.34)

Add feedback