We thank all reviewers for spending their valuable time reviewing our paper. We now answer some specific questions in detail.

The definition of "equivalent subMDPs" (Definition 2) requires that (1) there is a bijection between state spaces and (2) through which the subMDPs have the same transition/reward models. As discussed in the paper, (2) can be relaxed to similar transition/reward models. For the statistical efficiency results, this assumption could be relaxed further; however, that is beyond the scope of this paper and we aim to address it in future work. We will add a more explicit discussion of the comparison to Mann et al. (2015). Theorem 1 in this paper is partially motivated by Osband et al. (2013); however, we consider a very different setting. Specifically, (1) Theorem 1 considers hierarchical structure while Osband et al.
Review for NeurIPS paper: On Efficiency in Hierarchical Reinforcement Learning
The justification relies on Bayesian regret bounds to show that hierarchical decomposition can lead to statistically efficient learning: by comparing the bounds for the "flat" MDP to those for the decomposed MDP, the paper derives the following conditions for beneficial decomposition — either the subMDPs must all have a small state space, or the original MDP can be decomposed into a small number of equivalent subMDPs. The paper then goes on to discuss the computational complexity of planning with hierarchical structures via the Planning with Exit Profiles (PEP) algorithm. The authors derive a bound on the computational complexity of PEP, which leads to the following requirements for efficient planning: all subMDPs must be small, with a small number of exit profiles and total exit states. Finally, the paper presents a bound on the performance of the PEP algorithm and discusses the conditions for finding high-quality exit profiles, a requirement for the near-optimal performance of PEP.
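As a stylized numerical illustration of the trade-off described above (these are hypothetical scalings chosen for exposition, not the paper's actual bounds), suppose the flat regret scales with the full state space $S$ while the hierarchical regret scales with the number of distinct equivalent subMDP classes times the largest subMDP state space:

```python
import math

# Hypothetical scaling sketch (NOT the paper's exact bounds): assume the
# flat-MDP regret grows like sqrt(S * A * T), while the hierarchical regret
# grows like sqrt(M * S_sub * A * T), where M is the number of *distinct*
# equivalent subMDP classes and S_sub is the largest subMDP state space.

def flat_regret(S, A, T):
    """Stylized flat-MDP regret: scales with the full state space S."""
    return math.sqrt(S * A * T)

def hier_regret(M, S_sub, A, T):
    """Stylized hierarchical regret: scales with M * S_sub, not S."""
    return math.sqrt(M * S_sub * A * T)

# Illustrative numbers: 10,000 states split into 100 subMDPs of 100 states,
# of which only 5 are distinct up to equivalence.
S, A, T = 10_000, 4, 1_000_000
M, S_sub = 5, 100

print(flat_regret(S, A, T))         # scales with S = 10,000
print(hier_regret(M, S_sub, A, T))  # scales with M * S_sub = 500 << S
```

The sketch makes the review's point concrete: the hierarchical bound improves on the flat one exactly when the subMDPs are small and few of them are distinct, i.e. when $M \cdot S_{\text{sub}} \ll S$.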
Quoting from the reviewers: R1: The paper presents a novel framework for analyzing potential efficiencies in reinforcement learning due to hierarchical structure in MDPs. This framework formally defines several useful concepts (subMDPs, equivalent subMDPs, exit states and exit profiles) that allow for an elegant refinement of regret bounds in a well-defined regime. The identification of particular properties (subMDPs, exit state set, and equivalence of subMDPs) provides a clear and useful framework for theoretical analysis of hierarchical reinforcement learning. Overall this paper provides an elegant, concrete framework for formalizing hierarchical structure and quantifying the efficiency such structure may allow. The paper provides a theoretical analysis of hierarchical reinforcement learning, deriving results on learning and planning efficiency when the reinforcement learning problem has repeated structure.
Demystifying Linear MDPs and Novel Dynamics Aggregation Framework
In this work, we prove that, in linear MDPs, the feature dimension $d$ is lower bounded by $S/U$ in order to aptly represent transition probabilities, where $S$ is the size of the state space and $U$ is the maximum size of directly reachable states. Hence, $d$ can still scale with $S$ depending on the direct reachability of the environment. To address this limitation of linear MDPs, we propose a novel structural aggregation framework based on dynamics, termed "dynamics aggregation". For this newly proposed framework, we design a provably efficient hierarchical reinforcement learning algorithm with linear function approximation that leverages aggregated sub-structures. Our algorithm exhibits statistical efficiency, achieving a regret of $ \tilde{O} ( d_{\psi}^{3/2} H^{3/2}\sqrt{ N T} )$, where $d_{\psi}$ denotes the feature dimension of the aggregated subMDPs and $N$ the number of aggregated subMDPs. We establish that the condition $d_{\psi}^3 N \ll d^{3}$ is readily met in most real-world environments with hierarchical structures, enabling a substantial improvement in the regret bound over LSVI-UCB, which enjoys a regret of $ \tilde{O} (d^{3/2} H^{3/2} \sqrt{ T})$. To the best of our knowledge, this work presents the first HRL algorithm with linear function approximation that offers provable guarantees.
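To see how easily the condition $d_{\psi}^3 N \ll d^{3}$ can hold, here is a quick numerical sanity check; all dimensions below are hypothetical and chosen only for illustration, not taken from the paper:

```python
# Hypothetical numbers illustrating the condition d_psi^3 * N << d^3
# (none of these dimensions come from the paper itself).

def regret_ratio(d_psi, N, d):
    """Ratio of the dimension-dependent factor in the aggregated bound,
    d_psi^{3/2} * sqrt(N), to LSVI-UCB's factor d^{3/2}.
    A value below 1 means the aggregated bound is tighter."""
    return (d_psi ** 1.5) * (N ** 0.5) / (d ** 1.5)

# Suppose the flat feature dimension is d = 1000, but the environment
# splits into N = 10 aggregated subMDPs, each representable with
# d_psi = 50 features.
d, d_psi, N = 1000, 50, 10

print(d_psi ** 3 * N, "vs", d ** 3)  # 1,250,000 vs 1,000,000,000
print(regret_ratio(d_psi, N, d))     # well below 1
```

Under these (assumed) dimensions, $d_{\psi}^3 N$ is smaller than $d^3$ by roughly three orders of magnitude, which is the regime in which the aggregated regret bound dominates the LSVI-UCB bound.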