Spectral Learning for Infinite-Horizon Average-Reward POMDPs

Jun-17-2026, 02:25:21 GMT–Neural Information Processing Systems

We address the learning problem in infinite-horizon average-reward POMDPs. Traditionally, this problem has been approached with Spectral Decomposition (SD) methods applied to samples collected under non-adaptive policies, such as uniform or round-robin policies. Recently, SD techniques have been extended to a restricted class of adaptive policies, such as memoryless policies. However, using adaptive policies leads to data inefficiency, because SD methods typically require all samples to be drawn from a single policy. In this work, we propose Mixed Spectral Estimation, which generalizes spectral estimation techniques to support a broader class of belief-based policies.
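For readers unfamiliar with belief-based policies, the following minimal sketch shows the standard Bayes belief update that such policies condition on. It is not the paper's Mixed Spectral Estimation procedure. The function name `belief_update`, the table layouts `T` and `O`, and the two-state numbers are all illustrative assumptions, not taken from the paper.

```python
from typing import List

def belief_update(
    belief: List[float],
    action: int,
    observation: int,
    T: List[List[List[float]]],
    O: List[List[List[float]]],
) -> List[float]:
    """One step of the standard POMDP belief (Bayes filter) update.

    Assumed layouts (hypothetical, chosen for this sketch):
      T[a][s][s2] = P(s2 | s, a)   transition model
      O[a][s2][o] = P(o | s2, a)   observation model
    """
    n = len(belief)
    # Prediction: push the current belief through the transition model.
    predicted = [
        sum(belief[s] * T[action][s][s2] for s in range(n))
        for s2 in range(n)
    ]
    # Correction: weight each next state by the likelihood of the observation.
    unnormalized = [predicted[s2] * O[action][s2][observation] for s2 in range(n)]
    z = sum(unnormalized)
    if z == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return [u / z for u in unnormalized]

# Toy model with 2 states, 1 action, 2 observations (made-up numbers).
T = [[[0.9, 0.1], [0.2, 0.8]]]
O = [[[0.8, 0.2], [0.3, 0.7]]]
b0 = [0.5, 0.5]
b1 = belief_update(b0, action=0, observation=0, T=T, O=O)
print(b1)
```

A belief-based policy maps the vector returned here to an action, whereas a memoryless policy maps only the latest observation to an action. In the learning setting, `T` and `O` are unknown and would be replaced by estimates.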

Country:
- North America > United States
  - New York (0.28)
- Europe > United Kingdom
  - England (0.27)

Genre:
- Research Report > Experimental Study (1.00)
- Workflow (0.67)

Industry:
- Education (0.48)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
