Reinforcement Learning with Action-Triggered Observations
Alexander Ryabchenko, Wenlong Mou
We study reinforcement learning problems where state observations are stochastically triggered by actions, a constraint common in many real-world applications. We formulate this setting as Action-Triggered Sporadically Traceable Markov Decision Processes (ATST-MDPs), where each action has a specified probability of triggering a state observation. We derive Bellman optimality equations tailored to this framework and introduce the action-sequence learning paradigm, in which the agent commits to executing a sequence of actions until the next observation arrives. Under the linear MDP assumption, value functions are shown to admit linear representations in an induced action-sequence feature map. Leveraging this structure, we propose off-policy estimators for such feature maps with statistical error guarantees, and we introduce ST-LSVI-UCB, a variant of LSVI-UCB adapted to action-triggered settings. ST-LSVI-UCB achieves regret $\widetilde O(\sqrt{K d^3 (1-\gamma)^{-3}})$, where $K$ is the number of episodes, $d$ is the feature dimension, and $\gamma$ is the discount factor (the per-step episode non-termination probability). Crucially, this work establishes a theoretical foundation for learning with sporadic, action-triggered observations while demonstrating that efficient learning remains feasible under such observation constraints.
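To make the interaction protocol concrete, below is a minimal sketch of the action-sequence paradigm described in the abstract: the agent commits to a sequence of actions and executes it without intermediate feedback until an action stochastically triggers an observation. The environment `ToyChainEnv`, its `step`/`observe` interface, and the planner `plan_action_sequence` are illustrative assumptions for this sketch, not APIs from the paper.

```python
import random

class ToyChainEnv:
    """Toy 5-state chain environment (illustrative, not from the paper)."""
    def __init__(self, n_states=5):
        self.n_states = n_states
        self.state = 0

    def step(self, action):
        """Apply a +1/-1 move and return only the reward.

        The resulting state stays hidden unless an observation triggers.
        """
        self.state = max(0, min(self.n_states - 1, self.state + action))
        return 1.0 if self.state == self.n_states - 1 else 0.0

    def observe(self):
        return self.state


def rollout_until_observation(env, state, plan_action_sequence,
                              trigger_prob, gamma=0.99):
    """Execute a committed action sequence until an observation triggers."""
    actions = plan_action_sequence(state)       # commit to the whole sequence
    total_reward, discount = 0.0, 1.0
    for a in actions:
        total_reward += discount * env.step(a)  # reward accrues, state hidden
        discount *= gamma
        if random.random() < trigger_prob[a]:   # action-triggered observation
            return total_reward, env.observe()
    return total_reward, None                   # no observation arrived


# Usage: always move right; each action has a 30% chance of triggering
# an observation (probabilities are arbitrary for illustration).
env = ToyChainEnv()
reward, obs = rollout_until_observation(
    env, state=0,
    plan_action_sequence=lambda s: [+1] * 10,
    trigger_prob={+1: 0.3, -1: 0.3},
)
print(reward, obs)
```

The key departure from the standard MDP loop is that `plan_action_sequence` is invoked only when an observation arrives, so the agent's effective decision points are the random observation times rather than every environment step.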
Oct-3-2025
- Country:
  - Asia
    - China (0.04)
    - Middle East > Jordan (0.04)
  - Europe > France
    - Hauts-de-France > Nord > Lille (0.04)
  - North America
    - Canada > Ontario
      - Toronto (0.14)
    - United States (0.14)
- Genre:
  - Research Report (0.81)
  - Workflow (0.55)
- Industry:
  - Health & Medicine (0.74)