Reward-Free Model-Based Reinforcement Learning with Linear Function Approximation

Apr-24-2026, 15:50:13 GMT–Neural Information Processing Systems

We study model-based reward-free reinforcement learning with linear function approximation for episodic Markov decision processes (MDPs). In this setting, the agent works in two phases. In the exploration phase, the agent interacts with the environment and collects samples without observing any reward. In the planning phase, the agent is given a specific reward function and uses the samples collected during the exploration phase to learn a good policy. We propose a new provably efficient algorithm, called UCRL-RFE, under the Linear Mixture MDP assumption, where the transition probability kernel of the MDP can be parameterized by a linear function over certain feature mappings defined on the triplet of state, action, and next state.
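To make the setting concrete, the following is a minimal sketch of a linear mixture MDP and the two-phase protocol on a toy problem. It is not UCRL-RFE: the two-state instance, the names `BASE`, `THETA`, `phi`, `explore`, `estimate_theta`, and `plan`, the uniform random exploration policy, and the ridge-regression estimator are all assumptions chosen for illustration. The only property taken from the abstract is that the transition kernel is linear in a feature map on (state, action, next state), here written as P(s' | s, a) = <phi(s, a, s'), theta>.

```python
import random

# Hypothetical toy instance: 2 states, 2 actions, d = 2 base kernels.
# BASE[i][s][a] is a probability distribution over the next state.
BASE = [
    [[[0.9, 0.1], [0.2, 0.8]], [[0.5, 0.5], [0.1, 0.9]]],
    [[[0.1, 0.9], [0.7, 0.3]], [[0.3, 0.7], [0.6, 0.4]]],
]
THETA = [0.25, 0.75]  # mixing weights; unknown to the agent in the real setting


def phi(s, a, s_next):
    """Feature vector defined on the (state, action, next state) triplet."""
    return [BASE[i][s][a][s_next] for i in range(len(BASE))]


def transition_prob(s, a, s_next, theta):
    """Linear mixture kernel: P(s' | s, a) = <phi(s, a, s'), theta>."""
    return sum(f * t for f, t in zip(phi(s, a, s_next), theta))


def explore(num_episodes, horizon, rng):
    """Exploration phase: collect (s, a, s') samples; no reward is observed."""
    samples = []
    for _ in range(num_episodes):
        s = 0
        for _ in range(horizon):
            a = rng.randrange(2)  # placeholder policy: uniform random actions
            s_next = 0 if rng.random() < transition_prob(s, a, 0, THETA) else 1
            samples.append((s, a, s_next))
            s = s_next
    return samples


def estimate_theta(samples, lam=1.0):
    """Ridge regression of the indicator 1[s' = 0] on phi(s, a, 0) (d = 2)."""
    a11 = a22 = lam
    a12 = b1 = b2 = 0.0
    for s, a, s_next in samples:
        f1, f2 = phi(s, a, 0)
        y = 1.0 if s_next == 0 else 0.0
        a11 += f1 * f1
        a12 += f1 * f2
        a22 += f2 * f2
        b1 += f1 * y
        b2 += f2 * y
    det = a11 * a22 - a12 * a12  # closed-form inverse of the 2x2 Gram matrix
    return [(a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det]


def plan(reward, horizon, theta):
    """Planning phase: finite-horizon value iteration for a given reward."""
    value = [0.0, 0.0]
    policy = []
    for _ in range(horizon):
        new_value, step_policy = [], []
        for s in range(2):
            q = [
                reward[s][a]
                + sum(transition_prob(s, a, s2, theta) * value[s2] for s2 in range(2))
                for a in range(2)
            ]
            best = max(range(2), key=lambda a: q[a])
            step_policy.append(best)
            new_value.append(q[best])
        policy.insert(0, step_policy)  # earliest step ends up first
        value = new_value
    return policy, value
```

A typical use would call `explore(2000, 5, random.Random(0))` once, pass the samples to `estimate_theta`, and then call `plan` with the estimate for each reward function that arrives later. The same samples serve every reward, which is the point of the reward-free setting.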

artificial intelligence, machine learning, reinforcement learning

Country:
- North America > United States > California > Los Angeles County > Los Angeles (0.29)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning > Uncertainty
    - Fuzzy Logic (0.62)
  - Machine Learning
    - Reinforcement Learning (0.88)
    - Learning Graphical Models > Undirected Networks
      - Markov Models (0.48)
