
 reward-free rl


On Reward-Free Reinforcement Learning with Linear Function Approximation

Neural Information Processing Systems

Reward-free reinforcement learning (RL) is a framework that is suitable both for the batch RL setting and for the setting where there are many reward functions of interest. During the exploration phase, an agent collects samples without using a pre-specified reward function. After the exploration phase, a reward function is given, and the agent uses the samples collected during the exploration phase to compute a near-optimal policy. Jin et al. [2020] showed that in the tabular setting, the agent only needs to collect a polynomial number of samples (in terms of the number of states, the number of actions, and the planning horizon) for reward-free RL. However, in practice, the number of states and actions can be large, and thus function approximation schemes are required for generalization.
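
For illustration, here is a minimal Python sketch of the two-phase protocol described in the abstract, on a toy tabular MDP. It assumes a generic random-exploration phase and value iteration on the empirical model; it is not the paper's exploration algorithm, which targets the linear function approximation setting.

```python
# Hypothetical sketch of the two-phase reward-free RL protocol on a small
# tabular MDP. Phase 1 collects transitions with no reward signal; phase 2
# receives a reward function and plans on the empirical model.
import numpy as np

rng = np.random.default_rng(0)
S, A, H = 5, 2, 10                           # states, actions, horizon
P = rng.dirichlet(np.ones(S), size=(S, A))   # true (unknown) transition kernel

# ---- Exploration phase: no reward function is available -------------------
counts = np.zeros((S, A, S))
for _ in range(2000):                        # episodes of reward-free exploration
    s = 0
    for _ in range(H):
        a = rng.integers(A)                  # here: uniformly random exploration
        s_next = rng.choice(S, p=P[s, a])
        counts[s, a, s_next] += 1
        s = s_next

P_hat = counts / np.maximum(counts.sum(axis=2, keepdims=True), 1)

# ---- Planning phase: a reward function r(s, a) is revealed ----------------
r = rng.random((S, A))                       # the reward of interest, given later
V = np.zeros(S)
for _ in range(H):                           # finite-horizon value iteration
    Q = r + P_hat @ V                        # Q[s,a] = r[s,a] + sum_s' P_hat[s,a,s'] V[s']
    V = Q.max(axis=1)

pi = Q.argmax(axis=1)                        # greedy policy for the revealed reward
print("greedy policy:", pi)
```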






Necessary and Sufficient Oracles: Toward a Computational Taxonomy For Reinforcement Learning

Rohatgi, Dhruv, Foster, Dylan J.

arXiv.org Artificial Intelligence

Algorithms for reinforcement learning (RL) in large state spaces crucially rely on supervised learning subroutines to estimate objects such as value functions or transition probabilities. Since only the simplest supervised learning problems can be solved provably and efficiently, practical performance of an RL algorithm depends on which of these supervised learning "oracles" it assumes access to (and how they are implemented). But which oracles are better or worse? Is there a minimal oracle? In this work, we clarify the impact of the choice of supervised learning oracle on the computational complexity of RL, as quantified by the oracle strength. First, for the task of reward-free exploration in Block MDPs in the standard episodic access model -- a ubiquitous setting for RL with function approximation -- we identify two-context regression as a minimal oracle, i.e. an oracle that is both necessary and sufficient (under a mild regularity assumption). Second, we identify one-context regression as a near-minimal oracle in the stronger reset access model, establishing a provable computational benefit of resets in the process. Third, we broaden our focus to Low-Rank MDPs, where we give cryptographic evidence that the analogous oracle from the Block MDP setting is insufficient.
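
To make the oracle terminology concrete, here is a speculative Python sketch of what one-context and two-context regression oracle interfaces might look like. The class names, method signatures, and the exact form of the regression problems are assumptions made for this listing, not the paper's formal definitions.

```python
# Speculative interface sketch only: the precise regression problems are
# defined in the paper; these signatures are illustrative assumptions.
from typing import Callable, List, Protocol, Tuple

Context = object  # e.g. a raw observation emitted by a Block MDP


class OneContextRegressionOracle(Protocol):
    """Fit a predictor of a scalar target from a single context."""

    def fit(
        self, data: List[Tuple[Context, float]]
    ) -> Callable[[Context], float]:
        ...


class TwoContextRegressionOracle(Protocol):
    """Fit a predictor of a scalar target from a pair of contexts."""

    def fit(
        self, data: List[Tuple[Context, Context, float]]
    ) -> Callable[[Context, Context], float]:
        ...
```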

