Reviews: EX2: Exploration with Exemplar Models for Deep Reinforcement Learning

Oct-7-2024, 14:56:08 GMT–Neural Information Processing Systems

Review of submission 1489: EX2: Exploration with Exemplar Models for Deep Reinforcement Learning

Summary: The paper proposes a discriminative novelty detection algorithm to improve exploration in policy-gradient-based reinforcement learning. The density that the novelty detector implicitly estimates for a state is used to compute a reward bonus, which is added to the original reward before it is passed to the downstream policy optimization algorithm (TRPO). Two techniques for improving computational efficiency are also discussed.

Comments
- One motivation of the paper is to use implicit density estimation to approximate classic count-based exploration. The discriminative novelty detection maintains a density estimate only over states, not over state-action pairs.
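The reward-bonus step described in the summary can be illustrated with a minimal sketch. The review does not give the formulas, so everything here is an assumption for illustration: the relation p(s) = (1 - D) / D between an ideal exemplar discriminator output D and the state density (the idealized discrete case), the bonus form -log p(s), the weight `beta`, and the function names `implied_density` and `augmented_reward`. It is not the authors' implementation.

```python
import math


def implied_density(d, eps=1e-8):
    """Density implied by a discriminator output d in (0, 1).

    Assumption: for an ideal exemplar discriminator in the discrete case,
    d = 1 / (1 + p(s)), hence p(s) = (1 - d) / d.
    """
    d = min(max(d, eps), 1.0 - eps)  # clamp to avoid division by zero and log(0)
    return (1.0 - d) / d


def augmented_reward(reward, d, beta=0.1):
    """Original reward plus a novelty bonus of the assumed form -log p(s)."""
    bonus = -math.log(implied_density(d))
    return reward + beta * bonus


# A state the discriminator separates easily (d near 1) is treated as novel
# and receives a larger bonus than a frequently visited state (d near 0).
novel = augmented_reward(0.0, d=0.9)
familiar = augmented_reward(0.0, d=0.1)
```

In this sketch the augmented reward would then be handed to the policy optimizer in place of the original reward; `beta` controls how strongly novelty is favored.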


