Active Exploration in Dynamic Environments
Thrun, Sebastian B., Möller, Knut
Neural Information Processing Systems
Many real-valued connectionist approaches to learning control realize exploration by randomness in action selection. This can be disadvantageous when costs are assigned to "negative experiences". The basic idea presented in this paper is to make an agent explore unknown regions in a more directed manner. This is achieved with a so-called competence map, which is trained to predict the controller's accuracy and is used to guide exploration. Building on this map, a bistable system enables smooth switching of attention between two behaviors, exploration and exploitation, depending on expected costs and knowledge gain. The appropriateness of this method is demonstrated on a simple robot navigation task.
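The abstract's two ingredients, a competence map and a bistable switch between exploration and exploitation, can be illustrated with a minimal sketch. This is not the authors' implementation: the paper trains connectionist (neural network) models, whereas the sketch below assumes a small discrete state space, a tabular running-average error estimate in place of a learned competence map, and a generic self-exciting tanh unit as the bistable element. All names (CompetenceMap, BistableSwitch, choose_next_state) and all parameter values are hypothetical and chosen only for illustration.

```python
import math


class CompetenceMap:
    """Tabular stand-in (assumption) for a learned competence map.

    Stores, per state, a running estimate of the controller's error there.
    High estimated error means low competence, hence high expected
    knowledge gain from visiting that state.
    """

    def __init__(self, n_states, initial_error=1.0, rate=0.5):
        self.error = [initial_error] * n_states
        self.rate = rate

    def update(self, state, observed_error):
        # Exponential moving average of the observed controller error.
        self.error[state] += self.rate * (observed_error - self.error[state])

    def expected_gain(self, state):
        return self.error[state]


class BistableSwitch:
    """Generic bistable unit (assumption): a = tanh(w * a + drive), w > 1.

    With self-excitation w > 1 the unit has two stable states near -1
    (exploit) and +1 (explore) and flips only when the drive is strong
    enough, which gives hysteresis instead of rapid toggling.
    """

    def __init__(self, self_weight=2.0, activation=-0.9):
        self.self_weight = self_weight
        self.activation = activation

    def settle(self, drive, steps=50):
        # Iterate the unit to (approximately) a fixed point for this drive.
        for _ in range(steps):
            self.activation = math.tanh(self.self_weight * self.activation + drive)
        return self.activation

    def explore_weight(self):
        # Map activation in (-1, 1) to a blending weight in (0, 1).
        return 0.5 * (1.0 + self.activation)


def choose_next_state(candidates, cmap, cost, explore_weight):
    """Pick the candidate that best trades off knowledge gain against cost."""
    def score(s):
        return explore_weight * cmap.expected_gain(s) - (1.0 - explore_weight) * cost[s]
    return max(candidates, key=score)
```

In this sketch the drive passed to settle would be something like expected knowledge gain minus expected cost; how the paper actually combines the two quantities is not stated in the abstract, so that choice is an assumption as well.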
Dec-31-1992