Attention Trajectories as a Diagnostic Axis for Deep Reinforcement Learning
Beylier, Charlotte, Selder, Hannah, Fleig, Arthur, Hofmann, Simon M., Scherf, Nico
arXiv.org Artificial Intelligence
While deep reinforcement learning agents demonstrate high performance across domains, their internal decision processes remain difficult to interpret when evaluated only through performance metrics. In particular, it is poorly understood which input features agents rely on, how these dependencies evolve during training, and how they relate to behavior. We introduce a scientific methodology for analyzing the learning process through quantitative analysis of saliency. This approach aggregates saliency information at the object and modality level into hierarchical attention profiles, quantifying how agents allocate attention over time, thereby forming attention trajectories throughout training. Applied to Atari benchmarks, custom Pong environments, and muscle-actuated biomechanical user simulations in visuomotor interactive tasks, this methodology uncovers algorithm-specific attention biases, reveals unintended reward-driven strategies, and diagnoses overfitting to redundant sensory channels. These patterns correspond to measurable behavioral differences, demonstrating empirical links between attention profiles, learning dynamics, and agent behavior. To assess robustness of the attention profiles, we validate our findings across multiple saliency methods and environments. The results establish attention trajectories as a promising diagnostic axis for tracing how feature reliance develops during training and for identifying biases and vulnerabilities invisible to performance metrics alone.
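As a rough illustration of the aggregation step described in the abstract, the sketch below computes an object-level attention profile as the share of absolute saliency falling inside each object's mask, then repeats this per training checkpoint to form an attention trajectory. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names `attention_profile` and `attention_trajectory`, the boolean-mask representation of objects, the normalization by total absolute saliency, and the toy numbers are all hypothetical.

```python
from typing import Dict, List, Sequence

SaliencyMap = Sequence[Sequence[float]]
Mask = Sequence[Sequence[bool]]


def attention_profile(saliency: SaliencyMap, masks: Dict[str, Mask]) -> Dict[str, float]:
    """Fraction of total absolute saliency inside each object mask (assumed definition)."""
    total = sum(abs(v) for row in saliency for v in row)
    profile: Dict[str, float] = {}
    for name, mask in masks.items():
        mass = sum(
            abs(v)
            for s_row, m_row in zip(saliency, mask)
            for v, inside in zip(s_row, m_row)
            if inside
        )
        # Guard against an all-zero saliency map.
        profile[name] = mass / total if total > 0 else 0.0
    return profile


def attention_trajectory(
    saliency_per_checkpoint: Sequence[SaliencyMap], masks: Dict[str, Mask]
) -> List[Dict[str, float]]:
    """One attention profile per training checkpoint, in checkpoint order."""
    return [attention_profile(s, masks) for s in saliency_per_checkpoint]


# Toy 2x3 "frame" with two hypothetical objects; values are made up.
masks = {
    "ball": [[False, True, False], [False, False, False]],
    "paddle": [[False, False, False], [True, False, False]],
}
early = [[0.0, 2.0, 0.0], [1.0, 0.0, 1.0]]  # saliency at an early checkpoint
late = [[0.0, 1.0, 0.0], [3.0, 0.0, 0.0]]   # saliency at a later checkpoint

profile = attention_profile(early, masks)
trajectory = attention_trajectory([early, late], masks)
print(profile)     # {'ball': 0.5, 'paddle': 0.25}
print(trajectory)  # ball share falls from 0.5 to 0.25, paddle rises from 0.25 to 0.75
```

In this toy case the profiles need not sum to one, since saliency outside every mask is left unassigned. A modality-level profile could be built the same way by grouping input channels instead of pixels.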
Dec-1-2025
- Country:
- Asia > Middle East
- Jordan (0.04)
- Europe
- North America > United States
- New Jersey > Middlesex County
- Piscataway (0.04)
- New York > New York County
- New York City (0.04)
- Genre:
- Research Report > New Finding (1.00)
- Industry:
- Health & Medicine (1.00)
- Leisure & Entertainment > Games (0.94)