 Planning & Scheduling




Explainable Reinforcement Learning via Model Transforms

Neural Information Processing Systems

Understanding emerging behaviors of reinforcement learning (RL) agents may be difficult since such agents are often trained in complex environments using highly complex decision-making procedures.








Optimize Planning Heuristics to Rank, not to Estimate Cost-to-Goal

Neural Information Processing Systems

Figure 1: Problem instance where a perfect heuristic is not strictly optimally efficient with GBFS. However, the path (A, C, D, E) has cost 10 instead of 11. Then h is a perfect ranking for GBFS on Γ. Proof. We carry out the proof by induction with respect to the number of expanded states. Let us now make the induction step and assume the theorem holds for the first ... Figure 2: Problem instance where an optimally efficient heuristic does not exist for GBFS.