Reviews: Strategic Attentive Writer for Learning Macro-Actions

Neural Information Processing Systems 

Summary of Recommendation: The paper introduces an original idea. Committing to a plan has appeared before in RL, e.g., in Sutton's options literature (where no learning occurs), Schmidhuber's hierarchical RL systems of the early 1990s, and Wiering's HQ-learning, but the new approach is different. However, the formalisation and the experimental section seem to lack clarity and raise several questions. In particular, the experiments do not convincingly show that the attentional mechanism is needed (although it seems like a very nice idea), and the actual behaviour of the attention is not explored at all. I don't see this as a fatal flaw, but it is definitely problematic, since the title and main thrust of the paper rely on the attentional mechanism.