A reinterpretation of the policy oscillation phenomenon in approximate policy iteration
