A Greedy Approach to Adapting the Trace Parameter for Temporal Difference Learning
