Robust, Efficient, Globally-Optimized Reinforcement Learning with the Parti-Game Algorithm
Al-Ansari, Mohammad A., Williams, Ronald J.
Neural Information Processing Systems
A finite cost represents the number of cells that must be traversed to reach the goal cell, whereas an infinite cost represents the belief that there is no reliable way of getting from that cell to the goal. Cells with a cost of infinity are called losing cells, while all others are called winning cells.
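As a rough illustration of the winning/losing distinction, the following Python sketch assigns each cell a cost by breadth-first search backward from the goal. It is a simplified stand-in, not the procedure used in the paper: the names `cell_costs` and `neighbors` are hypothetical, the transition graph is assumed to be given and deterministic, and the goal cell is assumed to have cost 0.

```python
import math
from collections import deque


def cell_costs(neighbors, goal):
    """Return a dict mapping each cell to its cost-to-goal.

    `neighbors` (hypothetical input) maps a cell to the cells it can
    reliably reach in one step. Cells that cannot reach the goal keep
    a cost of infinity.
    """
    # Build the reversed graph so we can search backward from the goal.
    reverse = {cell: [] for cell in neighbors}
    for cell, outs in neighbors.items():
        for nxt in outs:
            reverse.setdefault(nxt, []).append(cell)

    costs = {cell: math.inf for cell in reverse}
    costs[goal] = 0  # assumption: the goal cell itself has cost 0
    queue = deque([goal])
    while queue:
        current = queue.popleft()
        for prev in reverse.get(current, []):
            if costs[prev] == math.inf:  # not reached yet
                costs[prev] = costs[current] + 1
                queue.append(prev)
    return costs


# Toy example: A -> B -> goal, while C can only loop back to itself.
neighbors = {"A": ["B"], "B": ["goal"], "goal": [], "C": ["C"]}
costs = cell_costs(neighbors, "goal")
losing = sorted(c for c, v in costs.items() if v == math.inf)
winning = sorted(c for c, v in costs.items() if v != math.inf)
```

In this toy graph, `C` never reaches the goal and so is the only losing cell; every cell with a finite cost is a winning cell.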
Dec-31-1999