A Minimum-fuel cost (MF) derivation When c(a(t)) = |a (t) |

Neural Information Processing Systems 

The action cost differs by row, where the corresponding optimal policy parameterization is highlighted in green.