An Alternative Softmax Operator for Reinforcement Learning
