A Additional numerical experiments

Neural Information Processing Systems 

In this section, we present additional numerical experiments. To add randomness to the environment, we let the states transition randomly. The optimal policy encourages the agent to take the special jump and reach the terminal state. Under the target policy, the agent reaches the terminal state as soon as possible but avoids taking the special jump. We assume that the agent is unaware of both the attacker's presence and its manipulations.
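To make the setup concrete, the following is a minimal sketch of such an environment, not the one used in the experiments. The chain layout, the number of states, the state at which the special jump is available, and the slip probability are all hypothetical choices made for illustration. The names `step`, `optimal_policy`, `target_policy`, and `episode_length` are likewise assumptions.

```python
import random

# All constants below are hypothetical; the text does not specify them.
N_STATES = 6              # chain of states 0..5
TERMINAL = N_STATES - 1   # last state is terminal
JUMP_FROM = 2             # state where the special jump is available
SLIP_PROB = 0.1           # probability that a transition is random

STEP, JUMP = 0, 1         # the two actions


def step(state, action, rng, slip_prob=SLIP_PROB):
    """Return the next state for the given state and action."""
    # Random transition: with probability slip_prob, land in a uniformly
    # chosen state regardless of the action (an assumed noise model).
    if rng.random() < slip_prob:
        return rng.randrange(N_STATES)
    # The special jump leads straight to the terminal state.
    if action == JUMP and state == JUMP_FROM:
        return TERMINAL
    # Otherwise move one state forward along the chain.
    return min(state + 1, TERMINAL)


def optimal_policy(state):
    """Take the special jump whenever it is available."""
    return JUMP if state == JUMP_FROM else STEP


def target_policy(state):
    """Head for the terminal state but never take the special jump."""
    return STEP


def episode_length(policy, rng, slip_prob=SLIP_PROB, max_steps=1000):
    """Count the steps a policy needs to reach the terminal state."""
    state, steps = 0, 0
    while state != TERMINAL and steps < max_steps:
        state = step(state, policy(state), rng, slip_prob)
        steps += 1
    return steps
```

With the randomness switched off (`slip_prob=0.0`), the optimal policy in this sketch finishes in fewer steps than the target policy, because only the former uses the jump. The attacker and the agent's learning rule are deliberately left out of the sketch.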
