Deep Hedging Under Non-Convexity: Limitations and a Case for AlphaZero