DeepReinforcementLearningattheEdgeofthe StatisticalPrecipice