171ae1bbb81475eb96287dd78565b38b-AuthorFeedback.pdf
Neural Information Processing Systems
We observe empirically that doubling n requires doubling m to get policies of similar quality. Feedback 2: Theorem 4 is an instance-dependent upper bound on the n-round regret of SoftElim.
Feb-7-2026, 14:39:00 GMT