Making Non-Stochastic Control (Almost) as Easy as Stochastic

Neural Information Processing Systems

We attain the optimal Õ(√T) regret when the dynamics are unknown to the learner, and poly(log T) regret when known, provided that the cost functions are strongly convex (as in LQR).