Naman Agarwal
Logarithmic Regret for Online Control
Naman Agarwal, Elad Hazan, Karan Singh
We study optimal regret bounds for control in linear dynamical systems under adversarially changing strongly convex cost functions, given knowledge of the transition dynamics. This includes several well-studied and fundamental frameworks such as the Kalman filter and the linear quadratic regulator. State-of-the-art methods achieve regret that scales as O(√T), where T is the time horizon. We show that the optimal regret in this setting can be significantly smaller, scaling as O(poly(log T)). This regret bound is achieved by two different efficient iterative methods: online gradient descent and online natural gradient.
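A minimal sketch of the first of the two methods, online gradient descent, applied to a linear state-feedback controller for a known linear system; this is an illustration under simplified assumptions (a one-step surrogate gradient and illustrative matrices A, B, Q, R), not the paper's exact algorithm or its disturbance-action parameterization.

```python
import numpy as np

# Sketch: online gradient descent on the parameters K of a policy
# u_t = -K x_t for the known system x_{t+1} = A x_t + B u_t + w_t,
# under quadratic costs c_t(x, u) = x^T Q_t x + u^T R_t u.
# All matrices and the step-size schedule here are illustrative.

rng = np.random.default_rng(0)
n, m, T = 2, 1, 200
A = np.array([[0.9, 0.1],
              [0.0, 0.8]])
B = np.array([[0.0],
              [1.0]])
K = np.zeros((m, n))           # controller parameters, updated online
x = np.ones(n)
eta = 0.05

for t in range(1, T + 1):
    Q, R = np.eye(n), np.eye(m)   # adversarial costs would vary with t
    u = -K @ x
    # Gradient of the instantaneous control cost u^T R u with respect to K,
    # holding the current state fixed (a one-step surrogate): -2 (R u) x^T.
    grad_K = -2.0 * (R @ u)[:, None] @ x[None, :]
    K -= (eta / t) * grad_K       # ~1/t step size, as in strongly convex OGD
    x = A @ x + B @ u + 0.01 * rng.standard_normal(n)
```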
cpSGD: Communication-efficient and differentially-private distributed SGD
Naman Agarwal, Ananda Theertha Suresh, Felix Xinnan X. Yu, Sanjiv Kumar, Brendan McMahan
Distributed stochastic gradient descent is an important subroutine in distributed learning. A setting of particular interest is when the clients are mobile devices, where two important concerns are communication efficiency and the privacy of the clients. Several recent works have focused on reducing the communication cost or introducing privacy guarantees, but none of the proposed communication-efficient methods are known to be privacy-preserving, and none of the known privacy mechanisms are known to be communication-efficient. To this end, we study algorithms that achieve both communication efficiency and differential privacy. For d variables and n ≈ d clients, the proposed method uses O(log log(nd)) bits of communication per client per coordinate and ensures constant privacy. We also improve the previous analysis of the Binomial mechanism, showing that it achieves nearly the same utility as the Gaussian mechanism while requiring fewer representation bits, which may be of independent interest.
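A minimal sketch of the core idea in discrete form: a client quantizes its update to an integer grid and adds centered Binomial noise, so the privatized message stays integer-valued and cheap to encode, in the spirit of the Binomial mechanism analyzed above. The parameters N, p, and the quantization scale below are illustrative, not the calibrated values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def quantize(v, scale=256):
    """Stochastically round each coordinate of v to an integer grid."""
    scaled = v * scale
    low = np.floor(scaled)
    return (low + (rng.random(v.shape) < (scaled - low))).astype(np.int64)

def binomial_noise(shape, N=1000, p=0.5):
    """Centered Binomial(N, p) noise: integer-valued with variance
    N*p*(1-p), so it shares the discrete encoding of the quantized data
    (unlike real-valued Gaussian noise)."""
    return rng.binomial(N, p, size=shape) - int(N * p)

g = rng.standard_normal(16)               # one client's update (illustrative)
private_msg = quantize(g) + binomial_noise(g.shape)
# The server sums the integer messages across clients and rescales by
# 1/scale; privacy accounting for N and p follows the paper's analysis.
```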