smooth problem





abd1c782880cc59759f4112fda0b8f98-AuthorFeedback.pdf

Neural Information Processing Systems

We thank the reviewers for their feedback and time! We are encouraged that they found our theoretical results "impressive". Large batch sizes help us to obtain complexity guarantees beating the state-of-the-art ones. We can add these details to the main body using an additional 9th page. We agree with this criticism. We will try to test our methods on this task and investigate the heavy-tailedness of stochastic gradients for this problem. Simsekli et al. focus on non-convex problems and on rates of convergence in expectation.
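The heavy-tailedness investigation mentioned above can be probed empirically. Below is a minimal sketch, assuming one has collected a sample of stochastic-gradient noise norms; it uses the classical Hill estimator of the tail index, a standard tool (not necessarily the estimator used by Simsekli et al., whose alpha-estimator differs). The function name `hill_tail_index` and the Pareto sanity check are illustrative, not from the paper.

```python
import numpy as np

def hill_tail_index(samples, k=None):
    """Hill estimator of the tail index alpha from the k largest
    order statistics. Smaller alpha means heavier tails; alpha < 2
    corresponds to infinite-variance gradient noise.
    Assumes positive samples and n > k."""
    x = np.sort(np.asarray(samples, dtype=float))  # ascending order
    n = x.size
    if k is None:
        k = max(10, n // 10)        # rule of thumb: top 10% of samples
    threshold = x[n - k - 1]        # the (k+1)-th largest observation
    top_k = x[n - k:]               # the k largest observations
    return 1.0 / np.mean(np.log(top_k / threshold))

# Sanity check on synthetic Pareto data with known tail index 1.5;
# in practice, `samples` would be per-iteration gradient noise norms.
rng = np.random.default_rng(0)
samples = rng.pareto(1.5, size=100_000) + 1.0
print(hill_tail_index(samples))     # prints a value close to 1.5
```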


Optimal Methods for Convex Risk Averse Distributed Optimization

Lan, Guanghui, Zhang, Zhe

arXiv.org Artificial Intelligence

This paper studies the communication complexity of convex risk-averse optimization over a network. The problem generalizes the well-studied risk-neutral finite-sum distributed optimization problem, and its importance stems from the need to handle risk in an uncertain environment. For algorithms in the literature, there exists a gap between the communication complexities of solving risk-averse and risk-neutral problems. We propose two distributed algorithms, namely the distributed risk-averse optimization (DRAO) method and the distributed risk-averse optimization with sliding (DRAO-S) method, to close the gap. Specifically, the DRAO method achieves the optimal communication complexity by assuming that a certain saddle point subproblem can be easily solved in the server node. The DRAO-S method removes this strong assumption by introducing a novel saddle point sliding subroutine which requires only projections onto the ambiguity set $P$. We observe that the number of $P$-projections performed by DRAO-S is optimal. Moreover, we develop matching lower complexity bounds to show that the communication complexities of both DRAO and DRAO-S are not improvable. Numerical experiments are conducted to demonstrate the encouraging empirical performance of the DRAO-S method.
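For context, a plausible formulation (an assumption on our part; the paper's exact setup may differ) is the distributionally robust form

$$ \min_{x \in X} \; \max_{p \in P} \; \sum_{i=1}^{m} p_i f_i(x), \qquad P \subseteq \Delta_m := \Big\{ p \in \mathbb{R}^m_{\ge 0} : \sum_{i=1}^{m} p_i = 1 \Big\}, $$

which reduces to the risk-neutral finite-sum problem when $P$ is the singleton $\{(1/m, \dots, 1/m)\}$. Since DRAO-S interacts with $P$ only through projections, a concrete ingredient is the Euclidean projection onto the simplex. The sketch below assumes the simplest ambiguity set $P = \Delta_m$ and uses the standard sort-based algorithm of Duchi et al. (2008); it is not the paper's implementation.

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of v onto the probability simplex
    {p : p >= 0, sum(p) = 1} (sort-based method, Duchi et al. 2008)."""
    u = np.sort(v)[::-1]                         # sort in descending order
    css = np.cumsum(u)
    idx = np.arange(1, v.size + 1)
    rho = np.nonzero(u + (1.0 - css) / idx > 0)[0][-1]
    theta = (1.0 - css[rho]) / (rho + 1)         # shift so the result sums to 1
    return np.maximum(v + theta, 0.0)

p = project_simplex(np.array([0.5, 1.2, -0.3]))
print(p, p.sum())                                # nonnegative entries, sums to 1
```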


A note on active learning for smooth problems

Mahalanabis, Satyaki

arXiv.org Machine Learning

We show that the disagreement coefficient of certain smooth hypothesis classes is $O(m)$, where $m$ is the dimension of the hypothesis space, thereby answering a question posed in \cite{friedman09}.
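For reference, the standard definition of the disagreement coefficient (due to Hanneke; we assume this is the quantity meant) is

$$ \theta(\varepsilon) \;=\; \sup_{r > \varepsilon} \frac{\Pr_{x \sim \mathcal{D}}\big[\, x \in \mathrm{DIS}(B(h^{*}, r)) \,\big]}{r}, $$

where $B(h^{*}, r) = \{h \in \mathcal{H} : \Pr_{x \sim \mathcal{D}}[h(x) \ne h^{*}(x)] \le r\}$ is the ball of radius $r$ around the target hypothesis $h^{*}$, and $\mathrm{DIS}(V) = \{x : \exists\, h, h' \in V,\ h(x) \ne h'(x)\}$ is the region on which hypotheses in $V$ disagree. A bounded disagreement coefficient is what makes disagreement-based active learners (e.g., CAL and $A^2$) label-efficient, so an $O(m)$ bound for smooth classes translates directly into label-complexity guarantees.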