 Statistical Learning


Revisiting the Integration of Convolution and Attention for Vision Backbone

Neural Information Processing Systems

Convolutions (Convs) and multi-head self-attentions (MHSAs) are typically considered alternatives to each other for building vision backbones. Although some works try to integrate both, they apply the two operators simultaneously at the finest pixel granularity.





7716d0fc31636914783865d34f6cdfd5-AuthorFeedback.pdf

Neural Information Processing Systems

This is because $a_t^\top a$ takes a large number of iterations to increase from negative to 0. Consequently, with a large step size, w can move far away from w before $a_t^\top a$ becomes nonnegative. For problems with multiple global optima, our analysis can still be applied if the following condition holds: there exists one global optimum such that the PD condition holds globally with respect to this optimum.


Online Estimation via Offline Estimation: An Information-Theoretic Framework

Dylan J. Foster

Neural Information Processing Systems

The classical theory of statistical estimation aims to estimate a parameter of interest under data generated from a fixed design ("offline estimation"), while the contemporary theory of online learning provides algorithms for estimation under adaptively