Goto

Chirality Nets for Human Pose Regression

Neural Information Processing Systems

We propose Chirality Nets, a family of deep nets that is equivariant to the "chirality transform," i.e., the transformation that creates a chiral pair. Through parameter sharing and odd/even symmetry constraints, we propose and prove variants of standard deep-net building blocks that satisfy the equivariance property, including fully connected layers, convolutional layers, batch normalization, and LSTM/GRU cells. The proposed layers yield a more data-efficient representation and reduce computation by exploiting symmetry. We evaluate chirality nets on human pose regression, which naturally exploits the left/right mirroring of the human body, studying three tasks: 3D pose estimation from video, 2D pose forecasting, and skeleton-based activity recognition. Our approach achieves or matches state-of-the-art results, with more significant gains on small datasets and in limited-data settings.
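To illustrate the parameter-sharing idea, here is a minimal NumPy sketch of a chirality-equivariant fully connected layer. It is an assumption-laden simplification: the chirality transform T is reduced to swapping the left/right halves of the feature vector, whereas the paper's full construction also handles coordinate negation via odd/even symmetry. The function names are hypothetical, not from the paper's code.

```python
import numpy as np

def chirality_equivariant_linear(d, rng):
    """Build (W, b) with the block-shared structure W = [[A, B], [B, A]],
    b = [c, c], so that f(x) = W @ x + b satisfies f(T x) = T f(x)
    for the half-swap transform T. (Simplified sketch; the paper's layers
    additionally encode sign flips for mirrored coordinates.)"""
    A = rng.standard_normal((d, d))   # block applied to the "same" side
    B = rng.standard_normal((d, d))   # block applied to the "other" side
    W = np.block([[A, B], [B, A]])    # parameter sharing enforces equivariance
    c = rng.standard_normal(d)
    b = np.concatenate([c, c])        # bias must be swap-symmetric
    return W, b

def swap_halves(x):
    """The simplified chirality transform T: swap left/right feature halves."""
    d = x.shape[0] // 2
    return np.concatenate([x[d:], x[:d]])
```

Because W commutes with the half-swap permutation and b is invariant under it, the equivariance f(T x) = T f(x) holds exactly for any input, which is what lets the layer share one set of parameters across a chiral pair.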


On the Validity of Modeling SGD with Stochastic Differential Equations (SDEs)

Neural Information Processing Systems

It is generally recognized that a finite learning rate (LR), in contrast to an infinitesimal LR, is important for good generalization in real-life deep nets. Most attempted explanations propose approximating finite-LR SGD with Itô Stochastic Differential Equations (SDEs), but formal justification for this approximation (e.g., Li et al., 2019) applies only to SGD with tiny LR, and experimental verification of the approximation appears computationally infeasible. The current paper clarifies the picture with the following contributions: (a) an efficient simulation algorithm, SVAG, that provably converges to the conventionally used Itô SDE approximation.
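A hedged sketch of the SVAG-style update, under the assumed form from the SDE-simulation literature: run SGD with learning rate eta/l while amplifying minibatch-gradient noise by combining two independent gradient estimates, so the discrete trajectory tracks the same Itô SDE as l grows. The exact coefficients below are my reading of the construction, not a verified reproduction of the paper's algorithm.

```python
import numpy as np

def svag_step(w, grad_fn, eta, l, rng):
    """One SVAG-style step (assumed form).

    Draws two independent minibatch gradients and combines them so the
    mean is unchanged but the noise variance is amplified by a factor l,
    while the LR shrinks to eta/l; with l = 1 this reduces to plain SGD.
    """
    g1 = grad_fn(w, rng)              # first independent minibatch gradient
    g2 = grad_fn(w, rng)              # second independent minibatch gradient
    r = np.sqrt(2.0 * l - 1.0)        # noise-amplification coefficient
    g = 0.5 * (1.0 + r) * g1 + 0.5 * (1.0 - r) * g2
    return w - (eta / l) * g          # smaller LR compensates larger noise
```

Sanity check on the scaling: the combined estimate keeps E[g] equal to the true gradient, while its variance is multiplied by ((1+r)^2 + (1-r)^2)/4 = l; pairing that with the eta/l step size keeps the per-unit-time diffusion of the matched SDE fixed as l increases.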



8493eeaccb772c0878f99d60a0bd2bb3-AuthorFeedback.pdf

Neural Information Processing Systems

We thank all the reviewers for carefully checking the paper and acknowledging the "efficiency and practicality" of our approach; we will also clarify this in the revised version. R1 asks for a discussion of the similarities and differences between our technical results and those of [19]. R2 asks how clean the coreset is. Eq. 5 finds the best subset, and hence training on the medoids is robust to noisy labels.