Learning Frequency Domain Approximation for Binary Neural Networks

Neural Information Processing Systems 

Since the gradient of the conventional sign function is zero almost everywhere and therefore cannot be used for back-propagation, several approaches have been proposed to alleviate the optimization difficulty by using an approximate gradient.
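
To make the difficulty concrete, the following minimal sketch contrasts the exact derivative of the sign function with one commonly used approximate gradient, the clipped straight-through estimator. The text above does not name a specific approximation, so the estimator chosen here, the function names, and the `clip` parameter are assumptions made purely for illustration.

```python
def sign_forward(x):
    """Binarize a scalar: +1 for x >= 0, -1 otherwise."""
    return 1.0 if x >= 0 else -1.0


def sign_true_grad(x):
    """Exact derivative of sign: zero wherever it is defined (x != 0)."""
    return 0.0


def sign_ste_grad(x, clip=1.0):
    """Assumed surrogate (clipped straight-through estimator):
    pass the gradient through unchanged inside [-clip, clip], block it outside."""
    return 1.0 if abs(x) <= clip else 0.0


def backward(x, upstream, grad_fn):
    """Chain rule for one unit: upstream gradient times the local derivative."""
    return upstream * grad_fn(x)


if __name__ == "__main__":
    x, upstream = 0.3, 0.5
    print(backward(x, upstream, sign_true_grad))  # exact gradient: no learning signal
    print(backward(x, upstream, sign_ste_grad))   # surrogate gradient: signal is preserved
```

With the exact derivative, every upstream gradient is multiplied by zero, so the weights feeding the sign function receive no update. The surrogate keeps the forward pass binary while supplying a nonzero backward signal near the origin.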