On the Impact of the Activation Function on Deep Neural Networks Training