[P] E-swish: A New Activation Function • r/MachineLearning
Hi, this is a nice first paper. How should one interpret Figure 7? Is that a test error of 99%? You stated "Our experiments show that E-swish systematically outperforms any other well-known activation function", but you only really compare against ReLU. You need to compare it with the other activation functions that people have published. Also, what is the added computational cost of each model due to the more complex form of E-swish, compared to other activation functions?
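To make the cost question concrete, here is a rough sketch of how one might measure it. It assumes E-swish has the form f(x) = beta * x * sigmoid(x), with Swish as the beta = 1 case; the value beta = 1.5 and the helper `time_activation` are illustrative choices of mine, not taken from the paper. It uses scalar pure Python, so the numbers only hint at relative cost and say nothing about GPU training time.

```python
import math
import timeit


def relu(x):
    """ReLU: max(0, x)."""
    return max(0.0, x)


def sigmoid(x):
    """Numerically stable logistic sigmoid."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def swish(x):
    """Swish: x * sigmoid(x)."""
    return x * sigmoid(x)


def e_swish(x, beta=1.5):
    """E-swish, assumed form: beta * x * sigmoid(x). beta=1.5 is illustrative."""
    return beta * x * sigmoid(x)


def time_activation(fn, xs, repeats=5):
    """Hypothetical helper: best-of-`repeats` wall time to apply fn to every x."""
    return min(timeit.repeat(lambda: [fn(x) for x in xs], number=1, repeat=repeats))


if __name__ == "__main__":
    # Evenly spaced inputs in [-5, 5); the range is arbitrary.
    xs = [i / 1000.0 - 5.0 for i in range(10000)]
    base = time_activation(relu, xs)
    for name, fn in [("relu", relu), ("swish", swish), ("e_swish", e_swish)]:
        t = time_activation(fn, xs)
        print(f"{name:8s} {t * 1e3:8.3f} ms  ({t / base:.2f}x relu)")
```

A fairer comparison would time full forward and backward passes of the actual models in the framework used for the experiments, since the activation is usually a small share of total compute.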
Jan-29-2018, 21:54:37 GMT