Review for NeurIPS paper: Rational neural networks

Neural Information Processing Systems 

Additional Feedback: This work proposes a new activation function for deep learning architectures and provides a theoretical study of its complexity. The paper is well written and highly readable for most readers in the data mining community. However, the article would be significantly improved if the issues concerning its motivation, technical analysis, and experiments were addressed. Detailed comments follow.

1) Motivation: The paper proposes rational activation functions as an alternative to ReLU, potentially avoiding the vanishing gradient problem.
* The problem raised in this paper, i.e., that some existing activation functions (e.g., sigmoid, logistic) can only handle smooth signals, is a significant problem in deep neural network optimization, since their derivatives are close to zero for large input values. A low degree can save time, but is there a better configuration, and why was this type chosen?
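To make the saturation point concrete, here is a minimal sketch comparing the derivative of a sigmoid with that of a low-degree rational function (numerator of degree 3, denominator of degree 2). The coefficients `P` and `Q` are hypothetical, chosen only for illustration, and are not taken from the paper; the helper names are likewise assumptions.

```python
import math

def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)

# Hypothetical coefficients, lowest degree first (NOT the paper's values).
P = [0.0, 0.5, 0.5, 0.1]  # numerator, degree 3
Q = [1.0, 0.0, 0.2]       # denominator, degree 2; 1 + 0.2*x^2 has no real roots

def polyval(coeffs, x):
    # Horner evaluation of a polynomial given lowest-degree-first coefficients.
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result

def polyder(coeffs):
    # Coefficients of the derivative polynomial.
    return [i * c for i, c in enumerate(coeffs)][1:]

def rational(x):
    return polyval(P, x) / polyval(Q, x)

def rational_grad(x):
    # Quotient rule: (P'Q - PQ') / Q^2.
    p, q = polyval(P, x), polyval(Q, x)
    dp, dq = polyval(polyder(P), x), polyval(polyder(Q), x)
    return (dp * q - p * dq) / (q * q)

for x in (0.0, 5.0, 20.0, 50.0):
    print(f"x={x:5.1f}  sigmoid'={sigmoid_grad(x):.3e}  rational'={rational_grad(x):.3e}")
```

Under these assumed coefficients, the sigmoid derivative decays toward zero as the input grows, while the rational derivative tends to the ratio of the leading coefficients (0.1 / 0.2). Whether this behaviour is what makes the degree choice preferable in practice is exactly the question the authors should address.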