This is a brief introduction for all the excited learners who want to know more about what we call the activation layer, a concept that took me quite a long time to understand on my journey into neural networks (NN). (A neural network is a network or circuit of neurons or, in the modern sense, an artificial neural network composed of artificial neurons or nodes.) The whole concept grew out of what is called feature scaling, a method used to normalize the range of the independent variables, or features, of the data. It is a machine learning technique usually used to fight against bias (a disproportionate weight in favor of or against an idea or thing, usually in a way that is closed-minded, prejudicial, or unfair). When you look at data through this technique, you will observe that a large-scale feature, or a feature whose distribution stands out from the others, has an extremely large mean and variance. Such a feature tends to dominate the computed result, because a lot of emphasis ends up being put on it.
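
To make the feature scaling idea concrete, here is a minimal sketch of one common form of it, standardization (z-scoring). The `standardize` helper and the toy `incomes` and `ages` columns are hypothetical names and values I made up for illustration, and standardization is only one of several scaling methods, so treat this as a sketch under those assumptions rather than the definitive way to do it.

```python
from statistics import fmean, pstdev


def standardize(column):
    """Rescale a list of numbers to zero mean and unit variance (z-scores).

    Hypothetical helper for illustration; not taken from any library.
    """
    mu = fmean(column)
    sigma = pstdev(column)  # population standard deviation
    if sigma == 0:
        # A constant feature carries no spread; map it to zeros
        # instead of dividing by zero.
        return [0.0 for _ in column]
    return [(x - mu) / sigma for x in column]


# Made-up toy data: two features on very different scales.
incomes = [30_000.0, 52_000.0, 75_000.0, 110_000.0]  # large-scale feature
ages = [22.0, 35.0, 47.0, 61.0]                      # small-scale feature

# Before scaling, the income column has a far larger mean and spread.
print("raw means:   ", fmean(incomes), fmean(ages))
print("raw std devs:", pstdev(incomes), pstdev(ages))

scaled_incomes = standardize(incomes)
scaled_ages = standardize(ages)

# After scaling, both columns are centred near 0 with a spread near 1,
# so neither one dominates purely because of its units.
print("scaled means:   ", fmean(scaled_incomes), fmean(scaled_ages))
print("scaled std devs:", pstdev(scaled_incomes), pstdev(scaled_ages))
```

Running it prints the mean and standard deviation of each column before and after scaling, which is a quick way to see how the large-scale feature stops outweighing the other one once both share the same range.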

