Currently I'm trying to solve a regression problem with a deep neural network. I have run into some problems and couldn't find suitable answers on Google, so I'm here to ask for your ideas. My data samples are in [0, 1]^N. Because of this range, I had to use sigmoid (or tanh) as the activation of the output layer (the other layers' activations are all ReLU). However, this causes a serious problem: vanishing gradients.
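To make the vanishing-gradient issue concrete, here is a minimal NumPy sketch of how the sigmoid's derivative shrinks as the pre-activation of the output unit moves away from zero. The pre-activation values in `z` are made up for illustration and do not come from my actual network.

```python
import numpy as np

def sigmoid(z):
    """Logistic sigmoid, maps any real number into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_grad(z):
    """Derivative of the sigmoid: s(z) * (1 - s(z))."""
    s = sigmoid(z)
    return s * (1.0 - s)

# Hypothetical pre-activation values of the output unit (assumed, not measured).
z = np.array([0.0, 2.0, 5.0, 10.0])
grads = sigmoid_grad(z)

for zi, gi in zip(z, grads):
    print(f"z = {zi:5.1f}   sigmoid'(z) = {gi:.6f}")
```

The derivative peaks at 0.25 for z = 0 and decays towards zero as |z| grows, so any gradient flowing back through a saturated sigmoid output gets multiplied by a very small number.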

Neural networks were developed to mimic the human brain. Though we are not there yet, neural networks are very effective in machine learning. They were popular in the 1980s and 1990s and have recently become popular again, probably because computers are now fast enough to run a large neural network in a reasonable time.

I find it hard to get detailed, step-by-step explanations of neural networks in one place. Some part of the explanation was always missing in the courses or videos. So I tried to gather all the information and explanations in one blog post, step by step. I have split this post into 8 sections, as I find that most relevant. An artificial neural network is a computing system inspired by the biological neural networks that constitute animal brains.

Now, it's clear that if we use a linear activation function (the identity activation function), the neural network's output is just a linear function of its input. This loses much of the representational power of the network, because the output we are trying to predict often has a non-linear relationship with the inputs. It can be shown that if we use a linear activation function for the hidden layer and a sigmoid function for the output layer, our model becomes a logistic regression model. Because a composition of two linear functions is itself a linear function, the range of problems such a network can handle shrinks considerably. A rare example of its use is solving a regression problem in machine learning (where we use a linear activation function in the hidden layer).
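The collapse of stacked linear layers can be checked numerically. The sketch below is only an illustration: the layer sizes and the random weights `W1`, `b1`, `W2`, `b2` are assumptions chosen for the example, not part of any particular model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 3 inputs, 4 hidden units, 2 outputs.
W1, b1 = rng.normal(size=(4, 3)), rng.normal(size=4)
W2, b2 = rng.normal(size=(2, 4)), rng.normal(size=2)

def two_layer_linear(x):
    """Two layers whose hidden activation is the identity function."""
    h = W1 @ x + b1          # hidden layer, linear activation
    return W2 @ h + b2       # output layer, linear activation

# Equivalent single layer: W2 (W1 x + b1) + b2 = (W2 W1) x + (W2 b1 + b2)
W = W2 @ W1
b = W2 @ b1 + b2

def single_layer(x):
    """One linear layer with the merged weights and bias."""
    return W @ x + b

x = rng.normal(size=3)
collapsed_matches = np.allclose(two_layer_linear(x), single_layer(x))
print(collapsed_matches)
```

Since the two functions agree for any input, the hidden layer adds no representational power. Wrapping the output in a sigmoid would give exactly the form of logistic regression.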
