The feedforward neural network is the simplest type of artificial neural network, and it has many applications in machine learning. It was the first type of neural network ever created, and a firm understanding of it can help you understand more complicated architectures such as convolutional or recurrent neural nets. This article is inspired by Andrew Ng's Deep Learning Specialization on Coursera, and I have used a similar notation to describe the neural net architecture and the related mathematical equations. The course is a very good online resource for starting to learn about neural nets, but since it was created for a broad audience, some of the mathematical details have been omitted. In this article, I will try to derive all the mathematical equations that describe the feedforward neural net.

Currently Medium supports superscripts only for numbers, and it has no support for subscripts. So to write the names of the variables, I use this notation: every character after ^ is a superscript character, and every character after _ (and before ^ if it's present) is a subscript character.

A neuron is the foundational unit of our brain. The brain is estimated to have around 100 billion neurons, and this massive biological network enables us to think and perceive the world around us. A neuron receives information from other neurons, processes this information, and sends the result to other neurons. This process is shown in Figure 1. A single neuron has some inputs, which are received through the dendrites. These inputs are summed together in the cell body and transformed into a signal that is sent to other neurons through the axon. The axon is connected to the dendrites of other neurons by synapses. A synapse can act as a weight, making the signal that passes through it stronger or weaker based on how often that connection is used.
This biological understanding of the neuron can be translated into a mathematical model as shown in Figure 1.

There are different activation functions that you can use in a neural net, and some of the more commonly used ones are discussed below.

A binary step function is a threshold-based activation function. If the function's input (z) is less than or equal to zero, the output of the neuron is 0, and if it is above zero, the output is 1. The step function is not differentiable at z = 0, and its derivative is zero at all other points. Figure 1 shows a plot of the step function and its derivative. This function is shown in Figure 2 (left). The use of the prime for g signifies differentiation with respect to the argument, which is z here.

The sigmoid is a non-linear activation function that gives a continuous output in the range of 0 to 1. It is similar to the step function, but it is continuous and avoids the jump in output values that exists in the step function.
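The two activation functions above can be sketched in a few lines of plain Python. This is only an illustrative sketch: the function names are my own, and since the sigmoid formula does not appear in the text above, I assume the standard logistic form g(z) = 1 / (1 + e^(-z)), whose derivative is g'(z) = g(z)(1 - g(z)).

```python
import math


def step(z):
    """Binary step: 0 when z <= 0, 1 when z > 0."""
    return 1.0 if z > 0 else 0.0


def step_prime(z):
    """Derivative of the step function: 0 everywhere except z = 0."""
    if z == 0:
        # The step function jumps at z = 0, so no derivative exists there.
        raise ValueError("step function is not differentiable at z = 0")
    return 0.0


def sigmoid(z):
    """Assumed standard logistic sigmoid, 1 / (1 + e^(-z))."""
    # Branch on the sign of z so math.exp never receives a large
    # positive argument (which would overflow).
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def sigmoid_prime(z):
    """Derivative of the sigmoid: g(z) * (1 - g(z))."""
    s = sigmoid(z)
    return s * (1.0 - s)


# Compare both functions and their derivatives at a few sample points.
for z in (-2.0, -0.5, 0.5, 2.0):
    print(z, step(z), step_prime(z), sigmoid(z), sigmoid_prime(z))
```

Because the derivative of the step function is zero wherever it exists, it carries no information about how to adjust z, whereas the sigmoid has a non-zero derivative everywhere.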
How You Can Train Your Own Neural Nets (Weights & Biases)

How do neural networks learn?
Take a whirlwind tour of neural network architectures
Train neural networks
Optimize a neural network to achieve SOTA performance

The Code: bit.ly/keras-neural-nets

Basic Neural Network Architecture

The number of input neurons is the number of features your neural network uses to make its predictions. The input vector needs one input neuron per feature. You want to carefully select these features and remove any that may contain patterns that won't generalize beyond the training set (and cause overfitting).
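The rule of one input neuron per feature can be sketched in plain Python as follows. This is a minimal illustration, not the code from the linked notebook: the helper names `init_input_layer` and `weighted_sums`, the hidden-layer size, and the small random initial weights are all assumptions made for the example.

```python
import random


def init_input_layer(samples, n_hidden, seed=0):
    """Size the first weight matrix from the data (hypothetical helper).

    The number of input neurons equals the number of features per sample.
    """
    n_features = len(samples[0])
    if any(len(row) != n_features for row in samples):
        raise ValueError("every sample must have the same number of features")
    rng = random.Random(seed)
    # weights[j][i] connects input neuron i to hidden neuron j.
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_features)]
               for _ in range(n_hidden)]
    biases = [0.0] * n_hidden
    return n_features, weights, biases


def weighted_sums(x, weights, biases):
    """Compute z_j = sum_i(w_ji * x_i) + b_j for every hidden neuron j."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]


# Two samples with three features each, so the net gets three input neurons.
samples = [[5.1, 3.5, 1.4], [4.9, 3.0, 1.3]]
n_inputs, W, b = init_input_layer(samples, n_hidden=4)
print(n_inputs, len(W), len(W[0]))
print(weighted_sums(samples[0], W, b))
```

Dropping a feature from the data shrinks every row of the weight matrix by one entry, which is why feature selection directly changes the size of the input layer.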