Developed back in the 1950s by Rosenblatt and colleagues, this extremely simple algorithm can be viewed as the foundation for some of the most successful classifiers today, including support vector machines and logistic regression solved using stochastic gradient descent. The convergence proof for the Perceptron algorithm is one of the most elegant pieces of math I've seen in ML. Most useful: Boosting, especially boosted decision trees. This intuitive approach lets you build highly accurate ML models by combining many simple ones. Boosting is one of the most practical methods in ML: it's widely used in industry, can handle a wide variety of data types, and can be implemented at scale.
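The Perceptron update rule itself fits in a few lines. Here is a minimal NumPy sketch; the toy data, epoch count, and variable names are my own choices for illustration, not part of the original algorithm's presentation:

```python
import numpy as np

# Toy linearly separable data with labels in {-1, +1} (illustrative only)
X = np.array([[2.0, 1.0], [1.0, 3.0], [-1.0, -2.0], [-2.0, -1.0]])
y = np.array([1, 1, -1, -1])

w = np.zeros(2)
b = 0.0

# Perceptron rule: on a mistake, nudge the weights toward the example
for _ in range(10):  # a fixed number of passes over the data
    for xi, yi in zip(X, y):
        if yi * (np.dot(w, xi) + b) <= 0:  # misclassified (or on the boundary)
            w += yi * xi
            b += yi

predictions = np.sign(X @ w + b)
print(predictions)  # matches y once the data has been separated
```

On linearly separable data like this, the convergence proof mentioned above guarantees the loop stops making updates after a bounded number of mistakes.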

Quantum computing has received significant attention as a next-generation computing technology due to its potential speed and ability to solve problems considered too difficult for classical computers, as reflected in the recent discussion on Quantum Supremacy. Grid sees quantum computing not only as a tool for solving optimization and quantum chemical computation problems, but also as a tool for AI (Machine Learning, Deep Learning, etc.) calculations, such as feature extraction. Previous works have announced the successful implementation of machine learning-related algorithms, such as principal component analysis and auto-encoders, on quantum computers. This work announces the development of a gradient descent (backpropagation) algorithm, a method commonly used in machine learning for neural network parameter optimization, for use on NISQ quantum computers. Due to the non-linear nature of quantum bits (qubits), Grid proposes that this algorithm can be used to perform the feature extraction and representation calculations that deep learning methods employ.
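For context, the classical gradient descent update that such work adapts to quantum hardware can be sketched in a few lines. The quadratic loss, learning rate, and iteration count below are illustrative assumptions of mine, not Grid's actual NISQ algorithm, which replaces classical gradient evaluation with quantum circuit measurements:

```python
# Classical gradient descent on a simple one-parameter loss f(w) = (w - 3)^2.
# The minimum is at w = 3; each step moves against the gradient.
def grad(w):
    return 2.0 * (w - 3.0)

w = 0.0       # initial parameter (arbitrary starting point)
lr = 0.1      # learning rate (illustrative choice)
for _ in range(100):
    w -= lr * grad(w)

print(w)  # converges toward the minimum at w = 3
```

Backpropagation applies this same update to every weight of a neural network, using the chain rule to compute each gradient.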

This article is intended for beginners in deep learning who wish to gain knowledge of probability and statistics, and also as a reference for practitioners. In my previous article, I wrote about the concepts of linear algebra for deep learning in a top-down approach (link for the article). If you are not yet comfortable with linear algebra, please read that first. The same top-down approach is used here: the use cases are described first, followed by the concepts. All the example code uses Python and NumPy. Formulas are provided as images for reuse.

Probability is the science of quantifying uncertain things. Most machine learning and deep learning systems utilize a lot of data to learn about patterns in that data. Whenever data is utilized in a system rather than logic alone, uncertainty grows, and whenever uncertainty grows, probability becomes relevant. By introducing probability to a deep learning system, we introduce common sense to the system; otherwise the system would be very brittle and not be useful. In deep learning, several models, such as Bayesian models, probabilistic graphical models, and hidden Markov models, are used. They depend entirely on probability concepts.
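As a first taste of the top-down style used here, this short NumPy sketch shows probability as a long-run frequency estimated from data; the coin bias of 0.7, sample size, and seed are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate 10,000 flips of a biased coin with P(heads) = 0.7,
# then estimate that probability from the data alone.
flips = rng.random(10_000) < 0.7  # boolean array: True = heads
estimate = flips.mean()           # fraction of heads observed
print(estimate)  # close to 0.7
```

The gap between the estimate and the true 0.7 is exactly the kind of uncertainty that probability lets a learning system quantify instead of ignore.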