Which is your favorite Machine Learning Algorithm?


Developed back in the 1950s by Rosenblatt and colleagues, the Perceptron is an extremely simple algorithm that can be viewed as the foundation for some of the most successful classifiers today, including support vector machines and logistic regression solved using stochastic gradient descent. The convergence proof for the Perceptron algorithm is one of the most elegant pieces of math I've seen in ML. Most useful: Boosting, especially boosted decision trees. This intuitive approach lets you build highly accurate ML models by combining many simple ones. Boosting is one of the most practical methods in ML: it is widely used in industry, can handle a wide variety of data types, and can be implemented at scale.
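To make the simplicity of the Perceptron concrete, here is a minimal sketch of the classic mistake-driven update rule in plain Python. The function names (`train_perceptron`, `predict`), the epoch cap, and the toy data are assumptions made for illustration; they do not come from the original article.

```python
def train_perceptron(samples, labels, epochs=100):
    """Mistake-driven perceptron training for labels in {-1, +1}."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, y in zip(samples, labels):
            activation = sum(w * xi for w, xi in zip(weights, x)) + bias
            if y * activation <= 0:  # misclassified, or exactly on the boundary
                # Nudge the separating hyperplane toward the correct side.
                weights = [w + y * xi for w, xi in zip(weights, x)]
                bias += y
                mistakes += 1
        if mistakes == 0:  # a full pass with no updates: training has converged
            break
    return weights, bias


def predict(weights, bias, x):
    """Classify x as +1 or -1 by the sign of the linear score."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if score > 0 else -1


# Toy, linearly separable data (logical AND with -1/+1 labels), made up for illustration.
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y = [-1, -1, -1, 1]
w, b = train_perceptron(X, y)
```

On linearly separable data like this, the loop stops after a finite number of updates, which is exactly what the convergence proof guarantees. On data that is not separable it simply runs until the epoch cap.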

A Brief History of Machine Learning - DATAVERSITY


Machine Learning (ML) is an important aspect of modern business and research. It uses algorithms and neural network models to help computer systems progressively improve their performance. Machine Learning algorithms automatically build a mathematical model from sample data, also known as "training data", to make decisions without being explicitly programmed to make those decisions. Machine Learning is, in part, based on a model of brain cell interaction. The model was created in 1949 by Donald Hebb in a book titled The Organization of Behavior.

How to Implement Any Machine Learning Project with 3 Lines of Code


Before anything else, make sure you understand the core components of neural networks and their mechanisms. There is a ton of online documentation about feedforward neural networks in their most basic form, so make sure to check it out. To demonstrate classification, I will model constructiveness in Amazon reviews (4,000 reviews) with Libra's neural network. I have worked on a similar task in a previous article and in my master's thesis (in which I describe the dataset in more detail), in case you are interested. Libra's neural network automatically calls a data reader, fits your data, evaluates performance on a subset of your data, and displays plots of the training process.
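To give a feel for what such a one-call pipeline does behind the scenes, the sketch below walks through the same steps (read the data, fit, evaluate on a held-out subset) in plain Python. It is not Libra's API or implementation: `run_pipeline` and its helpers are hypothetical names, a nearest-centroid classifier stands in for the neural network to keep the example short, the toy rows and their two numeric features are made up, and plotting is left out.

```python
import csv
import io
import random


def read_rows(text):
    """Data reader: parse CSV text into (features, label) pairs; the last column is the label."""
    rows = []
    for record in csv.reader(io.StringIO(text)):
        if not record:  # skip blank lines
            continue
        *features, label = record
        rows.append(([float(v) for v in features], label))
    return rows


def fit_centroids(rows):
    """Fit step: average the feature vectors of each class."""
    sums, counts = {}, {}
    for features, label in rows:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict(centroids, features):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    def distance(label):
        return sum((a - b) ** 2 for a, b in zip(centroids[label], features))
    return min(centroids, key=distance)


def run_pipeline(text, test_fraction=0.25, seed=0):
    """Read, shuffle, split, fit, and evaluate in one call; returns held-out accuracy."""
    rows = read_rows(text)
    random.Random(seed).shuffle(rows)
    n_test = max(1, int(len(rows) * test_fraction))
    test, train = rows[:n_test], rows[n_test:]
    centroids = fit_centroids(train)
    correct = sum(predict(centroids, f) == label for f, label in test)
    return correct / len(test)


# Made-up toy rows: two numeric features per review, then the label.
DATA = """0.1,0.3,not_constructive
0.4,0.2,not_constructive
0.2,0.5,not_constructive
0.3,0.1,not_constructive
5.1,4.8,constructive
4.7,5.2,constructive
5.3,5.0,constructive
4.9,4.6,constructive
"""

accuracy = run_pipeline(DATA)
```

The point of the sketch is the shape of the call: one function hides the reader, the split, the fit, and the evaluation, which is what makes a short, few-line workflow possible.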