A typical question asked by a beginner, when facing a wide variety of machine learning algorithms, is "which algorithm should I use?" Even an experienced data scientist cannot tell which algorithm will perform best without trying several of them. We are not advocating a one-and-done approach, but we do hope to provide some guidance on which algorithms to try first, depending on a few clear factors.
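
To make "trying different algorithms" concrete, here is a minimal sketch in plain Python. It is not taken from the original article: the synthetic two-cluster dataset, the three toy classifiers, and every function name are assumptions chosen for illustration. It scores each candidate with k-fold cross-validation and ranks them by accuracy.

```python
import random


def make_blobs(n_per_class=50, seed=0):
    """Synthetic data (an assumption): two 2-D Gaussian clusters labelled 0 and 1."""
    rng = random.Random(seed)
    data = []
    for label, center in enumerate([(0.0, 0.0), (4.0, 4.0)]):
        for _ in range(n_per_class):
            point = (rng.gauss(center[0], 1.0), rng.gauss(center[1], 1.0))
            data.append((point, label))
    rng.shuffle(data)
    return data


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Each "algorithm" takes training data and returns a predict(point) function.

def majority_baseline(train):
    """Always predict the most common training label."""
    labels = [y for _, y in train]
    winner = max(set(labels), key=labels.count)
    return lambda point: winner


def nearest_centroid(train):
    """Predict the label whose class mean is closest to the point."""
    centroids = {}
    for label in {y for _, y in train}:
        pts = [p for p, y in train if y == label]
        centroids[label] = tuple(sum(c) / len(pts) for c in zip(*pts))
    return lambda point: min(centroids, key=lambda lab: sq_dist(point, centroids[lab]))


def one_nearest_neighbor(train):
    """Predict the label of the single closest training point."""
    return lambda point: min(train, key=lambda item: sq_dist(point, item[0]))[1]


def cross_val_accuracy(fit, data, k=5):
    """Mean accuracy over k folds; each fold is held out exactly once."""
    folds = [data[i::k] for i in range(k)]
    scores = []
    for i in range(k):
        test = folds[i]
        train = [item for j, fold in enumerate(folds) if j != i for item in fold]
        predict = fit(train)
        correct = sum(predict(p) == y for p, y in test)
        scores.append(correct / len(test))
    return sum(scores) / k


def compare(candidates, data, k=5):
    """Return (name, accuracy) pairs sorted from best to worst."""
    results = {name: cross_val_accuracy(fit, data, k) for name, fit in candidates.items()}
    return sorted(results.items(), key=lambda kv: kv[1], reverse=True)


CANDIDATES = {
    "majority baseline": majority_baseline,
    "nearest centroid": nearest_centroid,
    "1-nearest neighbor": one_nearest_neighbor,
}

if __name__ == "__main__":
    for name, acc in compare(CANDIDATES, make_blobs()):
        print(f"{name}: {acc:.3f}")
```

In practice a library such as scikit-learn would replace these hand-rolled pieces, but the pattern stays the same: hold the data and the evaluation procedure fixed, swap the algorithm, and let the scores decide.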

This was the subject of a question asked on Quora: "What are the top 10 data mining or machine learning algorithms?" Some modern techniques, such as collaborative filtering, recommendation engines, segmentation, and attribution modeling, are missing from the lists below. Algorithms from graph theory (to find the shortest path in a graph, or to detect connected components), from operations research (the simplex method, to optimize the supply chain), and from time series analysis are not listed either. Nor could I find MCMC (Markov Chain Monte Carlo) and related algorithms used to process hierarchical, spatio-temporal, and other Bayesian models. My point of view is of course biased, but I would also like to add some algorithms developed or re-developed at Data Science Central's research lab. These algorithms are described in the article "What you won't learn in statistics classes".

One question always pops up in any machine learning problem: which algorithm should I use? And what do the algorithms do, anyway? After briefly going over a typical machine learning process, we take a closer look at the third step, building the model: what algorithms are out there, and which one should we use? One of Microsoft's data scientists, Brandon Rohrer, has written a nice three-part blog series introducing data science with no jargon. Furthermore, there is a really neat cheat sheet, created by Microsoft's data science team, on when to use which algorithm. Finally, one last resource that I highly recommend: "Top 10 data mining algorithms in plain English".