Beginners Tutorial on XGBoost and Parameter Tuning in R

#artificialintelligence

Last week, we learned about Random Forest Algorithm. Now we know it helps us reduce a model's variance by building models on resampled data and thereby increases its generalization capability.


Classification with Scikit-Learn

#artificialintelligence

For python programmers, scikit-learn is one of the best libraries to build Machine Learning applications with. It is ideal for beginners because it has a really simple interface, it is well documented with many examples and tutorials. Besides supervised machine learning (classification and regression), it can also be used for clustering, dimensionality reduction, feature extraction and engineering, and pre-processing the data. The interface is consistent over all of these methods, so it is not only easy to use, but it is also easy to construct a large ensemble of classifiers/regression models and train them with the same commands. In this blog lets have a look at how to build, train, evaluate and validate a classifier with scikit-learn and in this way get familiar with the scikit-learn library.


Data Science with Python: Exploratory Analysis with Movie-Ratings and Fraud Detection with Credit-Card Transactions

@machinelearnbot

The following problems are taken from the projects / assignments in the edX course Python for Data Science (UCSanDiagoX) and the coursera course Applied Machine Learning in Python (UMich).


Building an Employee Churn Model in Python to Develop a Strategic Retention Plan

#artificialintelligence

Employee turn-over (also known as "employee churn") is a costly problem for companies. The true cost of replacing an employee can often be quite large. A study by the Center for American Progress found that companies typically pay about one-fifth of an employee's salary to replace that employee, and the cost can significantly increase if executives or highest-paid employees are to be replaced. In other words, the cost of replacing employees for most employers remains significant. This is due to the amount of time spent to interview and find a replacement, sign-on bonuses, and the loss of productivity for several months while the new employee gets accustomed to the new role.


Predicting the Higgs-Boson Signal

@machinelearnbot

They are currently in the NYC Data Science Academy 12 week full time Data Science Bootcamp program taking place between January 11th to April 1st, 2016. This post is based on their fourth class project - Machine learning(due on the 8th week of the program).