How to Set Up Distributed XGBoost on MapR-FS

#artificialintelligence 

XGBoost is a library that is designed for boosted (tree) algorithms. It has become a popular machine learning framework among data science practitioners, especially on Kaggle, which is a platform for data prediction competitions where researchers post their data and statisticians and data miners compete to produce the best models. For structured learning problems on Kaggle, it can be difficult to get into the top 10 without including XGBoost. Typically, data scientists use multi-thread single machines to train XGBoost models. Very few people have deployed XGBoost on a distributed environment and achieved good performance.