Ensemble Learning

Machine Learning : Random Forest with Python from Scratch


Are you ready to start your path to becoming a Machine Learning expert? Are you ready to train your machine like a father trains his son? "A breakthrough in Machine Learning would be worth ten Microsofts." -Bill Gates. There are lots of courses and lectures out there regarding random forest. After taking this course, the curtains of machine learning, and especially random forest, will be lifted for you. You'll be learning a state-of-the-art algorithm in detail with practical implementation.
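The course builds random forest from scratch; as a quick point of reference, here is a minimal sketch of the same model using scikit-learn (an assumption on my part — the course itself implements the algorithm without this library):

```python
# Minimal random forest sketch with scikit-learn on synthetic data.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# A random forest averages many decision trees, each fit on a bootstrap
# sample of the data with a random subset of features per split.
clf = RandomForestClassifier(n_estimators=100, random_state=42)
clf.fit(X_train, y_train)
print(clf.score(X_test, y_test))
```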

Fraud Detection with EvalML


Data analytics has had a great impact on the banking and financial services industry, for example by providing insights into global financial trends and enabling financial modelling. Fraud prevention and detection is one such application. This article applied predictive data analytics and supervised machine learning (ML) methods to card-not-present (CNP) fraud detection, and demonstrated modelling using EvalML, an automated machine learning library. This article also found that both Decision Tree (DT) and XGBoost models work better than linear models (LM), Random Forest (RF), and LightGBM models. The dataset used to demonstrate modelling is a large-scale dataset from Vesta which is available on Kaggle.

What's so special about CatBoost?


CatBoost is a gradient boosting technique developed by Yandex that outperforms many existing boosting algorithms such as XGBoost and Light GBM. While deep learning algorithms require lots of data and computational power, boosting algorithms are still needed for most business problems. However, boosting algorithms like XGBoost take hours to train, and sometimes you'll get frustrated…

Four interpretable algorithms that you should use in 2022


The new year has begun, and it is time for good resolutions. One of them could be to make decision-making processes more interpretable. To help you do this, I present four interpretable rule-based algorithms. These four algorithms share the use of an ensemble of decision trees as a rule generator (such as Random Forest, AdaBoost, or Gradient Boosting). In other words, each of these interpretable algorithms starts its process by fitting a black-box model and then generating an interpretable rule ensemble model from it.
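To illustrate the general idea (this is not any of the four algorithms specifically): a tree ensemble is fit first, and each tree in it is already a set of if/then rules that rule-ensemble methods can then select and simplify. With scikit-learn, the rules of an individual tree can be read off with `export_text`:

```python
# Sketch: fit a tree ensemble, then dump one member tree as if/then rules.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import export_text

X, y = load_iris(return_X_y=True)
forest = RandomForestClassifier(n_estimators=10, max_depth=2, random_state=0)
forest.fit(X, y)

# Each estimator is a decision tree; its splits form candidate rules
# that interpretable rule-ensemble methods prune and combine.
rules = export_text(forest.estimators_[0],
                    feature_names=["sepal_l", "sepal_w", "petal_l", "petal_w"])
print(rules)
```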

Introduction to Binary Classification with PyCaret - KDnuggets


PyCaret is an open-source, low-code machine learning library in Python that automates machine learning workflows. It is an end-to-end machine learning and model management tool that speeds up the experiment cycle and makes you more productive. Compared with other open-source machine learning libraries, PyCaret is an alternative low-code library that can replace hundreds of lines of code with only a few. This makes experiments fast and efficient. PyCaret is essentially a Python wrapper around several machine learning libraries and frameworks such as scikit-learn, XGBoost, LightGBM, CatBoost, spaCy, Optuna, Hyperopt, Ray, and a few more.

[100%OFF] Machine Learning & Deep Learning in Python & R


Learn how to solve real-life problems using Machine Learning techniques:
- Machine Learning models such as Linear Regression, Logistic Regression, KNN, etc.
- Advanced Machine Learning models such as Decision Trees, XGBoost, Random Forest, SVM, etc.
- Understanding of the basics of statistics and the concepts of Machine Learning
- How to do basic statistical operations and run ML models in Python
- In-depth knowledge of data collection and data preprocessing for Machine Learning problems
- How to convert a business problem into a Machine Learning problem

Can I get a certificate after completing the course? Are there any other coupons available for this course? Note: 100% OFF Udemy coupon codes are valid for a maximum of 3 days only. Look for the "ENROLL NOW" button at the end of the post. Disclosure: This post may contain affiliate links and we may get a small commission if you make a purchase.

House Price Prediction using a Random Forest Classifier


In this blog post, I will use machine learning and Python to predict house prices. I will use a Random Forest; since price is a continuous value, this is in fact Random Forest regression rather than classification, despite the "Classifier" in the title. In the end, I will demonstrate my Random Forest Python algorithm! There is no law except the law that there is no law. Data Science is about discovering hidden patterns (laws) in your data.
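A minimal sketch of the regression setup, using synthetic house data (the features and price formula below are illustrative assumptions, not the post's dataset):

```python
# Sketch: Random Forest *regression* for house prices — the target is a
# continuous value, so RandomForestRegressor is used, not a classifier.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 500
area = rng.uniform(50, 250, n)                      # square metres
rooms = rng.integers(1, 6, n)
price = 2000 * area + 15000 * rooms + rng.normal(0, 10000, n)

X = np.column_stack([area, rooms])
X_train, X_test, y_train, y_test = train_test_split(X, price, random_state=0)

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_train, y_train)
print(round(model.score(X_test, y_test), 3))        # R^2 on held-out data
```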

Parallel XGBoost with Dask in Python


Out of the box, XGBoost cannot be trained on datasets larger than your computer's memory; Python will throw a MemoryError. This tutorial will show you how to go beyond your local machine's limitations by leveraging distributed XGBoost with Dask, with only minor changes to your existing code. Here is the code we will use if you want to jump right in. By default, XGBoost trains models on a single machine. This is fine for basic projects, but as the size of your dataset and/or ML model grows, you may want to consider running XGBoost in distributed mode with Dask to speed up computations and reduce the burden on your local machine.



XGBoost stands for "Extreme Gradient Boosting". It is a decision-tree-based machine learning algorithm that uses the gradient boosting framework. Decision trees are structures consisting of leaf nodes, internal nodes, and branches. Each leaf node represents a class label, each internal node represents a test on an attribute, and the branches connect the internal nodes to the leaves.