
Ensemble Learning


Credit Card Fraud Detection

#artificialintelligence

Fraud detection is a critical step in any risk management process, since catching fraud early helps prevent its recurrence. High volumes of fraud can damage both revenue and reputation. Fortunately, it is possible to detect fraud before serious harm is done. In this article, I investigate the performance of several machine learning algorithms on a credit card fraud dataset containing transactions made by European cardholders in September 2013.
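A minimal sketch of the kind of experiment described above, using a synthetic imbalanced dataset as a stand-in for the actual September 2013 cardholder data, and a random forest as one candidate algorithm:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a credit card dataset: ~1% of transactions are fraud.
X, y = make_classification(n_samples=5000, n_features=20, weights=[0.99],
                           random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=42)

clf = RandomForestClassifier(n_estimators=100, random_state=42)
clf.fit(X_train, y_train)

# Accuracy is misleading on imbalanced data; report per-class precision/recall.
print(classification_report(y_test, clf.predict(X_test)))
```

Because fraud is rare, per-class metrics (precision and recall on the fraud class) matter far more than overall accuracy here.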


From Decision Trees and Random Forests to Gradient Boosting

#artificialintelligence

Suppose we wish to perform supervised learning on a classification problem: determining whether an incoming email is spam or not. The spam dataset consists of 4601 emails, each labelled as not spam (0) or spam (1). The data also contains a large number of predictors (57), each of which is either a character count or the frequency of occurrence of a certain word or symbol. In this short article, we will briefly cover the main concepts in tree-based classification and compare and contrast the most popular methods. This dataset and several worked examples are covered in detail in The Elements of Statistical Learning, 2nd edition.
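A toy version of this setup, with two hypothetical word/symbol-frequency predictors standing in for the spam dataset's 57:

```python
from sklearn.tree import DecisionTreeClassifier

# Toy stand-in for the spam data: each row holds word/symbol frequencies
# (here just two hypothetical predictors, e.g. freq of "free" and of "$").
X = [[0.0, 0.0], [0.1, 0.0], [2.0, 1.5], [1.8, 0.9],
     [0.0, 0.1], [2.2, 1.1], [0.2, 0.0], [1.9, 1.4]]
y = [0, 0, 1, 1, 0, 1, 0, 1]   # 0 = not spam, 1 = spam

tree = DecisionTreeClassifier(max_depth=2, random_state=0)
tree.fit(X, y)
print(tree.predict([[2.0, 1.0], [0.05, 0.0]]))  # -> [1 0]
```

The tree learns a threshold on the frequencies, which is exactly the kind of axis-aligned split that random forests and boosting methods then combine in large numbers.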


Decision Trees, Random Forests, AdaBoost & XGBoost in Python

#artificialintelligence

You're looking for a complete decision tree course that teaches you everything you need to create a decision tree, random forest, or XGBoost model in Python, right? You've found the right decision trees and tree-based advanced techniques course! How will this course help you? A verifiable certificate of completion is presented to all students who complete this advanced machine learning course. Whether you are a business manager, an executive, or a student who wants to learn and apply machine learning to real-world business problems, this course will give you a solid base by teaching you some of the advanced techniques of machine learning: decision trees, random forests, bagging, AdaBoost, and XGBoost.


Imbalanced-learn: Handling imbalanced class problem

#artificialintelligence

In the previous article, we went through different methods of dealing with imbalanced data. In this article, let us look at how to use the imbalanced-learn library to deal with imbalanced class problems. We will make use of the PyCaret library and UCI's default-of-credit-card-clients dataset, which is built into PyCaret. Imbalanced-learn is a Python package that provides a number of re-sampling techniques for the class imbalance problems commonly encountered in classification tasks. Note that imbalanced-learn is compatible with scikit-learn and is part of the scikit-learn-contrib projects.
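To illustrate the idea behind the simplest of those re-sampling techniques, here is a hand-rolled random oversampler. Imbalanced-learn packages this up as `RandomOverSampler`, alongside smarter methods such as SMOTE; the snippet below is only a sketch of the underlying idea, not the library's implementation:

```python
import random

def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen minority-class rows until classes balance."""
    rng = random.Random(seed)
    by_class = {}
    for xi, yi in zip(X, y):
        by_class.setdefault(yi, []).append(xi)
    target = max(len(rows) for rows in by_class.values())
    X_res, y_res = [], []
    for label, rows in by_class.items():
        resampled = rows + [rng.choice(rows) for _ in range(target - len(rows))]
        X_res.extend(resampled)
        y_res.extend([label] * target)
    return X_res, y_res

X = [[0], [1], [2], [3], [4], [5], [6], [7], [8], [9]]
y = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]          # 8-vs-2 imbalance
X_res, y_res = random_oversample(X, y)
print(y_res.count(0), y_res.count(1))        # -> 8 8
```

The library's samplers follow scikit-learn's `fit_resample(X, y)` convention, so they drop straight into scikit-learn (and imbalanced-learn's own) pipelines.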


Complete Guide To XGBoost With Implementation In R

#artificialintelligence

In recent times, ensemble techniques have become popular among data scientists and enthusiasts. Random Forest and Gradient Boosting algorithms long dominated data science competitions and hackathons, but over the last few years XGBoost has outperformed other algorithms on problems involving structured data. Apart from its predictive performance, XGBoost is also recognized for its speed, accuracy, and scalability. XGBoost is built on the framework of gradient boosting, and like other boosting algorithms, it uses decision trees as the base learners of its ensemble.
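The article's examples are in R; as a rough Python sketch of the same gradient boosting framework that XGBoost builds on (using scikit-learn's implementation rather than the xgboost package itself):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)  # a structured tabular dataset
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Shallow trees added sequentially, each correcting the current ensemble.
gbm = GradientBoostingClassifier(n_estimators=200, learning_rate=0.1,
                                 max_depth=3, random_state=0)
gbm.fit(X_train, y_train)
print(f"test accuracy: {gbm.score(X_test, y_test):.3f}")
```

XGBoost exposes an almost identical scikit-learn-style API (`xgboost.XGBClassifier`), adding regularization terms and systems-level optimizations on top of this framework.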


Pay as you go machine learning inference with AWS Lambda

#artificialintelligence

This post is courtesy of Eitan Sela, Senior Startup Solutions Architect. Many customers want to deploy machine learning models for real-time inference and pay only for what they use. Using Amazon EC2 instances for real-time inference may not be cost-effective when inference requests arrive only sporadically throughout the day. AWS Lambda is a serverless compute service with pay-per-use billing. However, ML frameworks like XGBoost are too large to fit into the 250 MB application artifact size limit or the 512 MB /tmp space limit.
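A pay-per-use inference setup of this kind ultimately reduces to a small handler function. The sketch below is hypothetical (the event shape, weight values, and names are assumptions, not the post's actual code); the key pattern is loading the model once outside the handler so warm invocations reuse it:

```python
import json

# Hypothetical: "model" loaded once at cold start so warm invocations reuse it.
# In the real post, a large framework like XGBoost would be loaded here, which
# is exactly what runs up against Lambda's package and /tmp size limits.
_WEIGHTS = [0.4, -0.2, 0.1]   # stand-in for a trained model


def handler(event, context=None):
    """Score one transaction passed in the Lambda event body."""
    features = json.loads(event["body"])["features"]
    score = sum(w * x for w, x in zip(_WEIGHTS, features))
    return {"statusCode": 200, "body": json.dumps({"score": score})}


# Local smoke test mimicking an API Gateway-style event.
resp = handler({"body": json.dumps({"features": [1.0, 2.0, 3.0]})})
print(resp["body"])
```

With this shape, you are billed only for the milliseconds each invocation runs, which is the cost model the post contrasts with an always-on EC2 instance.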


Fast Gradient Boosting with CatBoost

#artificialintelligence

In gradient boosting, predictions are made by an ensemble of weak learners. Unlike a random forest, which builds each decision tree independently on a bootstrapped sample of the data, gradient boosting builds trees one after the other. Previous trees in the model are not altered; instead, the errors of the ensemble so far are used to fit the next tree. In this piece, we'll take a closer look at a gradient boosting library called CatBoost.
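The sequential tree-building described above can be sketched in a few lines. This is a toy regression version using scikit-learn trees to show the mechanics, not CatBoost's actual algorithm:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.1, size=200)

learning_rate = 0.1
pred = np.zeros_like(y)          # start from a zero prediction
trees = []
for _ in range(100):
    residual = y - pred          # errors of the ensemble so far
    tree = DecisionTreeRegressor(max_depth=2)
    tree.fit(X, residual)        # each new tree fits the current residuals
    pred += learning_rate * tree.predict(X)   # earlier trees stay unchanged
    trees.append(tree)

print(f"training MSE: {np.mean((y - pred) ** 2):.4f}")
```

Each tree only corrects what the existing ensemble still gets wrong; earlier trees are never revisited, which is the defining contrast with a random forest's independent trees.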


Random Forests

#artificialintelligence

Random forests can also be used to identify likely fraudulent transactions. For example, each transaction in a bank has a series of features such as the deviation from the mean transaction volume of the customer, the time of day, the location, and how these values differ from that customer's usual habits. This allows a bank to build a sophisticated model to predict the likelihood of a given transaction being fraudulent. If the probability of fraud exceeds a threshold, such as 50%, the bank can take action, such as freezing the card.
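The thresholding step described above (take action when the model's fraud probability exceeds 50%) looks roughly like this in scikit-learn, with synthetic data standing in for real transaction features:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic transactions: columns might represent deviation from mean volume,
# time of day, distance from the customer's usual location, etc.
X, y = make_classification(n_samples=2000, n_features=6, weights=[0.97],
                           random_state=1)
forest = RandomForestClassifier(n_estimators=100, random_state=1).fit(X, y)

fraud_prob = forest.predict_proba(X)[:, 1]   # P(fraud) per transaction
flagged = fraud_prob > 0.5                   # act above the 50% threshold
print(f"{flagged.sum()} of {len(X)} transactions flagged")
```

In practice the threshold is a business decision: lowering it catches more fraud at the cost of freezing more legitimate cards.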


Random Forests Classifiers in Python

#artificialintelligence

If you are not yet familiar with tree-based models in machine learning, you should take a look at our R course on the subject. Let's understand the algorithm in layman's terms. Suppose you want to go on a trip to a place you will enjoy. How do you find a place you will like? You can search online, read reviews on travel blogs and portals, or ask your friends.
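Asking many friends and going with the consensus is essentially what a random forest does at prediction time: each tree gets a vote, and the majority wins. A bare-bones sketch of that aggregation step (plain Python, not the scikit-learn implementation):

```python
from collections import Counter

def majority_vote(predictions):
    """Aggregate one prediction per 'friend' (tree) into a final answer."""
    return Counter(predictions).most_common(1)[0][0]

# Five trees each recommend a destination for the same input.
votes = ["beach", "beach", "mountains", "beach", "city"]
print(majority_vote(votes))  # -> beach
```

Because each tree sees a different bootstrap sample and feature subset, the "friends" give diverse opinions, and the vote averages away their individual mistakes.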