Ensemble Learning


Fast Gradient Boosting with CatBoost

#artificialintelligence

In gradient boosting, predictions are made from an ensemble of weak learners. Unlike a random forest, which builds each decision tree independently on its own bootstrap sample of the data, gradient boosting builds trees one after the other. Previous trees in the model are not altered. The errors left by the trees built so far are used to fit the next one. In this piece, we'll take a closer look at a gradient boosting library called CatBoost.
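The sequential idea described above can be sketched in plain Python. This is only an illustration under stated assumptions, not CatBoost's algorithm or API: it assumes squared-error loss, a single numeric feature with at least two distinct values, and depth-1 trees (stumps) as the weak learners. All function names are hypothetical.

```python
from statistics import mean


def fit_stump(xs, residuals):
    """Find the one-threshold split that best fits the residuals (least squares)."""
    best = None
    # Every distinct value except the largest is a candidate threshold,
    # so both sides of the split are always non-empty.
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_value, right_value = mean(left), mean(right)
        error = (sum((r - left_value) ** 2 for r in left)
                 + sum((r - right_value) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_value, right_value)
    return best[1:]


def predict_stump(stump, x):
    threshold, left_value, right_value = stump
    return left_value if x <= threshold else right_value


def fit_boosted_stumps(xs, ys, n_trees=50, learning_rate=0.1):
    """Add stumps one at a time; earlier stumps are never modified."""
    base = mean(ys)  # the initial prediction is the mean target
    stumps = []
    predictions = [base] * len(xs)
    for _ in range(n_trees):
        # For squared error, the residuals are the negative gradient of the loss.
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * predict_stump(stump, x)
                       for p, x in zip(predictions, xs)]
    return base, stumps, learning_rate


def predict(model, x):
    base, stumps, learning_rate = model
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)


def mse(model, xs, ys):
    return mean((predict(model, x) - y) ** 2 for x, y in zip(xs, ys))


# Made-up toy data, for illustration only.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1.2, 1.9, 3.1, 3.9, 5.2, 6.1, 6.8, 8.1]
model = fit_boosted_stumps(xs, ys)
print("training MSE:", mse(model, xs, ys))
```

Each new stump is fitted to what the current ensemble still gets wrong, which is the sense in which earlier results are used to improve the next tree.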


Wasserstein Random Forests and Applications in Heterogeneous Treatment Effects

arXiv.org Machine Learning

We present new insights into causal inference in the context of Heterogeneous Treatment Effects by proposing natural variants of Random Forests to estimate the key conditional distributions. To achieve this, we recast Breiman's original splitting criterion in terms of Wasserstein distances between empirical measures. This reformulation indicates that Random Forests are well adapted to estimate conditional distributions and provides a natural extension of the algorithm to multivariate outputs. Following the philosophy of Breiman's construction, we propose some variants of the splitting rule that are well-suited to the conditional distribution estimation problem. Some preliminary theoretical connections are established along with various numerical experiments, which show how our approach may help to conduct more transparent causal inference in complex situations.


Time Series Extrinsic Regression

arXiv.org Machine Learning

This paper studies Time Series Extrinsic Regression (TSER): a regression task whose aim is to learn the relationship between a time series and a continuous scalar variable; a task closely related to time series classification (TSC), which aims to learn the relationship between a time series and a categorical class label. This task generalizes time series forecasting (TSF), relaxing the requirement that the value predicted be a future value of the input series or primarily depend on more recent values. In this paper, we motivate and study this task, and benchmark existing solutions and adaptations of TSC algorithms on a novel archive of 19 TSER datasets which we have assembled. Our results show that the state-of-the-art TSC algorithm Rocket, when adapted for regression, achieves the highest overall accuracy compared to adaptations of other TSC algorithms and state-of-the-art machine learning (ML) algorithms such as XGBoost, Random Forest and Support Vector Regression. More importantly, we show that much research is needed in this field to improve the accuracy of ML models. We also find evidence that further research has excellent prospects of improving upon these straightforward baselines.


Causal Transfer Random Forest: Combining Logged Data and Randomized Experiments for Robust Prediction

arXiv.org Machine Learning

It is often critical for prediction models to be robust to distributional shifts between training and testing data. Viewed from a causal perspective, the challenge is to distinguish the stable causal relationships from the unstable spurious correlations across shifts. We describe a causal transfer random forest (CTRF) that combines existing training data with a small amount of data from a randomized experiment to train a model which is robust to the feature shifts and therefore transfers to a new targeting distribution. Theoretically, we justify the robustness of the approach against feature shifts with the knowledge from causal learning. Empirically, we evaluate the CTRF using both synthetic data experiments and real-world experiments in the Bing Ads platform, including a click prediction task and in the context of an end-to-end counterfactual optimization system. The proposed CTRF produces robust predictions and outperforms most baseline methods compared in the presence of feature shifts.


Random Forests

#artificialintelligence

Random forests can also be used to identify likely fraudulent transactions. For example, each transaction in a bank has a series of features such as the deviation from the mean transaction volume of the customer, the time of day, the location, and how these values differ from that customer's usual habits. This allows a bank to build a sophisticated model to predict the likelihood of a given transaction being fraudulent. If the probability of fraud exceeds a threshold, such as 50%, the bank can take action, such as freezing the card.
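The thresholding step can be sketched as follows. This is a minimal illustration that assumes the fraud probability is taken as the fraction of trees in the forest that vote "fraud"; the vote lists, transaction IDs, and function names are hypothetical, and no real forest is trained here.

```python
def fraud_probability(tree_votes):
    """Fraction of trees that voted 'fraud' (1) rather than 'legitimate' (0)."""
    return sum(tree_votes) / len(tree_votes)


def flag_transactions(votes_by_transaction, threshold=0.5):
    """Return the IDs of transactions whose fraud probability exceeds the threshold."""
    return [tx_id for tx_id, votes in votes_by_transaction.items()
            if fraud_probability(votes) > threshold]


# Hypothetical votes from a five-tree forest for three transactions.
votes = {
    "tx1": [0, 0, 1, 0, 0],
    "tx2": [1, 1, 1, 0, 1],
    "tx3": [1, 0, 1, 0, 0],
}
flagged = flag_transactions(votes)
print(flagged)
```

In practice the threshold is a business decision: a lower value catches more fraud but freezes more legitimate cards.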


Interpretable Machine Learning with an Ensemble of Gradient Boosting Machines

arXiv.org Machine Learning

A method for the local and global interpretation of a black-box model on the basis of the well-known generalized additive models is proposed. It can be viewed as an extension or a modification of the algorithm using the neural additive model. The method is based on using an ensemble of gradient boosting machines (GBMs) such that each GBM is learned on a single feature and produces a shape function of the feature. The ensemble is composed as a weighted sum of separate GBMs, resulting in a weighted sum of shape functions which form the generalized additive model. GBMs are built in parallel using randomized decision trees of depth 1, which provide a very simple architecture. Weights of GBMs as well as features are computed in each iteration of boosting by using the Lasso method and then updated by means of a specific smoothing procedure. In contrast to the neural additive model, the method provides weights of features in the explicit form, and it is simply trained. A lot of numerical experiments with an algorithm implementing the proposed method on synthetic and real datasets demonstrate its efficiency and properties for local and global interpretation.


Machine learning for the diagnosis of Parkinson's disease: A systematic review

arXiv.org Machine Learning

Diagnosis of Parkinson's disease (PD) is commonly based on medical observations and assessment of clinical signs, including the characterization of a variety of motor symptoms. However, traditional diagnostic approaches may suffer from subjectivity as they rely on the evaluation of movements that are sometimes subtle to human eyes and therefore difficult to classify, leading to possible misclassification. In the meantime, early non-motor symptoms of PD may be mild and can be caused by many other conditions. Therefore, these symptoms are often overlooked, making diagnosis of PD at an early stage challenging. To address these difficulties and to refine the diagnosis and assessment procedures of PD, machine learning methods have been implemented for the classification of PD and healthy controls or patients with similar clinical presentations (e.g., movement disorders or other Parkinsonian syndromes). To provide a comprehensive overview of data modalities and machine learning methods that have been used in the diagnosis and differential diagnosis of PD, in this study, we conducted a systematic literature review of studies published until February 14, 2020, using the PubMed and IEEE Xplore databases. A total of 209 studies were included, extracted for relevant information and presented in this systematic review, with an investigation of their aims, sources of data, types of data, machine learning methods and associated outcomes. These studies demonstrate a high potential for adaptation of machine learning methods and novel biomarkers in clinical decision making, leading to increasingly systematic, informed diagnosis of PD.


A Generalized Stacking for Implementing Ensembles of Gradient Boosting Machines

arXiv.org Machine Learning

The gradient boosting machine is a powerful tool for solving regression problems. In order to cope with its shortcomings, an approach for constructing ensembles of gradient boosting models is proposed. The main idea behind the approach is to use the stacking algorithm in order to learn a second-level meta-model which can be regarded as a model for implementing various ensembles of gradient boosting models. First, the linear regression of the gradient boosting models is considered as the simplest realization of the meta-model, under the condition that the linear model is differentiable with respect to its coefficients (weights). Then it is shown that the proposed approach can be simply extended to arbitrary differentiable combination models, for example, to neural networks which are differentiable and can implement arbitrary functions of gradient boosting models. Various numerical examples illustrate the proposed approach.
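The linear meta-model case can be illustrated with a small sketch. It is not the paper's algorithm: it assumes the base models are already trained and only learns the combination weights by gradient descent on squared error, which is possible because a linear model is differentiable in its weights. The base-model predictions below are made-up numbers, the function names are hypothetical, and a real stacking setup would fit the weights on held-out predictions rather than training predictions.

```python
def combine(weights, row):
    """Linear meta-model: weighted sum of the base models' predictions."""
    return sum(w * p for w, p in zip(weights, row))


def mse(weights, rows, targets):
    return sum((combine(weights, r) - y) ** 2
               for r, y in zip(rows, targets)) / len(targets)


def fit_meta_weights(rows, targets, learning_rate=0.01, n_steps=2000):
    """Learn combination weights by gradient descent on mean squared error."""
    n_models = len(rows[0])
    weights = [1.0 / n_models] * n_models  # start from a plain average
    for _ in range(n_steps):
        errors = [combine(weights, r) - y for r, y in zip(rows, targets)]
        # Gradient of the mean squared error with respect to each weight.
        grads = [2.0 * sum(e * r[j] for e, r in zip(errors, rows)) / len(rows)
                 for j in range(n_models)]
        weights = [w - learning_rate * g for w, g in zip(weights, grads)]
    return weights


# Each row holds the predictions of two hypothetical base models for one example.
rows = [[1.2, 0.5], [2.1, 1.5], [2.7, 2.5], [4.3, 3.5]]
targets = [1.0, 2.0, 3.0, 4.0]

initial = [0.5, 0.5]
learned = fit_meta_weights(rows, targets)
print("average-weights MSE:", mse(initial, rows, targets))
print("learned-weights MSE:", mse(learned, rows, targets))
```

Replacing `combine` with any other differentiable function of the base predictions gives the more general combination models the abstract mentions.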


Random Forests Classifiers in Python

#artificialintelligence

If you are not yet familiar with Tree-Based Models in Machine Learning, you should take a look at our R course on the subject. Let's understand the algorithm in layman's terms. Suppose you want to go on a trip and you would like to travel to a place which you will enjoy. So what do you do to find a place that you will like? You can search online, read reviews on travel blogs and portals, or you can also ask your friends.