AITopics | Ensemble Learning

Ensemble Learning

Ensemble methods use multiple learning algorithms to obtain better predictive performance than could be obtained from any of the constituent learning algorithms alone. (Wikipedia)

A Machine Learning Early Warning System: Multicenter Validation in Brazilian Hospitals

Kobylarz, Jhonatan, Santos, Henrique D. P. dos, Barletta, Felipe, da Silva, Mateus Cichelero, Vieira, Renata, Morales, Hugo M. P., Rocha, Cristian da Costa

arXiv.org Machine Learning, Jun-9-2020

Early recognition of clinical deterioration is one of the main steps for reducing inpatient morbidity and mortality. Identifying clinical deterioration in hospitals is challenging because of the intense daily routines of healthcare practitioners, the unconnected patient data stored in Electronic Health Records (EHRs), and the use of low-accuracy scores. Since hospital wards receive less attention than the Intensive Care Unit (ICU), we hypothesized that connecting a platform to a stream of EHR data would drastically improve awareness of dangerous situations and thus assist the healthcare team. By applying machine learning, the system can consider a patient's entire history, and high-performing predictive models enable an intelligent early warning system. In this work we used 121,089 medical encounters from six different hospitals and 7,540,389 data points, and we compared popular ward protocols with six different scalable machine learning methods (three classic machine learning models, logistic and probabilistic-based, and three gradient boosted models). The results showed an advantage in AUC (Area Under the Receiver Operating Characteristic Curve) of 25 percentage points for the best machine learning model compared to the current state-of-the-art protocols. This is shown by the generalization of the algorithm with leave-one-group-out (AUC of 0.949) and the robustness through cross-validation (AUC of 0.961). We also perform experiments to compare several window sizes to justify the use of five patient timestamps. A sample dataset, experiments, and code are available for replicability purposes.
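The abstract reports AUC under two evaluation schemes, leave-one-group-out and cross-validation. The sketch below is one way to illustrate the first scheme together with a pairwise AUC computation. It is not the authors' code: the function names `auc` and `leave_one_group_out`, the choice of hospitals as groups, and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict


def auc(labels, scores):
    """Area under the ROC curve by pairwise comparison; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def leave_one_group_out(groups):
    """Yield (held_out_group, train_indices, test_indices) for each group.

    Every sample of the held-out group (for example, one hospital) goes to
    the test set, so the model is always scored on a group it never saw.
    """
    by_group = defaultdict(list)
    for i, g in enumerate(groups):
        by_group[g].append(i)
    for g in sorted(by_group):
        test = by_group[g]
        train = [i for i in range(len(groups)) if groups[i] != g]
        yield g, train, test


if __name__ == "__main__":
    # Hypothetical toy data: group labels stand in for hospitals.
    hospitals = ["A", "A", "B", "B", "C", "C"]
    for held_out, train_idx, test_idx in leave_one_group_out(hospitals):
        print(held_out, train_idx, test_idx)
```

Ordinary k-fold cross-validation mixes samples from every group into each fold, whereas this split measures how well a model transfers to an entirely unseen group.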

artificial intelligence, deterioration, machine learning, (15 more...)

arXiv:2006.05514

Country:

South America > Brazil > Paraná > Curitiba (0.05)
Asia > Singapore (0.04)
South America > Brazil > Rio Grande do Sul > Porto Alegre (0.04)
Europe > Portugal (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.89)

Industry:

Health & Medicine > Health Care Providers & Services (1.00)
Health & Medicine > Health Care Technology > Medical Record (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.90)

Fair Bayesian Optimization

Perrone, Valerio, Donini, Michele, Kenthapadi, Krishnaram, Archambeau, Cédric

arXiv.org Machine Learning, Jun-9-2020

Given the increasing importance of machine learning (ML) in our lives, algorithmic fairness techniques have been proposed to mitigate biases that can be amplified by ML. Commonly, these specialized techniques apply to a single family of ML models and a specific definition of fairness, limiting their effectiveness in practice. We introduce a general constrained Bayesian optimization (BO) framework to optimize the performance of any ML model while enforcing one or multiple fairness constraints. BO is a global optimization method that has been successfully applied to automatically tune the hyperparameters of ML models. We apply BO with fairness constraints to a range of popular models, including random forests, gradient boosting, and neural networks, showing that we can obtain accurate and fair solutions by acting solely on the hyperparameters. We also show empirically that our approach is competitive with specialized techniques that explicitly enforce fairness constraints during training, and outperforms preprocessing methods that learn unbiased representations of the input data. Moreover, our method can be used in synergy with such specialized fairness techniques to tune their hyperparameters. Finally, we study the relationship between hyperparameters and fairness of the generated model. We observe a correlation between regularization and unbiased models, explaining why acting on the hyperparameters leads to ML models that generalize well and are fair.
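The constrained formulation in the abstract, maximizing model performance subject to a fairness constraint, can be illustrated with a small sketch. The abstract does not say which fairness definition or acquisition strategy the authors use, so everything here is an assumption: demographic parity is picked as an example constraint, a plain selection over already-evaluated candidates stands in for Bayesian optimization, and the names `demographic_parity_gap` and `best_feasible` are hypothetical.

```python
def demographic_parity_gap(predictions, sensitive):
    """Absolute difference in positive-prediction rate between groups 0 and 1.

    `predictions` and `sensitive` are equal-length sequences of 0/1 values.
    """
    rates = []
    for group in (0, 1):
        members = [p for p, s in zip(predictions, sensitive) if s == group]
        if not members:
            raise ValueError(f"no samples for sensitive group {group}")
        rates.append(sum(members) / len(members))
    return abs(rates[0] - rates[1])


def best_feasible(candidates, epsilon):
    """Return the most accurate candidate whose fairness gap is <= epsilon.

    Each candidate is a dict with keys 'params', 'accuracy' and 'gap'.
    Returns None when no candidate satisfies the constraint.
    """
    feasible = [c for c in candidates if c["gap"] <= epsilon]
    if not feasible:
        return None
    return max(feasible, key=lambda c: c["accuracy"])


if __name__ == "__main__":
    # Hypothetical evaluations of three hyperparameter settings.
    evaluated = [
        {"params": {"max_depth": 8}, "accuracy": 0.91, "gap": 0.20},
        {"params": {"max_depth": 3}, "accuracy": 0.88, "gap": 0.04},
        {"params": {"max_depth": 1}, "accuracy": 0.80, "gap": 0.01},
    ]
    print(best_feasible(evaluated, epsilon=0.05))
```

A Bayesian optimizer would choose which hyperparameters to evaluate next using surrogate models of both the objective and the constraint; the sketch only shows the final constrained selection step.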

artificial intelligence, machine learning, optimization problem, (16 more...)

arXiv:2006.05109

Genre: Research Report (1.00)

Industry: Information Technology > Services (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)

Towards an Argument Mining Pipeline Transforming Texts to Argument Graphs

Lenz, Mirko, Sahitaj, Premtim, Kallenberg, Sean, Coors, Christopher, Dumani, Lorik, Schenkel, Ralf, Bergmann, Ralph

arXiv.org Artificial Intelligence, Jun-8-2020

This paper targets the automated extraction of components of argumentative information and their relations from natural language text. Moreover, we address a current lack of systems to provide complete argumentative structure from arbitrary natural language text for general usage. We present an argument mining pipeline as a universally applicable approach for transforming German and English language texts to graph-based argument representations. We also introduce new methods for evaluating the results based on existing benchmark argument structures. Our results show that the generated argument graphs can be beneficial to detect new connections between different statements of an argumentative text. Our pipeline implementation is publicly available on GitHub.

artificial intelligence, machine learning, natural language, (20 more...)

arXiv:2006.04562

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > Germany (0.04)
Asia > China (0.04)
(10 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Government (0.68)
Media (0.46)
Health & Medicine (0.46)
Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Soft Gradient Boosting Machine

Feng, Ji, Xu, Yi-Xuan, Jiang, Yuan, Zhou, Zhi-Hua

arXiv.org Machine Learning, Jun-7-2020

Gradient Boosting Machine (GBM) has proven to be a successful function approximator and has been widely used in a variety of areas. However, since the base learners have to be trained in sequential order, it is infeasible to parallelize the training process among base learners for speed-up. In addition, under online or incremental learning settings, GBMs achieve sub-optimal performance because previously trained base learners cannot adapt to the environment once trained. In this work, we propose the soft Gradient Boosting Machine (sGBM), which wires multiple differentiable base learners together. By injecting both local and global objectives inspired by gradient boosting, all base learners can then be jointly optimized with linear speed-up. When differentiable soft decision trees are used as base learners, such a device can be regarded as an alternative version of the (hard) gradient boosting decision trees with extra benefits. Experimental results showed that sGBM enjoys much higher time efficiency with better accuracy, given the same base learner, in both online and offline settings.
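The sequential dependency that the abstract identifies in standard (hard) gradient boosting can be seen in a minimal sketch. This is not sGBM and not the authors' implementation: it is plain gradient boosting with squared loss on one-dimensional inputs, using regression stumps as base learners, and all function names are illustrative assumptions.

```python
def fit_stump(xs, targets):
    """Fit a one-split regression stump that minimises squared error."""
    best = None
    for t in sorted(set(xs)):
        left = [r for x, r in zip(xs, targets) if x <= t]
        right = [r for x, r in zip(xs, targets) if x > t]
        if not right:  # splitting at the maximum leaves nothing on the right
            continue
        lm = sum(left) / len(left)
        rm = sum(right) / len(right)
        err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, t, lm, rm)
    if best is None:  # all inputs identical: fall back to a constant
        mean = sum(targets) / len(targets)
        return (xs[0], mean, mean)
    return best[1:]


def predict_stump(stump, x):
    threshold, left_value, right_value = stump
    return left_value if x <= threshold else right_value


def fit_gbm(xs, ys, n_rounds=20, learning_rate=0.5):
    """Each round fits a stump to the residuals left by all earlier rounds."""
    base = sum(ys) / len(ys)
    preds = [base] * len(xs)
    stumps = []
    for _ in range(n_rounds):
        # Residuals are the negative gradient of 0.5 * (y - prediction)^2.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, x)
                 for p, x in zip(preds, xs)]
    return base, stumps, learning_rate


def predict_gbm(model, x):
    base, stumps, learning_rate = model
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)
```

Because round k needs the residuals produced by rounds 1 to k-1, the loop in `fit_gbm` cannot be run in parallel across base learners, which is the limitation the abstract says sGBM addresses through joint optimization.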

artificial intelligence, base learner, machine learning, (16 more...)

arXiv:2006.04059

Country:

Asia > China > Jiangsu Province > Nanjing (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.87)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Random Forests (and Extremely) in Python with scikit-learn

#artificialintelligence, Jun-4-2020, 03:23:44 GMT

In this guest post, you will learn by example how to use two popular machine learning techniques: random forests and extremely random forests. The post is an excerpt (adapted to the blog format) from the forthcoming Artificial Intelligence with Python – Second Edition: Your Complete Guide to Building Intelligent Apps using Python 3.x and TensorFlow 2. Before you learn how to build random forests in Python with scikit-learn, you will find some brief information about the book. The new edition is updated to Python 3.x and TensorFlow 2. It also has new chapters that, besides random forests, cover recurrent neural networks, artificial intelligence and Big Data, fundamental use cases, chatbots, and more. Artificial Intelligence with Python – Second Edition is written by two experts in the field of artificial intelligence, Alberto Artasanchez and Prateek Joshi (more information about the authors can be found towards the end of the post). In the next section of this post, you will learn what random forests and extremely random forests are.

artificial intelligence, machine learning, random forest, (13 more...)

Country: North America > United States > California (0.05)

Genre: Instructional Material > Course Syllabus & Notes (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.35)

COVID-19 diagnosis by routine blood tests using machine learning

Kukar, Matjaž, Gunčar, Gregor, Vovko, Tomaž, Podnar, Simon, Černelč, Peter, Brvar, Miran, Zalaznik, Mateja, Notar, Mateja, Moškon, Sašo, Notar, Marko

arXiv.org Machine Learning, Jun-4-2020

Physicians taking care of patients with coronavirus disease (COVID-19) have described different changes in routine blood parameters. However, these changes hinder them from performing COVID-19 diagnosis. We constructed a machine learning predictive model for COVID-19 diagnosis. The model was built and cross-validated on the routine blood tests of 5,333 patients with various bacterial and viral infections, and 160 COVID-19-positive patients. We selected an operational ROC point at a sensitivity of 81.9% and a specificity of 97.9%. The cross-validated area under the curve (AUC) was 0.97. The five most useful routine blood parameters for COVID-19 diagnosis according to the feature importance scoring of the XGBoost algorithm were MCHC, eosinophil count, albumin, INR, and prothrombin activity percentage. t-SNE visualization showed that the blood parameters of patients with a severe COVID-19 course are more like the parameters of bacterial than viral infection. The reported diagnostic accuracy is at least comparable and probably complementary to RT-PCR and chest CT studies. Patients with fever, cough, myalgia, and other symptoms can now have initial routine blood tests assessed by our diagnostic tool. All patients with a positive COVID-19 prediction would then undergo standard RT-PCR studies to confirm the diagnosis. We believe that our results present a significant contribution to improvements in COVID-19 diagnosis.
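An operational ROC point, as mentioned in the abstract, corresponds to one score threshold and the sensitivity and specificity it produces. The sketch below shows how the two quantities follow from a threshold. The function name `sensitivity_specificity` and the toy labels and scores are assumptions for illustration and are unrelated to the study's data.

```python
def sensitivity_specificity(labels, scores, threshold):
    """Return (sensitivity, specificity) when score >= threshold means positive.

    sensitivity = TP / (TP + FN): share of true positives that are flagged.
    specificity = TN / (TN + FP): share of true negatives that are cleared.
    """
    tp = fn = tn = fp = 0
    for y, s in zip(labels, scores):
        flagged = s >= threshold
        if y == 1:
            if flagged:
                tp += 1
            else:
                fn += 1
        else:
            if flagged:
                fp += 1
            else:
                tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative label")
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    # Hypothetical toy data: 1 = disease present, scores = model outputs.
    labels = [1, 1, 1, 0, 0, 0, 0]
    scores = [0.9, 0.7, 0.2, 0.6, 0.3, 0.1, 0.05]
    for threshold in (0.25, 0.5, 0.8):
        print(threshold, sensitivity_specificity(labels, scores, threshold))
```

Raising the threshold trades sensitivity for specificity, so choosing an operating point amounts to picking where on that trade-off the tool should sit.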

artificial intelligence, diagnosis, machine learning, (16 more...)

arXiv:2006.03476

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)
Europe > Slovenia > Central Slovenia > Municipality of Ljubljana > Ljubljana (0.05)
(10 more...)

Genre: Research Report > New Finding (0.67)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Gradient Boosting Application in Forecasting of Performance Indicators Values for Measuring the Efficiency of Promotions in FMCG Retail

Henzel, Joanna, Sikora, Marek

arXiv.org Machine Learning, May-30-2020

This paper addresses the problem of forecasting promotion efficiency. The authors propose a new approach that uses the gradient boosting method for this task. Six performance indicators are introduced to capture the promotion effect. For each of them, a model was trained within predefined groups of products. A description of using these models for forecasting and optimising promotion efficiency is provided. The data preparation and hyperparameter tuning processes are also described. The experiments were performed for three groups of products from a large grocery company.

artificial intelligence, machine learning, promotion, (17 more...)

arXiv:2006.04945

Country:

North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.04)
Europe > Poland (0.04)
Asia > India (0.04)

Genre: Research Report (1.00)

Industry: Retail (0.95)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.87)

Distributional Random Forests: Heterogeneity Adjustment and Multivariate Distributional Regression

Ćevid, Domagoj, Michel, Loris, Meinshausen, Nicolai, Bühlmann, Peter

arXiv.org Machine Learning, May-29-2020

We propose an adaptation of the Random Forest algorithm to estimate the conditional distribution of a possibly multivariate response. We suggest a new splitting criterion based on the MMD two-sample test, which is suitable for detecting heterogeneity in multivariate distributions. The weights provided by the forest can be conveniently used as an input to other methods in order to locally solve various learning problems. The code is available as the R package drf.
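The MMD (maximum mean discrepancy) statistic underlying the two-sample test named in the abstract can be sketched as follows. This is a generic biased (V-statistic) estimate with a Gaussian kernel, not the paper's splitting criterion or the `drf` package's implementation; the kernel choice, the bandwidth, and the function names are assumptions.

```python
import math


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two points given as equal-length tuples."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased estimate of squared MMD between samples xs and ys.

    MMD^2 = mean k(x, x') + mean k(y, y') - 2 * mean k(x, y).
    The value is near zero when both samples come from the same distribution
    and grows as the distributions differ.
    """
    def mean_kernel(a, b):
        total = sum(gaussian_kernel(u, v, bandwidth) for u in a for v in b)
        return total / (len(a) * len(b))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


if __name__ == "__main__":
    # Hypothetical two-dimensional responses on either side of a candidate split.
    left = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1)]
    right = [(3.0, 3.0), (3.1, 3.0), (3.0, 3.1)]
    print(mmd_squared(left, left), mmd_squared(left, right))
```

Because the kernel compares whole points rather than individual coordinates, the statistic responds to differences in the joint distribution of a multivariate response, which is why it suits the setting the abstract describes.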

artificial intelligence, criterion, machine learning, (19 more...)

arXiv:2005.14458

Country:

Europe > Switzerland > Zürich > Zürich (0.05)
North America > United States > Hawaii (0.04)
North America > United States > District of Columbia (0.04)
(2 more...)

Genre: Research Report (0.50)

Industry:

Health & Medicine (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Law (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.71)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.62)

Travel Time Prediction using Tree-Based Ensembles

Huang, He, Pouls, Martin, Meyer, Anne, Pauly, Markus

arXiv.org Machine Learning, May-28-2020

In this paper, we consider the task of predicting travel times between two arbitrary points in an urban scenario. We view this problem from two temporal perspectives: long-term forecasting with a horizon of several days and short-term forecasting with a horizon of one hour. Both of these perspectives are relevant for planning tasks in the context of urban mobility and transportation services. We utilize tree-based ensemble methods that we train and evaluate on a dataset of taxi trip records from New York City. Through extensive data analysis, we identify relevant temporal and spatial features. We also engineer additional features based on weather and routing data. The latter is obtained via a routing solver operating on the road network. The computational results show that the addition of this routing data can be beneficial to the model performance. Moreover, employing different models for short and long-term prediction is useful as short-term models are better suited to mirror current traffic conditions. In fact, we show that accurate short-term predictions may be obtained with only little training data.

artificial intelligence, machine learning, prediction, (15 more...)

arXiv:2005.13818

Country:

North America > United States > New York (0.25)
Europe > Germany > North Rhine-Westphalia > Arnsberg Region > Dortmund (0.04)
North America > United States > Delaware > New Castle County > Wilmington (0.04)
(2 more...)

Genre: Research Report > New Finding (0.48)

Industry:

Transportation > Passenger (1.00)
Transportation > Ground > Road (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.75)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Out-of-Core GPU Gradient Boosting

Ou, Rong

arXiv.org Machine Learning, May-18-2020

GPU-based algorithms have greatly accelerated many machine learning methods; however, GPU memory is typically smaller than main memory, limiting the size of training data. In this paper, we describe an out-of-core GPU gradient boosting algorithm implemented in the XGBoost library. We show that much larger datasets can fit on a given GPU, without degrading model accuracy or training time. To the best of our knowledge, this is the first out-of-core GPU implementation of gradient boosting. Similar approaches can be applied to other machine learning algorithms.

artificial intelligence, gradient, machine learning, (17 more...)

arXiv:2005.09148

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > California > Santa Clara County > Santa Clara (0.04)

Genre: Research Report (0.40)

Industry: Information Technology (0.70)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
