AITopics | Ensemble Learning

Ensemble Learning

Ensemble methods use multiple learning algorithms to obtain better predictive performance than could be obtained from any of the constituent learning algorithms alone. (Wikipedia)

The Impossibility of Parallelizing Boosting

Karbasi, Amin, Larsen, Kasper Green

arXiv.org Artificial Intelligence, Aug-21-2023

Boosting is one of the most successful ideas in machine learning, allowing one to "boost" the performance of a base learning algorithm with rather poor accuracy into a highly accurate classifier, with recent applications in adversarial training [1], reinforcement learning [5], and federated learning [27], among many others. The classic boosting algorithm, known as AdaBoost [8], achieves this by iteratively training classifiers on the training data set. After each iteration, the data set is reweighted and a new classifier is trained using a weighted loss function.

artificial intelligence, exp, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2301.09627

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Italy (0.04)
Europe > Denmark (0.04)
Asia > Singapore (0.04)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.46)
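
The reweighting loop that the abstract above attributes to AdaBoost can be illustrated with a minimal sketch. It assumes one-dimensional inputs, labels in {+1, -1}, and threshold "stumps" as the base learner; the function names and the toy data are hypothetical and are not taken from the paper.

```python
import math


def train_stump(xs, ys, weights):
    """Return (weighted_error, threshold, polarity) of the best 1-D stump.

    A stump predicts `polarity` when x >= threshold and `-polarity` otherwise.
    """
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            preds = [polarity if x >= threshold else -polarity for x in xs]
            err = sum(w for w, p, y in zip(weights, preds, ys) if p != y)
            if best is None or err < best[0]:
                best = (err, threshold, polarity)
    return best


def stump_predict(x, threshold, polarity):
    return polarity if x >= threshold else -polarity


def adaboost(xs, ys, rounds=3):
    """Train `rounds` stumps sequentially, reweighting the data after each one."""
    n = len(xs)
    weights = [1.0 / n] * n  # start from uniform weights
    ensemble = []
    for _ in range(rounds):
        err, threshold, polarity = train_stump(xs, ys, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)  # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)  # vote of this stump
        ensemble.append((alpha, threshold, polarity))
        # Misclassified points gain weight, correctly classified ones lose it.
        weights = [
            w * math.exp(-alpha * y * stump_predict(x, threshold, polarity))
            for w, x, y in zip(weights, xs, ys)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, x):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(a * stump_predict(x, t, p) for a, t, p in ensemble)
    return 1 if score >= 0 else -1


# Toy data (hypothetical): no single stump separates these labels.
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
model = adaboost(xs, ys, rounds=3)
```

Each round depends on the weights produced by the previous one, which is the sequential structure the paper's title refers to.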

LCE: An Augmented Combination of Bagging and Boosting in Python

Fauvel, Kevin, Fromont, Élisa, Masson, Véronique, Faverdin, Philippe, Termier, Alexandre

arXiv.org Artificial Intelligence, Aug-15-2023

The package implements Local Cascade Ensemble (LCE), a machine learning method that further enhances the prediction performance of the current state-of-the-art methods Random Forest and XGBoost. LCE combines their strengths and adopts a complementary diversification approach to obtain a better generalizing predictor. The package is compatible with scikit-learn, therefore it can interact with scikit-learn pipelines and model selection tools.

artificial intelligence, lce, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2308.0725

Country: Europe > France (0.06)

Genre: Research Report (0.70)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)

Can we Agree? On the Rashōmon Effect and the Reliability of Post-Hoc Explainable AI

Poiret, Clement, Grigis, Antoine, Thomas, Justin, Noulhiane, Marion

arXiv.org Artificial Intelligence, Aug-14-2023

The Rashōmon effect poses challenges for deriving reliable knowledge from machine learning models. This study examined the influence of sample size on explanations from models in a Rashōmon set using SHAP. Experiments on 5 public datasets showed that explanations gradually converged as the sample size increased. Explanations from <128 samples exhibited high variability, limiting reliable knowledge extraction. However, agreement between models improved with more data, allowing for consensus. Bagging ensembles often had higher agreement. The results provide guidance on sufficient data to trust explanations. Variability at low samples suggests that conclusions may be unreliable without validation. Further work is needed with more model types, data domains, and explanation methods. Testing convergence in neural networks and with model-specific explanation methods would be impactful. The approaches explored here point towards principled techniques for eliciting knowledge from ambiguous models.

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2308.07247

Country:

Europe > France > Île-de-France > Paris > Paris (0.04)
Europe > Greece > West Greece > Patra (0.04)
Asia > Middle East > Jordan (0.04)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)

Industry: Health & Medicine > Therapeutic Area (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.88)
Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Digital elevation model correction in urban areas using extreme gradient boosting, land cover and terrain parameters

Okolie, Chukwuma, Mills, Jon, Adeleke, Adedayo, Smit, Julian

arXiv.org Artificial Intelligence, Aug-12-2023

The accuracy of digital elevation models (DEMs) in urban areas is influenced by numerous factors including land cover and terrain irregularities. Moreover, building artifacts in global DEMs cause artificial blocking of surface flow pathways. This compromises their quality and adequacy for hydrological and environmental modelling in urban landscapes where precise and accurate terrain information is needed. In this study, the extreme gradient boosting (XGBoost) ensemble algorithm is adopted for enhancing the accuracy of two medium-resolution 30m DEMs over Cape Town, South Africa: Copernicus GLO-30 and ALOS World 3D (AW3D). XGBoost is a scalable, portable and versatile gradient boosting library that can solve many environmental modelling problems. The training datasets are comprised of eleven predictor variables including elevation, urban footprints, slope, aspect, surface roughness, topographic position index, terrain ruggedness index, terrain surface texture, vector roughness measure, forest cover and bare ground cover. The target variable (elevation error) was calculated with respect to highly accurate airborne LiDAR. After training and testing, the model was applied for correcting the DEMs at two implementation sites. The correction achieved significant accuracy gains which are competitive with other proposed methods. The root mean square error (RMSE) of Copernicus DEM improved by 46 to 53% while the RMSE of AW3D DEM improved by 72 to 73%. These results showcase the potential of gradient boosted trees for enhancing the quality of DEMs, and for improved hydrological modelling in urban catchments.

artificial intelligence, cape town, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2308.06545

Country:

Africa > South Africa > Western Cape > Cape Town (0.28)
Europe > Austria > Vienna (0.14)
North America > United States > New York > New York County > New York City (0.04)
(9 more...)

Genre: Research Report (0.70)

Industry: Energy > Renewable (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
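
The correction arithmetic described in the abstract above (elevation error as the target, RMSE improvement as the result) can be sketched as follows. All numbers are hypothetical, and `predicted_error` simply stands in for the output of a trained regressor; no XGBoost model is fitted here.

```python
import math


def rmse(errors):
    """Root mean square of a list of elevation errors (metres)."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def percent_improvement(rmse_before, rmse_after):
    """Relative RMSE reduction, in percent."""
    return 100.0 * (rmse_before - rmse_after) / rmse_before


# Hypothetical values: a global DEM and a LiDAR reference at four cells.
dem = [104.0, 96.0, 104.0, 96.0]
lidar = [100.0, 100.0, 100.0, 100.0]

# Target variable as in the abstract: elevation error relative to LiDAR.
error_before = [d - ref for d, ref in zip(dem, lidar)]

# Assumed model output: the regressor recovers half of each error.
predicted_error = [2.0, -2.0, 2.0, -2.0]

# Correction step: subtract the predicted error from the original DEM.
corrected = [d - p for d, p in zip(dem, predicted_error)]
error_after = [c - ref for c, ref in zip(corrected, lidar)]

rmse_before = rmse(error_before)
rmse_after = rmse(error_after)
improvement = percent_improvement(rmse_before, rmse_after)
```

With these made-up inputs the RMSE halves, so `improvement` evaluates to 50 percent; the figures reported in the abstract come from the authors' own data, not from this sketch.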

Safety in Traffic Management Systems: A Comprehensive Survey

Du, Wenlu, Dash, Ankan, Li, Jing, Wei, Hua, Wang, Guiling

arXiv.org Artificial Intelligence, Aug-11-2023

Traffic management systems play a vital role in ensuring safe and efficient transportation on roads. However, the use of advanced technologies in traffic management systems has introduced new safety challenges. Therefore, it is important to ensure the safety of these systems to prevent accidents and minimize their impact on road users. In this survey, we provide a comprehensive review of the literature on safety in traffic management systems. Specifically, we discuss the different safety issues that arise in traffic management systems, the current state of research on safety in these systems, and the techniques and methods proposed to ensure the safety of these systems. We also identify the limitations of the existing research and suggest future research directions.

data mining, machine learning, reinforcement learning, (25 more...)

arXiv.org Artificial Intelligence

doi: 10.3390/designs7040100

2308.06204

Country:

Asia > India > West Bengal > Kolkata (0.04)
Asia > China > Shanghai > Shanghai (0.04)
South America > Brazil (0.04)
(12 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview (1.00)

Industry:

Transportation > Infrastructure & Services (1.00)
Transportation > Ground > Road (1.00)
Leisure & Entertainment > Games (0.67)
Government > Regional Government > North America Government > United States Government (0.67)

Technology:

Information Technology > Sensing and Signal Processing (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications (1.00)
(12 more...)

Symmetry Defense Against XGBoost Adversarial Perturbation Attacks

Lindqvist, Blerta

arXiv.org Artificial Intelligence, Aug-10-2023

We examine whether symmetry can be used to defend tree-based ensemble classifiers such as gradient-boosting decision trees (GBDTs) against adversarial perturbation attacks. The idea is based on a recent symmetry defense for convolutional neural network classifiers (CNNs) that utilizes CNNs' lack of invariance with respect to symmetries. CNNs lack invariance because they can classify a symmetric sample, such as a horizontally flipped image, differently from the original sample. CNNs' lack of invariance also means that CNNs can classify symmetric adversarial samples differently from the incorrect classification of adversarial samples. Using CNNs' lack of invariance, the recent CNN symmetry defense has shown that the classification of symmetric adversarial samples reverts to the correct sample classification. In order to apply the same symmetry defense to GBDTs, we examine GBDT invariance and are the first to show that GBDTs also lack invariance with respect to symmetries. We apply and evaluate the GBDT symmetry defense for nine datasets against six perturbation attacks with a threat model that ranges from zero-knowledge to perfect-knowledge adversaries. Using the feature inversion symmetry against zero-knowledge adversaries, we achieve up to 100% accuracy on adversarial samples even when default and robust classifiers have 0% accuracy. Using the feature inversion and horizontal flip symmetries against perfect-knowledge adversaries, we achieve up to over 95% accuracy on adversarial samples for the GBDT classifier of the F-MNIST dataset even when default and robust classifiers have 0% accuracy.

adversarial sample, adversary, classifier, (15 more...)

arXiv.org Artificial Intelligence

2308.05575

Country: Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.40)

Industry: Health & Medicine (0.70)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

Unleashing the Power of Extra-Tree Feature Selection and Random Forest Classifier for Improved Survival Prediction in Heart Failure Patients

Talukder, Md. Simul Hasan, Sulaiman, Rejwan Bin, Angon, Mouli Bardhan Paul

arXiv.org Artificial Intelligence, Aug-9-2023

Heart failure is a life-threatening condition that affects millions of people worldwide. The ability to accurately predict patient survival can aid in early intervention and improve patient outcomes. In this study, we explore the potential of utilizing data pre-processing techniques and the Extra-Tree (ET) feature selection method in conjunction with the Random Forest (RF) classifier to improve survival prediction in heart failure patients. By leveraging the strengths of ET feature selection, we aim to identify the most significant predictors associated with heart failure survival. Using the public UCL Heart failure (HF) survival dataset, we employ the ET feature selection algorithm to identify the most informative features. These features are then used as input for a grid search over RF hyperparameters. Finally, the tuned RF model was trained and evaluated using different metrics. The approach achieved 98.33% accuracy, the highest among existing work.

artificial intelligence, machine learning, prediction, (13 more...)

arXiv.org Artificial Intelligence

2308.05765

Country:

North America > United States (0.14)
Asia > Bangladesh (0.05)
South America > Paraguay > Asunción > Asunción (0.04)

Genre: Research Report > New Finding (0.67)

Industry: Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

A machine-learning sleep-wake classification model using a reduced number of features derived from photoplethysmography and activity signals

Almeida, Douglas A., Dias, Felipe M., Toledo, Marcelo A. F., Cardenas, Diego A. C., Oliveira, Filipe A. C., Ribeiro, Estela, Krieger, Jose E., Gutierrez, Marco A.

arXiv.org Artificial Intelligence, Aug-7-2023

Sleep is a crucial aspect of our overall health and well-being. It plays a vital role in regulating our mental and physical health, impacting everything from our mood, memory, and cognitive function to our physical resilience and immune system. The classification of sleep stages is a mandatory step to assess sleep quality, providing the metrics to estimate the quality of sleep and how well our body is functioning during this essential period of rest. Photoplethysmography (PPG) has been demonstrated to be an effective signal for sleep stage inference, meaning it can be used on its own or in combination with other signals to determine sleep stage. This information is valuable in identifying potential sleep issues and developing strategies to improve sleep quality and overall health. In this work, we present a machine learning sleep-wake classification model based on the eXtreme Gradient Boosting (XGBoost) algorithm and features extracted from PPG signals and activity counts. The performance of our method was comparable to current state-of-the-art methods with a Sensitivity of 91.15 ± 1.16%, Specificity of 53.66 ± 1.12%, F1-score of 83.88 ± 0.56%, and Kappa of 48.0 ± 0.86%. Our method offers a significant improvement over other approaches as it uses a reduced number of features, making it suitable for implementation in wearable devices that have limited computational power.

machine-learning sleep-wake classification model, photoplethysmography and activity signal

arXiv.org Artificial Intelligence

2308.05759

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Hematology (0.89)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.89)
Health & Medicine > Diagnostic Medicine > Imaging (0.89)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.53)
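
The four metrics quoted in the abstract above (Sensitivity, Specificity, F1-score, Kappa) can all be derived from a binary confusion matrix. The sketch below uses the standard definitions; it assumes that "sleep" is the positive class, which the abstract does not state, and the counts are hypothetical.

```python
def binary_metrics(tp, fn, fp, tn):
    """Standard binary classification metrics from confusion-matrix counts.

    tp/fn: positive epochs classified correctly/incorrectly.
    tn/fp: negative epochs classified correctly/incorrectly.
    """
    total = tp + fn + fp + tn
    sensitivity = tp / (tp + fn)  # recall on the positive class
    specificity = tn / (tn + fp)  # recall on the negative class
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)

    # Cohen's kappa: observed agreement corrected for chance agreement.
    observed = (tp + tn) / total
    expected = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / total**2
    kappa = (observed - expected) / (1 - expected)

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
        "kappa": kappa,
    }


# Hypothetical counts for 150 epochs, with "sleep" assumed to be positive.
metrics = binary_metrics(tp=90, fn=10, fp=20, tn=30)
```

A high sensitivity combined with a much lower specificity, as in the reported results, means the positive class is detected far more reliably than the negative one; kappa summarizes the agreement after discounting what chance alone would produce.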

SecureBoost Hyperparameter Tuning via Multi-Objective Federated Learning

Ren, Ziyao, Kang, Yan, Fan, Lixin, Yang, Linghua, Tong, Yongxin, Yang, Qiang

arXiv.org Artificial Intelligence, Aug-7-2023

SecureBoost is a tree-boosting algorithm leveraging homomorphic encryption to protect data privacy in the vertical federated learning setting. It is widely used in fields such as finance and healthcare due to its interpretability, effectiveness, and privacy-preserving capability. However, SecureBoost suffers from high computational complexity and risk of label leakage. To harness the full potential of SecureBoost, hyperparameters of SecureBoost should be carefully chosen to strike an optimal balance between utility, efficiency, and privacy. Existing methods set hyperparameters either empirically or heuristically, which is far from optimal. To fill this gap, we propose a Constrained Multi-Objective SecureBoost (CMOSB) algorithm to find Pareto optimal solutions, each of which is a set of hyperparameters achieving an optimal tradeoff between utility loss, training cost, and privacy leakage. We design measurements of the three objectives. In particular, the privacy leakage is measured using our proposed instance clustering attack. Experimental results demonstrate that the CMOSB yields not only hyperparameters superior to the baseline but also optimal sets of hyperparameters that can support the flexible requirements of FL participants.

hyperparameter, privacy leakage, secureboost, (13 more...)

arXiv.org Artificial Intelligence

2307.10579

Country:

Europe > Czechia > Prague (0.04)
Asia > China > Hong Kong (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report > New Finding (0.34)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.51)
Information Technology > Data Science > Data Mining > Big Data (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.34)
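
The notion of Pareto optimality used in the abstract above can be made concrete with a small filter over candidate hyperparameter sets. This is only a sketch of Pareto dominance for three objectives that are all minimized (utility loss, training cost, privacy leakage); it is not the CMOSB algorithm, and the configuration names and scores are hypothetical.

```python
def dominates(a, b):
    """True if objective tuple `a` is no worse than `b` everywhere and
    strictly better somewhere (all objectives are minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(
        x < y for x, y in zip(a, b)
    )


def pareto_front(candidates):
    """Keep the candidates that no other candidate dominates."""
    return {
        name: objectives
        for name, objectives in candidates.items()
        if not any(
            dominates(other, objectives) for other in candidates.values()
        )
    }


# Hypothetical scores: (utility loss, training cost, privacy leakage).
candidates = {
    "config_a": (0.02, 120.0, 0.30),
    "config_b": (0.05, 60.0, 0.20),
    "config_c": (0.05, 80.0, 0.25),   # worse than config_b on cost and leakage
    "config_d": (0.10, 40.0, 0.10),
    "config_e": (0.03, 130.0, 0.35),  # worse than config_a on every objective
}

front = pareto_front(candidates)
```

The surviving configurations each trade one objective against another, which is why a set of solutions, rather than a single best one, is offered to participants with different requirements.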

Privacy-Preserving Tree-Based Inference with TFHE

Frery, Jordan, Stoian, Andrei, Bredehoft, Roman, Montero, Luis, Kherfallah, Celia, Chevallier-Mames, Benoit, Meyre, Arthur

arXiv.org Artificial Intelligence, Aug-7-2023

Privacy enhancing technologies (PETs) have been proposed as a way to protect the privacy of data while still allowing for data analysis. In this work, we focus on Fully Homomorphic Encryption (FHE), a powerful tool that allows for arbitrary computations to be performed on encrypted data. FHE has received lots of attention in the past few years and has reached realistic execution times and correctness. More precisely, we explain in this paper how we apply FHE to tree-based models and get state-of-the-art solutions over encrypted tabular data. We show that our method is applicable to a wide range of tree-based models, including decision trees, random forests, and gradient boosted trees, and has been implemented within the Concrete-ML library, which is open-source at https://github.com/zama-ai/concrete-ml. With a selected set of use-cases, we demonstrate that our FHE version is very close to the unprotected version in terms of accuracy.

artificial intelligence, machine learning, operation, (16 more...)

arXiv.org Artificial Intelligence

2303.01254

Country:

North America > United States > California > Santa Barbara County > Santa Barbara (0.04)
Asia > Middle East > Jordan (0.04)
North America > United States > New York > New York County > New York City (0.04)
(5 more...)

Genre: Research Report > Promising Solution (0.34)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (0.93)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
(2 more...)
