AITopics | Ensemble Learning

Collaborating Authors

Ensemble Learning

Ensemble methods use multiple learning algorithms to obtain better predictive performance than could be obtained from any of the constituent learning algorithms alone. (Wikipedia)

News Overviews Instructional Materials AI-Alerts Classics

Financial Fraud Detection with Entropy Computing

Emami, Babak, Dyk, Wesley, Haycraft, David, Spear, Carrie, Nguyen, Lac, Chancellor, Nicholas

arXiv.org Artificial IntelligenceMar-14-2025

We introduce CVQBoost, a novel classification algorithm that leverages early hardware implementing Quantum Computing Inc's Entropy Quantum Computing (EQC) paradigm, Dirac-3 [Nguyen et. al. arXiv:2407.04512]. We apply CVQBoost to a fraud detection test case and benchmark its performance against XGBoost, a widely utilized ML method. Running on Dirac-3, CVQBoost demonstrates a significant runtime advantage over XGBoost, which we evaluate on high-performance hardware comprising up to 48 CPUs and four NVIDIA L4 GPUs using the RAPIDS AI framework. Our results show that CVQBoost maintains competitive accuracy (measured by AUC) while significantly reducing training time, particularly as dataset size and feature complexity increase. To assess scalability, we extend our study to large synthetic datasets ranging from 1M to 70M samples, demonstrating that CVQBoost on Dirac-3 is well-suited for large-scale classification tasks. These findings position CVQBoost as a promising alternative to gradient boosting methods, offering superior scalability and efficiency for high-dimensional ML applications such as fraud detection.

artificial intelligence, cvqboost, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2503.11273

Country:

Europe (0.14)
North America > United States > New Jersey > Hudson County > Hoboken (0.05)
North America > United States > New York > New York County > New York City (0.04)
Asia > Singapore (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Law Enforcement & Public Safety > Fraud (0.91)
Information Technology > Security & Privacy (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.99)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.97)

Add feedback

Optimizing Fire Safety: Reducing False Alarms Using Advanced Machine Learning Techniques

Jamal, Muhammad Hassan, Alazeb, Abdulwahab, Bakhsh, Shahid Allah, Boulila, Wadii, Shah, Syed Aziz, Khattak, Aizaz Ahmad, Khan, Muhammad Shahbaz

arXiv.org Artificial IntelligenceMar-12-2025

Fire safety practices are important to reduce the extent of destruction caused by fire. While smoke alarms help save lives, firefighters struggle with the increasing number of false alarms. This paper presents a precise and efficient Weighted ensemble model for decreasing false alarms. It estimates the density, computes weights according to the high and low-density regions, forwards the high region weights to KNN and low region weights to XGBoost and combines the predictions. The proposed model is effective at reducing response time, increasing fire safety, and minimizing the damage that fires cause. A specifically designed dataset for smoke detection is utilized to test the proposed model. In addition, a variety of ML models, such as Logistic Regression (LR), Decision Tree (DT), Random Forest (RF), Nai:ve Bayes (NB), K-Nearest Neighbour (KNN), Support Vector Machine (SVM), Extreme Gradient Boosting (XGBoost), Adaptive Boosting (ADAB), have also been utilized. To maximize the use of the smoke detection dataset, all the algorithms utilize the SMOTE re-sampling technique. After evaluating the assessment criteria, this paper presents a concise summary of the comprehensive findings obtained by comparing the outcomes of all models.

dataset, ensemble model, weighted ensemble model, (11 more...)

arXiv.org Artificial Intelligence

2503.0996

Country:

Asia > Middle East > Saudi Arabia > Najran Province > Najran (0.04)
North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
Asia > Pakistan > Sindh > Karachi Division > Karachi (0.04)
(2 more...)

Genre:

Research Report > New Finding (0.49)
Research Report > Experimental Study (0.49)

Industry: Law Enforcement & Public Safety > Fire & Emergency Services (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.70)

Add feedback

Bags of Projected Nearest Neighbours: Competitors to Random Forests?

Hofmeyr, David P.

arXiv.org Machine LearningMar-12-2025

In this paper we introduce a simple and intuitive adaptive k nearest neighbours classifier, and explore its utility within the context of bootstrap aggregating ("bagging"). The approach is based on finding discriminant subspaces which are computationally efficient to compute, and are motivated by enhancing the discrimination of classes through nearest neighbour classifiers. This adaptiveness promotes diversity of the individual classifiers fit across different bootstrap samples, and so further leverages the variance reducing effect of bagging. Extensive experimental results are presented documenting the strong performance of the proposed approach in comparison with Random Forest classifiers, as well as other nearest neighbours based ensembles from the literature, plus other relevant benchmarks. Code to implement the proposed approach is available in the form of an R package from https://github.com/DavidHofmeyr/BOPNN.

discriminant subspace, ensemble, subspace, (17 more...)

arXiv.org Machine Learning

2503.09651

Country: Europe > Austria > Vienna (0.14)

Genre: Research Report (0.63)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.34)

Add feedback

Generalizable Machine Learning Models for Predicting Data Center Server Power, Efficiency, and Throughput

Lei, Nuoa, Shehabi, Arman, Lu, Jun, Cao, Zhi, Koomey, Jonathan, Smith, Sarah, Masanet, Eric

arXiv.org Artificial IntelligenceMar-8-2025

In the rapidly evolving digital era, comprehending the intricate dynamics influencing server power consumption, efficiency, and performance is crucial for sustainable data center operations. However, existing models lack the ability to provide a detailed and reliable understanding of these intricate relationships. This study employs a machine learning-based approach, using the SPECPower_ssj2008 database, to facilitate user-friendly and generalizable server modeling. The resulting models demonstrate high accuracy, with errors falling within approximately 10% on the testing dataset, showcasing their practical utility and generalizability. Through meticulous analysis, predictive features related to hardware availability date, server workload level, and specifications are identified, providing insights into optimizing energy conservation, efficiency, and performance in server deployment and operation. By systematically measuring biases and uncertainties, the study underscores the need for caution when employing historical data for prospective server modeling, considering the dynamic nature of technology landscapes. Collectively, this work offers valuable insights into the sustainable deployment and operation of servers in data centers, paving the way for enhanced resource use efficiency and more environmentally conscious practices.

consumption, power consumption, server, (15 more...)

arXiv.org Artificial Intelligence

2503.06439

Country:

North America > United States > California > Santa Barbara County > Santa Barbara (0.14)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
(3 more...)

Genre: Research Report (1.00)

Industry:

Health & Medicine (1.00)
Information Technology > Services (0.95)
Government (0.93)
Energy > Power Industry (0.69)

Technology:

Information Technology > Information Management (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Add feedback

Enhancing the Product Quality of the Injection Process Using eXplainable Artificial Intelligence

Hong, Jisoo, Hong, Yongmin, Baek, Jung-Woo, Kang, Sung-Woo

arXiv.org Artificial IntelligenceMar-4-2025

The injection molding process is a traditional technique for making products in various industries such as electronics and automobiles via solidifying liquid resin into certain molds. Although the process is not related to creating the main part of engines or semiconductors, this manufacturing methodology sets the final form of the products. Re-cently, research has continued to reduce the defect rate of the injection molding process. This study proposes an optimal injection molding process control system to reduce the defect rate of injection molding products with XAI (eXplainable Artificial Intelligence) ap-proaches. Boosting algorithms (XGBoost and LightGBM) are used as tree-based classifiers for predicting whether each product is normal or defective. The main features to control the process for improving the product are extracted by SHapley Additive exPlanations, while the individual conditional expectation analyzes the optimal control range of these extracted features. To validate the methodology presented in this work, the actual injection molding AI manufacturing dataset provided by KAMP (Korea AI Manufacturing Platform) is employed for the case study. The results reveal that the defect rate decreases from 1.00% (Original defect rate) to 0.21% with XGBoost and 0.13% with LightGBM, respectively.

injection molding process, main feature, molding process, (11 more...)

arXiv.org Artificial Intelligence

2503.02338

Country:

North America > United States > Wisconsin > Dane County > Madison (0.04)
Europe > Poland (0.04)
Asia > Taiwan (0.04)
(9 more...)

Genre: Research Report (0.83)

Industry:

Transportation (0.55)
Energy (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.92)
Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (0.86)

Add feedback

Real-Time Detection of Robot Failures Using Gaze Dynamics in Collaborative Tasks

Tabatabaei, Ramtin, Kostakos, Vassilis, Johal, Wafa

arXiv.org Artificial IntelligenceFeb-27-2025

Detecting robot failures during collaborative tasks is crucial for maintaining trust in human-robot interactions. This study investigates user gaze behaviour as an indicator of robot failures, utilising machine learning models to distinguish between non-failure and two types of failures: executional and decisional. Eye-tracking data were collected from 26 participants collaborating with a robot on Tangram puzzle-solving tasks. Gaze metrics, such as average gaze shift rates and the probability of gazing at specific areas of interest, were used to train machine learning classifiers, including Random Forest, AdaBoost, XGBoost, SVM, and CatBoost. The results show that Random Forest achieved 90% accuracy for detecting executional failures and 80% for decisional failures using the first 5 seconds of failure data. Real-time failure detection was evaluated by segmenting gaze data into intervals of 3, 5, and 10 seconds. These findings highlight the potential of gaze dynamics for real-time error detection in human-robot collaboration.

computing machinery, international conference, robot, (14 more...)

arXiv.org Artificial Intelligence

2503.07622

Country:

North America > United States (0.07)
Oceania > Australia > Victoria > Melbourne (0.05)
Europe > Italy (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.76)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.55)

Add feedback

A Method for Evaluating the Interpretability of Machine Learning Models in Predicting Bond Default Risk Based on LIME and SHAP

Zhang, Yan, Chen, Lin, Tian, Yixiang

arXiv.org Artificial IntelligenceFeb-26-2025

Interpretability analysis methods for artificial intelligence models, such as LIME and SHAP, are widely used, though they primarily serve as post-model for analyzing model outputs. While it is commonly believed that the transparency and interpretability of AI models diminish as their complexity increases, currently there is no standardized method for assessing the inherent interpretability of the models themselves. This paper uses bond market default prediction as a case study, applying commonly used machine learning algorithms within AI models. First, the classification performance of these algorithms in default prediction is evaluated. Then, leveraging LIME and SHAP to assess the contribution of sample features to prediction outcomes, the paper proposes a novel method for evaluating the interpretability of the models themselves. The results of this analysis are consistent with the intuitive understanding and logical expectations regarding the interpretability of these models.

artificial intelligence, interpretability, prediction, (12 more...)

arXiv.org Artificial Intelligence

2502.19615

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California > San Mateo County > Menlo Park (0.04)
Europe > Switzerland (0.04)
Asia > China (0.04)

Genre: Research Report > Promising Solution (0.34)

Industry:

Banking & Finance > Credit (0.67)
Banking & Finance > Risk Management (0.52)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.51)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.34)
Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (0.33)

Add feedback

Mitigating Attrition: Data-Driven Approach Using Machine Learning and Data Engineering

Vijayan, Naveen Edapurath

arXiv.org Artificial IntelligenceFeb-25-2025

This paper presents a novel data-driven approach to mitigating employee attrition using machine learning and data engineering techniques. The proposed framework integrates data from various human resources systems and leverages advanced feature engineering to capture a comprehensive set of factors influencing attrition. The study outlines a robust modeling approach that addresses challenges such as imbalanced datasets, categorical data handling, and model interpretation. The methodology includes careful consideration of training and testing strategies, baseline model establishment, and the development of calibrated predictive models. The research emphasizes the importance of model interpretation using techniques like SHAP values to provide actionable insights for organizations. Key design choices in algorithm selection, hyperparameter tuning, and probability calibration are discussed. This approach enables organizations to proactively identify attrition risks and develop targeted retention strategies, ultimately redu

attrition, employee attrition, probability, (17 more...)

arXiv.org Artificial Intelligence

doi: 10.36948/ijfmr.2019.v01i01.6147

2502.17865

Country: North America > United States > Washington > King County > Seattle (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)

Add feedback

XGBoost-Based Prediction of ICU Mortality in Sepsis-Associated Acute Kidney Injury Patients Using MIMIC-IV Database with Validation from eICU Database

Chen, Shuheng, Fan, Junyi, Pishgar, Elham, Alaei, Kamiar, Placencia, Greg, Pishgar, Maryam

arXiv.org Artificial IntelligenceFeb-25-2025

Background: Sepsis-Associated Acute Kidney Injury (SA-AKI) leads to high mortality in intensive care. This study develops machine learning models using the Medical Information Mart for Intensive Care IV (MIMIC-IV) database to predict Intensive Care Unit (ICU) mortality in SA-AKI patients. External validation is conducted using the eICU Collaborative Research Database. Methods: For 9,474 identified SA-AKI patients in MIMIC-IV, key features like lab results, vital signs, and comorbidities were selected using Variance Inflation Factor (VIF), Recursive Feature Elimination (RFE), and expert input, narrowing to 24 predictive variables. An Extreme Gradient Boosting (XGBoost) model was built for in-hospital mortality prediction, with hyperparameters optimized using GridSearch. Model interpretability was enhanced with SHapley Additive exPlanations (SHAP) and Local Interpretable Model-agnostic Explanations (LIME). External validation was conducted using the eICU database. Results: The proposed XGBoost model achieved an internal Area Under the Receiver Operating Characteristic curve (AUROC) of 0.878 (95% Confidence Interval: 0.859-0.897). SHAP identified Sequential Organ Failure Assessment (SOFA), serum lactate, and respiratory rate as key mortality predictors. LIME highlighted serum lactate, Acute Physiology and Chronic Health Evaluation II (APACHE II) score, total urine output, and serum calcium as critical features. Conclusions: The integration of advanced techniques with the XGBoost algorithm yielded a highly accurate and interpretable model for predicting SA-AKI mortality across diverse populations. It supports early identification of high-risk patients, enhancing clinical decision-making in intensive care. Future work needs to focus on enhancing adaptability, versatility, and real-world applications.

dataset, mortality, prediction, (13 more...)

arXiv.org Artificial Intelligence

2502.17978

Country:

Asia > Middle East > Iran > Tehran Province > Tehran (0.04)
South America > Uruguay > Maldonado > Maldonado (0.04)
North America > United States > New York > New York County > New York City (0.04)
(4 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Therapeutic Area > Nephrology (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)

Add feedback

Integrating Boosted learning with Differential Evolution (DE) Optimizer: A Prediction of Groundwater Quality Risk Assessment in Odisha

Subudhi, Sonalika, Pati, Alok Kumar, Bose, Sephali, Sahoo, Subhasmita, Pattanaik, Avipsa, Acharya, Biswa Mohan

arXiv.org Artificial IntelligenceFeb-25-2025

Groundwater is eventually undermined by human exercises, such as fast industrialization, urbanization, over-extraction, and contamination from agrarian and urban sources. From among the different contaminants, the presence of heavy metals like cadmium (Cd), chromium (Cr), arsenic (As), and lead (Pb) proves to have serious dangers when present in huge concentrations in groundwater. Long-term usage of these poisonous components may lead to neurological disorders, kidney failure and different sorts of cancer. To address these issues, this study developed a machine learning-based predictive model to evaluate the Groundwater Quality Index (GWQI) and identify the main contaminants which are affecting the water quality. It has been achieved with the help of a hybrid machine learning model i.e. LCBoost Fusion . The model has undergone several processes like data preprocessing, hyperparameter tuning using Differential Evolution (DE) optimization, and evaluation through cross-validation. The LCBoost Fusion model outperforms individual models (CatBoost and LightGBM), by achieving low RMSE (0.6829), MSE (0.5102), MAE (0.3147) and a high R$^2$ score of 0.9809. Feature importance analysis highlights Potassium (K), Fluoride (F) and Total Hardness (TH) as the most influential indicators of groundwater contamination. This research successfully demonstrates the application of machine learning in assessing groundwater quality risks in Odisha. The proposed LCBoost Fusion model offers a reliable and efficient approach for real-time groundwater monitoring and risk mitigation. These findings will help the environmental organizations and the policy makers to map out targeted places for sustainable groundwater management. Future work will focus on using remote sensing data and developing an interactive decision-making system for groundwater quality assessment.

lcboost fusion, prediction, quality indicator, (13 more...)

arXiv.org Artificial Intelligence

2502.17929

Country:

North America > United States (0.69)
Africa > South Africa (0.04)
Oceania > New Zealand (0.04)
(14 more...)

Genre:

Research Report > Experimental Study (0.93)
Research Report > New Finding (0.88)

Industry:

Water & Waste Management > Water Management > Water Supplies & Services (1.00)
Law (1.00)
Health & Medicine (1.00)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.70)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.67)

Add feedback