AITopics | Ensemble Learning


Ensemble Learning

Ensemble methods use multiple learning algorithms to obtain better predictive performance than could be obtained from any of the constituent learning algorithms alone. (Wikipedia)


Machine Learning-Enabled IoT Security: Open Issues and Challenges Under Advanced Persistent Threats

Chen, Zhiyan, Liu, Jinxin, Shen, Yu, Simsek, Murat, Kantarci, Burak, Mouftah, Hussein T., Djukic, Petar

arXiv.org Artificial Intelligence, Apr-16-2022

Despite its technological benefits, the Internet of Things (IoT) has cyber weaknesses due to vulnerabilities in the wireless medium. Machine learning (ML)-based methods are widely used against cyber threats in IoT networks with promising performance. Advanced persistent threat (APT) is a prominent means for cybercriminals to compromise networks, and it is notable for its long-term and harmful characteristics. However, it is difficult to apply ML-based approaches to identify APT attacks with promising detection performance, because APT traffic makes up an extremely small percentage of otherwise normal traffic. Few surveys fully investigate APT attacks in IoT networks, owing to the lack of public datasets covering all types of APT attacks. It is worthwhile to bridge the state of the art in network attack detection with APT attack detection in a comprehensive review article. This survey article reviews the security challenges in IoT networks and presents the well-known attacks, APT attacks, and threat models in IoT systems. Meanwhile, signature-based, anomaly-based, and hybrid intrusion detection systems are summarized for IoT networks. The article highlights statistical insights regarding frequently applied ML-based methods against network intrusion alongside the number of attack types detected. Finally, open issues and challenges for common network intrusion and APT attacks are presented for future research.
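The abstract's point about APT traffic being a tiny fraction of all traffic can be illustrated with a small sketch. The flow counts below are hypothetical and are not taken from the survey; the example only shows why plain accuracy can look excellent while every attack is missed.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 marks an attack."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / len(y_true)


def recall(y_true, y_pred):
    """Fraction of real attacks that were detected."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical traffic sample: 5 APT flows hidden among 995 normal flows.
y_true = [1] * 5 + [0] * 995
# A degenerate detector that labels every flow as normal.
y_pred = [0] * 1000

acc = accuracy(y_true, y_pred)
rec = recall(y_true, y_pred)
print(f"accuracy={acc:.3f} recall={rec:.3f}")
```

The degenerate detector scores high on accuracy and zero on recall, which is one reason imbalance-aware metrics are usually preferred for this kind of detection task.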

artificial intelligence, detection, machine learning, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3530812

2204.03433

Genre:

Overview (1.00)
Research Report > New Finding (0.46)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Internet of Things (1.00)
Information Technology > Communications > Networks > Sensor Networks (1.00)
(3 more...)


2022 Machine Learning A to Z : 5 Machine Learning Projects

#artificialintelligence, Apr-11-2022, 03:48:00 GMT

Evaluation metrics to analyze the performance of models. Different methods to deal with imbalanced data. Implementation of content-based and collaborative filtering. Implementation of different algorithms used for time series forecasting.

implementation, logistic regression, machine learning, (11 more...)

#artificialintelligence

Genre: Instructional Material > Course Syllabus & Notes (0.82)

Industry: Food & Agriculture > Agriculture (0.52)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.40)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.38)


Distributional Gradient Boosting Machines

März, Alexander, Kneib, Thomas

arXiv.org Machine Learning, Apr-2-2022

We present a unified probabilistic gradient boosting framework for regression tasks that models and predicts the entire conditional distribution of a univariate response variable as a function of covariates. Our likelihood-based approach allows us to either model all conditional moments of a parametric distribution, or to approximate the conditional cumulative distribution function via Normalizing Flows. As underlying computational backbones, our framework is based on XGBoost and LightGBM. Modelling and predicting the entire conditional distribution greatly enhances existing tree-based gradient boosting implementations, as it allows the creation of probabilistic forecasts from which prediction intervals and quantiles of interest can be derived. Empirical results show that our framework achieves state-of-the-art forecast accuracy.
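To make the step from a predicted distribution to prediction intervals and quantiles concrete, here is a minimal sketch. It assumes the model has already produced a Gaussian mean and standard deviation for one observation; the helper names and numbers are hypothetical, and this is not the API of the authors' framework, XGBoost, or LightGBM.

```python
from statistics import NormalDist


def gaussian_quantile(mean, std, q):
    """Quantile q (0 < q < 1) of a Normal(mean, std) predictive distribution."""
    return NormalDist(mu=mean, sigma=std).inv_cdf(q)


def gaussian_interval(mean, std, coverage=0.9):
    """Central prediction interval with the requested coverage."""
    tail = (1.0 - coverage) / 2.0
    return (gaussian_quantile(mean, std, tail),
            gaussian_quantile(mean, std, 1.0 - tail))


# Hypothetical predicted parameters for a single observation.
pred_mean, pred_std = 10.0, 2.0
lower, upper = gaussian_interval(pred_mean, pred_std, coverage=0.95)
median = gaussian_quantile(pred_mean, pred_std, 0.5)
print(f"95% interval: [{lower:.2f}, {upper:.2f}], median: {median:.2f}")
```

A point-forecast model would return only the mean; once distribution parameters are predicted, any quantile follows from the inverse CDF, as above.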

artificial intelligence, machine learning, normalizing flow, (18 more...)

arXiv.org Machine Learning

2204.00778

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > Germany > Lower Saxony > Gottingen (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > Italy > Sardinia (0.04)

Genre: Research Report > New Finding (0.87)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
(2 more...)


XGBoost Documentation -- Xgboost 1.5.2 Documentation - AI Summary

#artificialintelligence, Mar-27-2022, 06:21:29 GMT


ai summary, documentation, xgboost documentation, (1 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.98)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.98)


Ensembles in Machine Learning

#artificialintelligence, Mar-22-2022, 18:06:59 GMT

Ensemble methods are well established as an algorithmic cornerstone in machine learning (ML). Just as in real life, in ML a committee of experts will often perform better than an individual, provided appropriate care is taken in constituting the committee. Since the earliest days of ML research, a variety of ensemble strategies have been developed, with random forests and gradient boosting emerging as leading-edge methods in classification today. It has been recognised since the early days of ML research that ensembles of classifiers can be more accurate than individual models. In ML, ensembles are effectively committees that aggregate the predictions of individual classifiers. They are effective for very much the same reasons a committee of experts works in human decision making: they can bring different expertise to bear, and the averaging effect can reduce errors. This article presents a tutorial on the main ensemble methods in use in ML with links to Python notebooks and datasets illustrating these methods in action. The objective is to help practitioners get started with ML ensembles and to provide an insight into when and why ensembles are effective. There have been a lot of developments since then and the ensemble idea is still at the forefront in ML applications. For example, random forests [2] and gradient boosting [7] would be considered among the most powerful methods available to ML practitioners today. The generic ensemble idea is presented in Figure 1. All ensembles are made up of a collection of base classifiers, also known as members or estimators.
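The committee idea can be sketched in a few lines. The example below is an idealized illustration, not code from the tutorial's notebooks: it assumes an odd number of members whose errors are independent and who are each correct with the same probability, which real ensembles only approximate.

```python
from collections import Counter
from math import comb


def majority_vote(predictions):
    """Return the most common label among the members' predictions.

    Ties are resolved in favour of the label that appears first.
    """
    return Counter(predictions).most_common(1)[0][0]


def majority_accuracy(n_members, p_correct):
    """Probability that a strict majority of members is correct.

    Assumes independent members with identical accuracy p_correct.
    """
    k_min = n_members // 2 + 1
    return sum(
        comb(n_members, k) * p_correct**k * (1 - p_correct) ** (n_members - k)
        for k in range(k_min, n_members + 1)
    )


vote = majority_vote(["spam", "ham", "spam"])
acc_3 = majority_accuracy(3, 0.7)
acc_11 = majority_accuracy(11, 0.7)
print(vote, round(acc_3, 3), round(acc_11, 3))
```

Under these assumptions three members at 0.7 accuracy reach 0.784 as a committee, and larger committees do better still. When members make correlated errors the gain shrinks, which is why diversity among base classifiers matters.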

diversity, ensemble, estimator, (17 more...)

#artificialintelligence

Genre: Instructional Material > Course Syllabus & Notes (0.86)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.97)


GAM(L)A: An econometric model for interpretable Machine Learning

Flachaire, Emmanuel, Hacheme, Gilles, Hué, Sullivan, Laurent, Sébastien

arXiv.org Machine Learning, Mar-17-2022

Despite their high predictive performance, random forest and gradient boosting are often considered black boxes or uninterpretable models, which has raised concerns from practitioners and regulators. As an alternative, we propose in this paper to use partial linear models that are inherently interpretable. Specifically, this article introduces GAM-lasso (GAMLA) and GAM-autometrics (GAMA), denoted as GAM(L)A in short. GAM(L)A combines parametric and non-parametric functions to accurately capture linearities and non-linearities prevailing between dependent and explanatory variables, and a variable selection procedure to control for overfitting issues. Estimation relies on a two-step procedure building upon the double residual method. We illustrate the predictive performance and interpretability of GAM(L)A on a regression and a classification problem. The results show that GAM(L)A outperforms parametric models augmented by quadratic, cubic and interaction effects. Moreover, the results also suggest that the performance of GAM(L)A is not significantly different from that of random forest and gradient boosting.
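For readers unfamiliar with the double residual method, the textbook form for a partially linear model is sketched below. The notation is an assumption made for illustration, and the exact GAM(L)A procedure, including its variable selection step, may differ from this generic version.

```latex
% Partially linear model: linear in x_i, unknown smooth function g of z_i
y_i = x_i^{\top}\beta + g(z_i) + \varepsilon_i,
\qquad \mathbb{E}[\varepsilon_i \mid x_i, z_i] = 0

% Taking conditional expectations given z_i
\mathbb{E}[y_i \mid z_i] = \mathbb{E}[x_i \mid z_i]^{\top}\beta + g(z_i)

% Subtracting removes g: a regression of residuals on residuals
y_i - \mathbb{E}[y_i \mid z_i]
  = \bigl(x_i - \mathbb{E}[x_i \mid z_i]\bigr)^{\top}\beta + \varepsilon_i
```

In the first step the two conditional expectations are estimated non-parametrically; in the second step the coefficient vector is estimated by least squares on the residualized variables.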

algorithm, gamla, predictive performance, (15 more...)

arXiv.org Machine Learning

2203.11691

Country:

North America > United States (0.14)
Europe > France > Provence-Alpes-Côte d'Azur > Bouches-du-Rhône > Marseille (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
(2 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Banking & Finance > Credit (0.47)
Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)


Evaluating Local Model-Agnostic Explanations of Learning to Rank Models with Decision Paths

Rahnama, Amir Hossein Akhavan, Butepage, Judith

arXiv.org Machine Learning, Mar-16-2022

Local explanations of learning-to-rank (LTR) models are thought to extract the most important features that contribute to the ranking predicted by the LTR model for a single data point. Evaluating the accuracy of such explanations is challenging since the ground truth feature importance scores are not available for most modern LTR models. In this work, we propose a systematic evaluation technique for explanations of LTR models. Instead of using black-box models, such as neural networks, we propose to focus on tree-based LTR models, from which we can extract the ground truth feature importance scores using decision paths. Once extracted, we can directly compare the ground truth feature importance scores to the feature importance scores generated with explanation techniques. We compare two recently proposed explanation techniques for LTR models when using decision trees and gradient boosting models on the MQ2008 dataset. We show that the explanation accuracy of these techniques can vary considerably depending on the explained model and even on which data point is explained.

decision path, explanation, explanation accuracy, (12 more...)

arXiv.org Machine Learning

2203.02295

Genre: Research Report > New Finding (0.94)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)


The Yield Curve as a Recession Leading Indicator. An Application for Gradient Boosting and Random Forest

Delgado, Pedro Cadahia, Congregado, Emilio, Golpe, Antonio A., Vides, José Carlos

arXiv.org Machine Learning, Mar-13-2022

The most representative decision tree ensemble methods are used to examine the variable importance of Treasury term spreads for predicting US economic recessions, while also generating rules for US economic recession detection. A strategy is proposed for training the classifiers with Treasury term spreads data, and the results are compared in order to select the best model for interpretability. We also discuss the use of the SHapley Additive exPlanations (SHAP) framework to understand US recession forecasts by analyzing feature importance. Consistent with the existing literature, we find the most relevant Treasury term spreads for predicting US economic recession and a methodology for detecting relevant rules for economic recession detection. In this case, the most relevant term spread found is 3-month to 6-month, which is proposed to be monitored by economic authorities. Finally, the methodology detected rules with high lift on predicting economic recession that can be used by these entities for this purpose. This latter result stands in contrast to a growing body of literature demonstrating that machine learning methods are useful for interpretation comparing many alternative algorithms, and we discuss the interpretation for our result and propose further research lines aligned with this work.
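The abstract refers to rules with high lift. As a rough sketch of that measure, lift compares the recession rate in the periods where a rule fires with the overall recession rate. The function name and the ten-period sample below are hypothetical and are not the paper's data.

```python
def rule_lift(rule_fires, outcome):
    """Lift of a rule: P(outcome | rule fires) / P(outcome).

    Both arguments are equal-length sequences of 0/1 flags, one per period.
    """
    if len(rule_fires) != len(outcome) or not outcome:
        raise ValueError("inputs must be non-empty and of equal length")
    base_rate = sum(outcome) / len(outcome)
    fired_outcomes = [o for r, o in zip(rule_fires, outcome) if r]
    if not fired_outcomes or base_rate == 0:
        raise ValueError("lift is undefined: rule never fires or outcome never occurs")
    confidence = sum(fired_outcomes) / len(fired_outcomes)
    return confidence / base_rate


# Hypothetical sample: 10 periods, 2 recessions; the rule fires in 4 periods
# and catches both recessions.
rule_fires = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
recession = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
lift = rule_lift(rule_fires, recession)
print(f"lift = {lift:.2f}")
```

A lift above 1 means recessions are more frequent when the rule fires than in the sample as a whole; a lift of 1 means the rule carries no information about the outcome.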

economic recession, interest rate, recession, (16 more...)

arXiv.org Machine Learning

doi: 10.9781/ijimai.2022.02.006

2203.06648

Country:

Europe > Spain > Galicia > Madrid (0.04)
North America > United States > New York (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
(3 more...)

Genre: Research Report > New Finding (0.88)

Industry:

Banking & Finance > Economy (1.00)
Government > Regional Government > North America Government > United States Government (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)


Interpretable machine-learning model with a collaborative game approach to predict yields and higher heating value of torrefied biomass

#artificialintelligence, Mar-12-2022

Machine learning models developed to predict energy properties of torrefied biomass. Collaborative game theory adopted to aid interpretability of key variables in torrefaction. Gradient boosting offered the highest prediction accuracy with 22-feature input. Novel framework to explain local and global effects of each feature on torrefaction. Torrefaction is a treatment process for converting biomass to high-quality solid fuels.

artificial intelligence, machine learning, torrefaction, (6 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.45)


Explainable Machine Learning for Predicting Homicide Clearance in the United States

Campedelli, Gian Maria

arXiv.org Machine Learning, Mar-9-2022

Purpose: To explore the potential of Explainable Machine Learning in the prediction and detection of drivers of cleared homicides at the national and state levels in the United States. Methods: First, nine algorithmic approaches are compared to assess the best performance in predicting cleared homicides country-wise, using data from the Murder Accountability Project. The most accurate algorithm among all (XGBoost) is then used for predicting clearance outcomes state-wise. Second, SHAP, a framework for Explainable Artificial Intelligence, is employed to capture the most important features in explaining clearance patterns both at the national and state levels. Results: At the national level, XGBoost achieves the best performance overall. Substantial predictive variability is detected state-wise. In terms of explainability, SHAP highlights the relevance of several features in consistently predicting investigation outcomes. These include homicide circumstances, weapons, victims' sex and race, as well as the number of involved offenders and victims. Conclusions: Explainable Machine Learning proves to be a helpful framework for predicting homicide clearance. SHAP outcomes suggest a more organic integration of the two theoretical perspectives that have emerged in the literature. Furthermore, jurisdictional heterogeneity highlights the importance of developing ad hoc state-level strategies to improve police performance in clearing homicides.
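SHAP attributions are grounded in Shapley values from cooperative game theory. The sketch below computes exact Shapley values for a tiny toy game by averaging marginal contributions over all orderings. It is not the SHAP library's API, which relies on faster model-specific or approximate algorithms, and the player names and value function are hypothetical.

```python
from itertools import permutations


def shapley_values(players, value):
    """Exact Shapley values by enumerating every ordering of the players.

    `value` maps a frozenset of players to a number. The cost grows
    factorially, so this is only feasible for a handful of players.
    """
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            # Marginal contribution of p when joining the current coalition.
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    return {p: total / len(orderings) for p, total in totals.items()}


def toy_value(coalition):
    """Hypothetical payoff: 'a' adds 2, 'b' adds 1, 'a' and 'b' together add 1 more."""
    score = 0.0
    if "a" in coalition:
        score += 2.0
    if "b" in coalition:
        score += 1.0
    if "a" in coalition and "b" in coalition:
        score += 1.0
    return score


phi = shapley_values(["a", "b", "c"], toy_value)
print(phi)
```

The interaction bonus is split equally between `a` and `b`, the irrelevant player `c` receives zero, and the attributions sum to the payoff of the full coalition. These properties are what make Shapley-based attributions appealing for ranking feature importance.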

artificial intelligence, homicide, machine learning, (15 more...)

arXiv.org Machine Learning

doi: 10.1016/j.jcrimjus.2022.101898

2203.04768

Country:

North America > United States > Michigan (0.04)
Europe > Portugal > Braga > Braga (0.04)
North America > United States > North Dakota (0.04)
(28 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Law > Criminal Law (1.00)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Government > Regional Government > North America Government > United States Government (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.88)
