AITopics

2309.14307

Country: South America > Brazil > Pernambuco (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.96)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

arXiv.org Artificial Intelligence, Dec-23-2022

The choice of scaling technique matters for classification performance

de Amorim, Lucas B. V., Cavalcanti, George D. C., Cruz, Rafael M. O.

Dataset scaling, also known as normalization, is an essential preprocessing step in a machine learning pipeline. It aims to adjust attribute scales so that they all vary within the same range. This transformation is known to improve the performance of classification models, but there are several scaling techniques to choose from, and this choice is generally not made carefully. In this paper, we run a broad experiment comparing the impact of 5 scaling techniques on the performance of 20 classification algorithms, among monolithic and ensemble models, applying them to 82 publicly available datasets with varying imbalance ratios. Results show that the choice of scaling technique matters for classification performance, and the performance difference between the best and the worst scaling technique is relevant and statistically significant in most cases. They also indicate that choosing an inadequate technique can be more detrimental to classification performance than not scaling the data at all. We also show how the performance variation of an ensemble model, considering different scaling techniques, tends to be dictated by that of its base model. Finally, we discuss the relationship between a model's sensitivity to the choice of scaling technique and its performance and provide insights into its applicability in different model deployment scenarios. Full results and source code for the experiments in this paper are available in a GitHub repository (https://github.com/amorimlb/scaling_matters).
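
As a rough illustration of what "adjusting attribute scales" means, the sketch below implements two widely used scalers, min-max scaling and standardization (z-score), in plain Python. These two are chosen only because they are common; the abstract does not name the five techniques compared, and the toy attributes `age` and `income` are hypothetical.

```python
from statistics import mean, pstdev


def min_max_scale(column):
    """Rescale values linearly to the [0, 1] range."""
    lo, hi = min(column), max(column)
    span = hi - lo
    # A constant column has no spread; map it to zeros to avoid dividing by zero.
    return [0.0 if span == 0 else (v - lo) / span for v in column]


def standard_scale(column):
    """Center to zero mean and scale to unit (population) standard deviation."""
    mu, sigma = mean(column), pstdev(column)
    return [0.0 if sigma == 0 else (v - mu) / sigma for v in column]


# Hypothetical toy attributes measured on very different scales.
age = [20, 30, 40, 50]
income = [20000, 30000, 40000, 100000]

scaled_age = min_max_scale(age)        # both attributes now lie in [0, 1]
scaled_income = min_max_scale(income)
z_income = standard_scale(income)      # zero mean, unit standard deviation
```

Note how the single large income value compresses the other min-max scaled incomes toward zero, which is one intuition for why different scalers can lead to different classifier behavior.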

artificial intelligence, dataset, machine learning, (19 more...)

doi: 10.1016/j.asoc.2022.109924

2212.12343

Country: North America > United States (0.67)

Genre: Research Report > New Finding (1.00)

Industry:

Education (0.67)
Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
(3 more...)

arXiv.org Artificial Intelligence, Sep-13-2021

Impact of lung segmentation on the diagnosis and explanation of COVID-19 in chest X-ray images

Teixeira, Lucas O., Pereira, Rodolfo M., Bertolini, Diego, Oliveira, Luiz S., Nanni, Loris, Cavalcanti, George D. C., Costa, Yandre M. G.

COVID-19 frequently provokes pneumonia, which can be diagnosed using imaging exams. Chest X-ray (CXR) is often useful because it is cheap, fast, widespread, and uses less radiation. Here, we demonstrate the impact of lung segmentation on COVID-19 identification using CXR images and evaluate which contents of the image influenced the results the most. Semantic segmentation was performed using a U-Net CNN architecture, and classification was performed using three CNN architectures (VGG, ResNet, and Inception). Explainable Artificial Intelligence techniques were employed to estimate the impact of segmentation. A three-class database was composed: lung opacity (pneumonia), COVID-19, and normal. We assessed the impact of creating a CXR image database from different sources, and the COVID-19 generalization from one source to another. The segmentation achieved a Jaccard distance of 0.034 and a Dice coefficient of 0.982. The classification using segmented images achieved an F1-Score of 0.88 for the multi-class setup, and 0.83 for COVID-19 identification. In the cross-dataset scenario, we obtained an F1-Score of 0.74 and an area under the ROC curve of 0.9 for COVID-19 identification using segmented images. Experiments support the conclusion that even after segmentation, there is a strong bias introduced by underlying factors from different sources.
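
For readers unfamiliar with the two segmentation metrics quoted above, here is a minimal sketch of how the Jaccard index (and its complement, the Jaccard distance) and the Dice coefficient are computed for one predicted mask against one ground-truth mask. Masks are represented as sets of foreground pixel indices, which is a simplification chosen for this example; the toy masks are hypothetical.

```python
def jaccard_index(pred, truth):
    """Intersection over union of two sets of foreground pixels."""
    pred, truth = set(pred), set(truth)
    union = pred | truth
    if not union:
        return 1.0  # two empty masks are treated as identical
    return len(pred & truth) / len(union)


def dice_coefficient(pred, truth):
    """Twice the overlap divided by the total size of both masks."""
    pred, truth = set(pred), set(truth)
    total = len(pred) + len(truth)
    if total == 0:
        return 1.0
    return 2 * len(pred & truth) / total


# Hypothetical masks: pixel indices labeled as lung.
predicted = {1, 2, 3, 4}
reference = {2, 3, 4, 5}

j = jaccard_index(predicted, reference)      # 3 shared / 5 in union
jaccard_distance = 1.0 - j
d = dice_coefficient(predicted, reference)   # 2 * 3 / (4 + 4)
```

For a single pair of masks the two metrics are tied by D = 2J / (1 + J). Values reported over a dataset are typically averaged per image, so that identity need not hold exactly for aggregate figures.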

artificial intelligence, machine learning, natural language, (24 more...)

2009.0978

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.68)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.66)

arXiv.org Machine Learning, Apr-9-2019

Evaluating Competence Measures for Dynamic Regressor Selection

Moura, Thiago J. M., Cavalcanti, George D. C., Oliveira, Luiz S.

Dynamic regressor selection (DRS) systems work by selecting the most competent regressors from an ensemble to estimate the target value of a given test pattern. This competence is usually quantified using the performance of the regressors in local regions of the feature space around the test pattern. However, choosing the best measure to calculate the level of competence correctly is not straightforward. The literature of dynamic classifier selection presents a wide variety of competence measures, which cannot be used or adapted for DRS. In this paper, we review eight measures used with regression problems, and adapt them to test the performance of the DRS algorithms found in the literature. Such measures are extracted from a local region of the feature space around the test pattern, called the region of competence, hence the name competence measures. To better compare the competence measures, we perform a set of comprehensive experiments on 15 regression datasets. Three DRS systems were compared against individual regressors and static systems that use the Mean and the Median to combine the outputs of the regressors from the ensemble. The DRS systems were assessed by varying the competence measures. Our results show that DRS systems outperform individual regressors and static systems, but the choice of the competence measure is problem-dependent.
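
The selection mechanism described above can be sketched in a few lines. The example below assumes one-dimensional inputs, a k-nearest-neighbor region of competence, and local mean absolute error as the competence measure. The abstract does not list its eight measures, so this particular choice is an assumption, and the regressors `linear` and `constant` are hypothetical stand-ins for a trained ensemble.

```python
def region_of_competence(x, validation, k):
    """Return the k validation pairs (xi, yi) whose inputs are closest to x."""
    return sorted(validation, key=lambda pair: abs(pair[0] - x))[:k]


def local_mae(regressor, region):
    """Mean absolute error of a regressor inside the region (lower = more competent)."""
    return sum(abs(regressor(xi) - yi) for xi, yi in region) / len(region)


def select_regressor(x, ensemble, validation, k=3):
    """Pick the regressor with the lowest local error around the test pattern x."""
    region = region_of_competence(x, validation, k)
    return min(ensemble, key=lambda r: local_mae(r, region))


# Hypothetical ensemble: each member is accurate in a different part of the space.
def linear(x):
    return 2.0 * x


def constant(x):
    return 10.0


validation = [(0.0, 0.0), (1.0, 2.0), (2.0, 4.0),
              (8.0, 10.0), (9.0, 10.0), (10.0, 10.0)]
ensemble = [linear, constant]

low = select_regressor(1.5, ensemble, validation)    # neighbors lie on the line
high = select_regressor(9.5, ensemble, validation)   # neighbors lie on the plateau
```

A static combiner such as the mean of both outputs would be wrong in both regions here, which is the situation where dynamic selection is expected to help.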

artificial intelligence, regressor, survey article, (18 more...)

1904.04645

Country: South America > Brazil (0.68)

Genre: Research Report > New Finding (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

arXiv.org Machine Learning, Nov-28-2018

ICPRAI 2018 SI: On dynamic ensemble selection and data preprocessing for multi-class imbalance learning

Cruz, Rafael M. O., Souza, Mariana A., Sabourin, Robert, Cavalcanti, George D. C.

Class imbalance refers to classification problems in which many more instances are available for certain classes than for others. Such imbalanced datasets require special attention because traditional classifiers generally favor the majority class, which has a large number of instances. Ensembles of classifiers have been reported to yield promising results. However, the majority of ensemble methods applied to imbalanced learning are static ones. Moreover, they only deal with binary imbalanced problems. Hence, this paper presents an empirical analysis of dynamic selection techniques and data preprocessing methods for dealing with multi-class imbalanced problems. We considered five variations of preprocessing methods and fourteen dynamic selection schemes. Our experiments conducted on 26 multi-class imbalanced problems show that the dynamic ensemble improves the AUC and the G-mean as compared to the static ensemble. Moreover, data preprocessing plays an important role in such cases.
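
To show why a metric such as the G-mean is preferred over plain accuracy under imbalance, the sketch below computes the G-mean as the geometric mean of per-class recalls. This is a common multi-class formulation; whether the paper uses exactly this variant is an assumption, and the labels are hypothetical toy data.

```python
from math import prod


def per_class_recall(y_true, y_pred):
    """Fraction of each class's instances that were predicted correctly."""
    recalls = {}
    for c in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == c]
        recalls[c] = sum(1 for i in idx if y_pred[i] == c) / len(idx)
    return recalls


def g_mean(y_true, y_pred):
    """Geometric mean of per-class recalls; zero if any class is never recognized."""
    recalls = list(per_class_recall(y_true, y_pred).values())
    return prod(recalls) ** (1.0 / len(recalls))


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical imbalanced problem: 8 majority instances, 2 minority instances.
y_true = [0] * 8 + [1] * 2
always_majority = [0] * 10          # ignores the minority class entirely
one_minority_hit = [0] * 8 + [1, 0]

acc_trivial = accuracy(y_true, always_majority)   # looks acceptable
gm_trivial = g_mean(y_true, always_majority)      # exposes the failure
gm_better = g_mean(y_true, one_minority_hit)
```

The trivial majority-class predictor scores 0.8 accuracy but a G-mean of zero, because its recall on the minority class is zero.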

classifier, health & medicine, survey article, (16 more...)

1811.10481

Country:

North America > Canada (0.28)
South America > Brazil > Pernambuco (0.14)

Genre: Research Report (0.64)

Industry: Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.46)

arXiv.org Machine Learning, Nov-1-2018

META-DES.Oracle: Meta-learning and feature selection for ensemble selection

Cruz, Rafael M. O., Sabourin, Robert, Cavalcanti, George D. C.

The key issue in Dynamic Ensemble Selection (DES) is defining a suitable criterion for calculating the classifiers' competence. There are several criteria available to measure the level of competence of base classifiers, such as local accuracy estimates and ranking. However, using only one criterion may lead to a poor estimation of the classifier's competence. In order to deal with this issue, we have proposed a novel dynamic ensemble selection framework using meta-learning, called META-DES. An important aspect of the META-DES framework is that multiple criteria can be embedded in the system encoded as different sets of meta-features. However, some DES criteria are not suitable for every classification problem. For instance, local accuracy estimates may produce poor results when there is a high degree of overlap between the classes. Moreover, a higher classification accuracy can be obtained if the performance of the meta-classifier is optimized for the corresponding data. In this paper, we propose a novel version of the META-DES framework based on the formal definition of the Oracle, called META-DES.Oracle. The Oracle is an abstract method that represents an ideal classifier selection scheme. A meta-feature selection scheme using an overfitting cautious Binary Particle Swarm Optimization (BPSO) is proposed for improving the performance of the meta-classifier. The difference between the outputs obtained by the meta-classifier and those presented by the Oracle is minimized. Thus, the meta-classifier is expected to obtain results that are similar to the Oracle. Experiments carried out using 30 classification problems demonstrate that the optimization procedure based on the Oracle definition leads to a significant improvement in classification accuracy when compared to previous versions of the META-DES framework and other state-of-the-art DES techniques.
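
The Oracle mentioned above is usually defined in the ensemble literature as an idealized selector that is correct whenever at least one classifier in the pool predicts the right label. The sketch below computes that upper bound for a hypothetical two-classifier pool, and also derives per-classifier correctness flags of the kind that could serve as targets for a meta-classifier. Using them as training targets here is an assumption for illustration, not a description of the paper's exact procedure.

```python
def oracle_accuracy(pool_predictions, y_true):
    """Fraction of samples for which at least one pool member is correct."""
    hits = sum(
        1
        for j, target in enumerate(y_true)
        if any(preds[j] == target for preds in pool_predictions)
    )
    return hits / len(y_true)


def oracle_targets(pool_predictions, y_true):
    """For each classifier, a 0/1 flag per sample: 1 where its prediction is correct."""
    return [
        [1 if p == t else 0 for p, t in zip(preds, y_true)]
        for preds in pool_predictions
    ]


def accuracy(preds, y_true):
    return sum(p == t for p, t in zip(preds, y_true)) / len(y_true)


# Hypothetical labels and predictions from a pool of two weak classifiers.
y_true = [0, 1, 1, 0]
pool = [
    [0, 0, 1, 1],   # classifier 1: right on samples 0 and 2
    [1, 1, 0, 1],   # classifier 2: right on sample 1 only
]

upper_bound = oracle_accuracy(pool, y_true)
targets = oracle_targets(pool, y_true)
```

Here the individual accuracies are 0.5 and 0.25, while the Oracle reaches 0.75, which shows how much headroom a good selection scheme has over any single member of the pool.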

artificial intelligence, classifier, evolutionary algorithm, (18 more...)

doi: 10.1016/j.inffus.2017.02.010

1811.00217

Country:

North America > Canada > Quebec (0.14)
South America > Brazil > Pernambuco (0.14)
North America > United States > New Jersey > Hudson County (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)

arXiv.org Artificial Intelligence, Nov-1-2018

On Meta-Learning for Dynamic Ensemble Selection

Cruz, Rafael M. O., Sabourin, Robert, Cavalcanti, George D. C.

In this paper, we propose a novel dynamic ensemble selection framework using meta-learning. The framework is divided into three steps. In the first step, the pool of classifiers is generated from the training data. The second step is responsible for extracting the meta-features and training the meta-classifier. Five distinct sets of meta-features are proposed, each one corresponding to a different criterion to measure the level of competence of a classifier for the classification of a given query sample. The meta-features are computed using the training data and used to train a meta-classifier that is able to predict whether or not a base classifier from the pool is competent enough to classify an input instance. Three different scenarios for training the meta-classifier are considered: problem-dependent, problem-independent and hybrid. Experimental results show that the problem-dependent scenario provides the best result. In addition, the performance of the problem-dependent scenario is strongly correlated with the recognition rate of the system. A comparison with state-of-the-art techniques shows that the proposed problem-dependent approach outperforms current dynamic ensemble selection techniques.

artificial intelligence, classifier, machine learning, (16 more...)

doi: 10.1109/ICPR.2014.221

1811.01743

Country: North America > Canada > Quebec (0.14)

Genre: Research Report > New Finding (0.88)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

arXiv.org Machine Learning, Nov-1-2018

Analyzing different prototype selection techniques for dynamic classifier and ensemble selection

Cruz, Rafael M. O., Sabourin, Robert, Cavalcanti, George D. C.

In dynamic selection (DS) techniques, only the most competent classifiers for the classification of a specific test sample are selected to predict the sample's class label. The most important step in DS techniques is estimating the competence of the base classifiers for the classification of each specific test sample. The classifiers' competence is usually estimated using the neighborhood of the test sample defined on the validation samples, called the region of competence. Thus, the performance of DS techniques is sensitive to the distribution of the validation set. In this paper, we evaluate six prototype selection techniques that work by editing the validation data in order to remove noise and redundant instances. Experiments conducted using several state-of-the-art DS techniques over 30 classification problems demonstrate that by using prototype selection techniques we can improve the classification accuracy of DS techniques and also significantly reduce the computational cost involved. Multiple Classifier Systems (MCS) aim to combine classifiers in order to increase the recognition accuracy in pattern recognition systems [1], [2]. MCS are composed of three phases [3]: (1) Generation, (2) Selection, and (3) Integration.
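
As an illustration of what "editing the validation data" can look like, the sketch below implements the classic Edited Nearest Neighbor rule: a sample is dropped when the majority label of its k nearest neighbors disagrees with its own label. ENN is used here only as a well-known example of a prototype selection technique; the abstract does not name the six techniques evaluated, and the one-dimensional toy data are hypothetical.

```python
from collections import Counter


def knn_label(index, data, k):
    """Majority label among the k nearest neighbors of data[index], excluding itself."""
    x, _ = data[index]
    others = [(abs(xj - x), yj) for j, (xj, yj) in enumerate(data) if j != index]
    others.sort(key=lambda pair: pair[0])
    votes = Counter(label for _, label in others[:k])
    return votes.most_common(1)[0][0]


def edited_nearest_neighbors(data, k=3):
    """Keep only the samples whose label agrees with their neighborhood."""
    return [pair for i, pair in enumerate(data) if knn_label(i, data, k) == pair[1]]


# Hypothetical validation set: two clean clusters plus one noisy 'b' inside the 'a' cluster.
validation = [
    (0.0, "a"), (0.1, "a"), (0.2, "a"),
    (0.15, "b"),                       # likely label noise
    (1.0, "b"), (1.1, "b"), (1.2, "b"),
]

edited = edited_nearest_neighbors(validation, k=3)
```

After editing, a region of competence built around a query near 0.15 contains only samples of class "a", so competence estimates are no longer distorted by the noisy instance.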

artificial intelligence, health & medicine, selection technique, (17 more...)

doi: 10.1109/IJCNN.2017.7966355

1811.00677

Country:

North America > United States (0.28)
North America > Canada > Quebec (0.14)

Genre:

Research Report > New Finding (0.46)
Research Report > Experimental Study (0.46)

Industry: Health & Medicine (0.70)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)

arXiv.org Machine Learning, Nov-1-2018

A Method For Dynamic Ensemble Selection Based on a Filter and an Adaptive Distance to Improve the Quality of the Regions of Competence

Cruz, Rafael M. O., Cavalcanti, George D. C., Ren, Tsang Ing

Dynamic classifier selection systems aim to select a group of classifiers that is most adequate for a specific query pattern. This is done by defining a region around the query pattern and analyzing the competence of the classifiers in this region. However, the regions are often surrounded by noise, which can hinder classifier selection. This makes the performance of most dynamic selection systems no better than that of static selection. In this paper, we demonstrate that the performance of dynamic selection systems ends up limited by the quality of the regions extracted. We then propose a new dynamic classifier selection system that improves the regions of competence in order to achieve higher recognition rates. Results obtained from several classification databases show that the proposed method not only significantly increases the recognition performance, but also decreases the computational cost. Multiple Classifier Systems/Ensemble of Classifiers have been widely studied in recent years as an alternative to increase efficiency and accuracy in pattern recognition problems [1], [2].

artificial intelligence, classifier, machine learning, (17 more...)

doi: 10.1109/IJCNN.2011.6033350

1811.00669

Country: South America > Brazil > Pernambuco (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.98)

arXiv.org Artificial Intelligence, Nov-1-2018

META-DES.H: a dynamic ensemble selection technique using meta-learning and a dynamic weighting approach

Cruz, Rafael M. O., Sabourin, Robert, Cavalcanti, George D. C.

In Dynamic Ensemble Selection (DES) techniques, only the most competent classifiers are selected to classify a given query sample. Hence, the key issue in DES is how to estimate the competence of each classifier in a pool to select the most competent ones. In order to deal with this issue, we proposed a novel dynamic ensemble selection framework using meta-learning, called META-DES. The framework is divided into three steps. In the first step, the pool of classifiers is generated from the training data. In the second phase the meta-features are computed using the training data and used to train a meta-classifier that is able to predict whether or not a base classifier from the pool is competent enough to classify an input instance. In this paper, we propose improvements to the training and generalization phase of the META-DES framework. In the training phase, we evaluate four different algorithms for the training of the meta-classifier. For the generalization phase, three combination approaches are evaluated: Dynamic selection, where only the classifiers that attain a certain competence level are selected; Dynamic weighting, where the meta-classifier estimates the competence of each classifier in the pool, and the outputs of all classifiers in the pool are weighted based on their level of competence; and a hybrid approach, in which first an ensemble with the most competent classifiers is selected, after which the weights of the selected classifiers are estimated in order to be used in a weighted majority voting scheme. Experiments are carried out on 30 classification datasets. Experimental results demonstrate that the changes proposed in this paper significantly improve the recognition accuracy of the system in several datasets.
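
The three combination approaches described above can be contrasted with a small sketch. It assumes that a competence score in [0, 1] has already been estimated for each classifier (in META-DES this would come from the meta-classifier), that selection uses a fixed threshold of 0.5, and that all classifiers are used as a fallback when none passes the threshold. The threshold value, the fallback, and the toy scores are all assumptions made for this example.

```python
from collections import defaultdict


def weighted_vote(predictions, weights):
    """Return the label with the largest total weight."""
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    return max(totals, key=totals.get)


def select(predictions, competences, threshold):
    """Keep (prediction, competence) pairs at or above the threshold; fall back to all."""
    kept = [(p, c) for p, c in zip(predictions, competences) if c >= threshold]
    return kept or list(zip(predictions, competences))


def dynamic_selection(predictions, competences, threshold=0.5):
    """Majority vote among the classifiers judged competent."""
    kept = select(predictions, competences, threshold)
    return weighted_vote([p for p, _ in kept], [1.0] * len(kept))


def dynamic_weighting(predictions, competences):
    """All classifiers vote, each weighted by its estimated competence."""
    return weighted_vote(predictions, competences)


def hybrid(predictions, competences, threshold=0.5):
    """Select the competent classifiers first, then weight their votes."""
    kept = select(predictions, competences, threshold)
    return weighted_vote([p for p, _ in kept], [c for _, c in kept])


# Hypothetical outputs of five base classifiers for one query sample.
predictions = ["a", "b", "b", "a", "a"]
competences = [0.9, 0.6, 0.6, 0.4, 0.4]

by_selection = dynamic_selection(predictions, competences)
by_weighting = dynamic_weighting(predictions, competences)
by_hybrid = hybrid(predictions, competences)
```

In this toy case selection and the hybrid scheme both answer "b", while pure weighting answers "a" because the two low-competence classifiers still contribute their weight. This illustrates that the three schemes can disagree on the same query.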

artificial intelligence, classifier, neural network, (19 more...)

1811.01742

Country: North America > Canada > Quebec (0.14)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.31)