AITopics | Decision Tree Learning

Collaborating Authors

Decision Tree Learning

Learning to Classify with Branching Tests: "A decision tree takes as input an object or situation described by a set of properties, and outputs a yes/no decision. Decision trees therefore represent Boolean functions. Functions with a larger range of outputs can also be represented...."
– Artificial Intelligence: A Modern Approach. By Stuart Russell & Peter Norvig. 2002. Section 18.3; page 531.

News Overviews Instructional Materials AI-Alerts Classics

AI-Driven anemia diagnosis: A review of advanced models and techniques

Mahmud, Abdullah Al, Chowdhury, Prangon, Uddin, Mohammed Borhan, Delowar, Khaled Eabne, Talha, Tausifur Rahman, Dewanjee, Bijoy

arXiv.org Artificial IntelligenceOct-14-2025

Anemia, a condition marked by insufficient levels of red blood cells or hemoglobin, remains a widespread health issue affecting millions of individuals globally. Accurate and timely diagnosis is essential for effective management and treatment of anemia. In recent years, there has been a growing interest in the use of artificial intelligence techniques, i.e., machine learning (ML) and deep learning (DL) for the detection, classification, and diagnosis of anemia. This paper provides a systematic review of the recent advancements in this field, with a focus on various models applied to anemia detection. The review also compares these models based on several performance metrics, including accuracy, sensitivity, specificity, and precision. By analyzing these metrics, the paper evaluates the strengths and limitation of discussed models in detecting and classifying anemia, emphasizing the importance of addressing these factors to improve diagnostic accuracy.

artificial intelligence, fuzzy logic, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2510.1138

Country:

Africa (0.93)
Asia > India (0.28)
North America > United States (0.28)
Asia > Middle East (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview (1.00)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Therapeutic Area > Hematology (1.00)
Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (0.95)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(3 more...)

Add feedback

Detecting Malicious Pilot Contamination in Multiuser Massive MIMO Using Decision Trees

da Cruz, Pedro Ivo, Silva, Dimitri, Spadini, Tito, Suyama, Ricardo, Loiola, Murilo Bellezoni

arXiv.org Artificial IntelligenceOct-14-2025

Massive multiple-input multiple-output (MMIMO) is essential to modern wireless communication systems, like 5G and 6G, but it is vulnerable to active eavesdropping attacks. One type of such attack is the pilot contamination attack (PCA), where a malicious user copies pilot signals from an authentic user during uplink, intentionally interfering with the base station's (BS) channel estimation accuracy. In this work, we propose to use a Decision Tree (DT) algorithm for PCA detection at the BS in a multi-user system. We present a methodology to generate training data for the DT classifier and select the best DT according to their depth. Then, we simulate different scenarios that could be encountered in practice and compare the DT to a classical technique based on likelihood ratio testing (LRT) submitted to the same scenarios. The results revealed that a DT with only one level of depth is sufficient to outperform the LRT. The DT shows a good performance regarding the probability of detection in noisy scenarios and when the malicious user transmits with low power, in which case the LRT fails to detect the PCA. We also show that the reason for the good performance of the DT is its ability to compute a threshold that separates PCA data from non-PCA data better than the LRT's threshold. Moreover, the DT does not necessitate prior knowledge of noise power or assumptions regarding the signal power of malicious users, prerequisites typically essential for LRT and other hypothesis testing methodologies.

artificial intelligence, detection, machine learning, (19 more...)

arXiv.org Artificial Intelligence

doi: 10.1007/s11235-024-01163-0

2510.03831

Genre: Research Report (0.50)

Industry:

Information Technology > Security & Privacy (0.93)
Telecommunications (0.66)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.84)
Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.70)
(2 more...)

Add feedback

"A 6 or a 9?": Ensemble Learning Through the Multiplicity of Performant Models and Explanations

Zuin, Gianlucca, Veloso, Adriano

arXiv.org Artificial IntelligenceOct-14-2025

Creating models from past observations and ensuring their effectiveness on new data is the essence of machine learning. However, selecting models that generalize well remains a challenging task. Related to this topic, the Rashomon Effect refers to cases where multiple models perform similarly well for a given learning problem. This often occurs in real-world scenarios, like the manufacturing process or medical diagnosis, where diverse patterns in data lead to multiple high-performing solutions. We propose the Rashomon Ensemble, a method that strategically selects models from these diverse high-performing solutions to improve generalization. By grouping models based on both their performance and explanations, we construct ensembles that maximize diversity while maintaining predictive accuracy. This selection ensures that each model covers a distinct region of the solution space, making the ensemble more robust to distribution shifts and variations in unseen data. We validate our approach on both open and proprietary collaborative real-world datasets, demonstrating up to 0.20+ AUROC improvements in scenarios where the Rashomon ratio is large. Additionally, we demonstrate tangible benefits for businesses in various real-world applications, highlighting the robustness, practicality, and effectiveness of our approach.

artificial intelligence, data mining, machine learning, (19 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3767735

2509.09073

Country:

South America > Brazil (1.00)
Europe (1.00)
North America > United States > California (0.67)

Genre:

Research Report > New Finding (1.00)
Research Report > Promising Solution (0.67)

Industry:

Health & Medicine > Therapeutic Area > Pulmonary/Respiratory Diseases (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
(5 more...)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.93)
(2 more...)

Add feedback

Transfer Learning with Distance Covariance for Random Forest: Error Bounds and an EHR Application

Li, Chenze, Paul, Subhadeep

arXiv.org Machine LearningOct-14-2025

Random forest is an important method for ML applications due to its broad outperformance over competing methods for structured tabular data. We propose a method for transfer learning in nonparametric regression using a centered random forest (CRF) with distance covariance-based feature weights, assuming the unknown source and target regression functions are different for a few features (sparsely different). Our method first obtains residuals from predicting the response in the target domain using a source domain-trained CRF. Then, we fit another CRF to the residuals, but with feature splitting probabilities proportional to the sample distance covariance between the features and the residuals in an independent sample. We derive an upper bound on the mean square error rate of the procedure as a function of sample sizes and difference dimension, theoretically demonstrating transfer learning benefits in random forests. In simulations, we show that the results obtained for the CRFs also hold numerically for the standard random forest (SRF) method with data-driven feature split selection. Beyond transfer learning, our results also show the benefit of distance-covariance-based weights on the performance of RF in some situations. Our method shows significant gains in predicting the mortality of ICU patients in smaller-bed target hospitals using a large multi-hospital dataset of electronic health records for 200,000 ICU patients.

artificial intelligence, machine learning, random forest, (18 more...)

arXiv.org Machine Learning

2510.1087

Country: North America > United States (0.67)

Genre: Research Report > New Finding (0.66)

Industry: