AITopics | Decision Tree Learning

Decision Tree Learning

Learning to Classify with Branching Tests: "A decision tree takes as input an object or situation described by a set of properties, and outputs a yes/no decision. Decision trees therefore represent Boolean functions. Functions with a larger range of outputs can also be represented...."
– Artificial Intelligence: A Modern Approach. By Stuart Russell & Peter Norvig. 2002. Section 18.3; page 531.

Provably robust boosted decision stumps and trees against adversarial attacks

Maksym Andriushchenko, Matthias Hein

Neural Information Processing Systems, Oct-2-2025, 15:11:43 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, machine learning, robustness, (19 more...)

Country:

North America > Canada (0.46)
Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)

Genre: Research Report > New Finding (0.46)

Industry:

Information Technology > Security & Privacy (0.50)
Health & Medicine (0.47)
Government > Military (0.41)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Supplementary Material for Classification with Valid and Adaptive Coverage

Yaniv Romano

Neural Information Processing Systems, Oct-2-2025, 11:52:56 GMT

Here, we consider the jackknife+ [...] i.e., Algorithm S1 describes the extension of Algorithm 1 discussed in Section 2.5, which ensures [...] The validity of this algorithm is established by the following result. We begin by proving the lower bound on coverage. This will become apparent after we reduce our claim to the setting in the aforementioned paper. This is easy to verify. Let σ(1), ..., σ(n + m) be the permutation of the data points corresponding to Σ, so that (ΣA Σ [...] S3.1 Implementation details: We have applied the following black-box classification methods to estimate label probabilities: [...] JK+ is omitted for computational reasons. The performances of the different methods on data generated from this model are compared in Figure S3.

artificial intelligence, experiment, machine learning, (15 more...)

Country: North America > United States > California > Los Angeles County > Los Angeles (0.28)

Technology:

Information Technology > Data Science (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

1b9a80606d74d3da6db2f1274557e644-Paper.pdf

Neural Information Processing Systems, Oct-2-2025, 06:48:02 GMT

artificial intelligence, ensemble, machine learning, (15 more...)

Country: North America > United States > California > San Francisco County > San Francisco (0.15)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.73)

1373b284bc381890049e92d324f56de0-Supplemental.pdf

Neural Information Processing Systems, Oct-2-2025, 03:51:36 GMT

artificial intelligence, machine learning, optimization problem, (19 more...)

Country: North America > United States (0.28)

Genre: Research Report (0.93)

Industry: Health & Medicine > Therapeutic Area (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.69)

1373b284bc381890049e92d324f56de0-Paper.pdf

Neural Information Processing Systems, Oct-2-2025, 03:51:29 GMT

artificial intelligence, machine learning, optimization problem, (19 more...)

Country: North America > United States (0.28)

Genre: Research Report (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.95)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.69)

Efficient Non-greedy Optimization of Decision Trees

Mohammad Norouzi, Maxwell Collins, Matthew A. Johnson, David J. Fleet, Pushmeet Kohli

Neural Information Processing Systems, Oct-2-2025, 00:48:22 GMT

Decision trees and randomized forests are widely used in computer vision and machine learning.

decision tree, sgn, split function, (16 more...)

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > Wisconsin > Dane County > Madison (0.04)
Europe > Austria > Salzburg > Salzburg (0.04)
Asia > Middle East > Jordan (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.31)

Interpretable Machine Learning for Life Expectancy Prediction: A Comparative Study of Linear Regression, Decision Tree, and Random Forest

Dolgopolyi, Roman, Amaslidou, Ioanna, Margaritou, Agrippina

arXiv.org Artificial Intelligence, Oct-2-2025

Life expectancy is a fundamental indicator of population health and socio-economic well-being, yet accurately forecasting it remains challenging due to the interplay of demographic, environmental, and healthcare factors. This study evaluates three machine learning models, Linear Regression (LR), Regression Decision Tree (RDT), and Random Forest (RF), using a real-world dataset drawn from World Health Organization (WHO) and United Nations (UN) sources. After extensive preprocessing to address missing values and inconsistencies, each model's performance was assessed with R², Mean Absolute Error (MAE), and Root Mean Squared Error (RMSE). Results show that RF achieves the highest predictive accuracy (R² = 0.9423), significantly outperforming LR and RDT. Interpretability was prioritized through p-values for LR and feature-importance metrics for the tree-based models, revealing immunization rates (diphtheria, measles) and demographic attributes (HIV/AIDS, adult mortality) as critical drivers of life-expectancy predictions. These insights underscore the synergy between ensemble methods and transparency in addressing public-health challenges. Future research should explore advanced imputation strategies, alternative algorithms (e.g., neural networks), and updated data to further refine predictive accuracy and support evidence-based policymaking in global health contexts.
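For readers unfamiliar with the evaluation metrics named in this abstract, the following is a minimal sketch of how they are usually computed. It assumes the abstract's R score is the standard coefficient of determination; the function name `regression_metrics` and the sample values are hypothetical and do not come from the paper.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return the coefficient of determination, MAE, and RMSE for paired values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n               # mean absolute error
    ss_res = sum(e * e for e in errors)                 # residual sum of squares
    rmse = math.sqrt(ss_res / n)                        # root mean squared error
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot                          # undefined if y_true is constant
    return {"r2": r2, "mae": mae, "rmse": rmse}


if __name__ == "__main__":
    # Made-up life-expectancy values (years), for illustration only.
    observed = [71.0, 65.5, 80.2, 58.9]
    predicted = [70.1, 66.0, 79.0, 60.3]
    print(regression_metrics(observed, predicted))
```

A model that always predicts the mean of the observed values scores 0 on the coefficient of determination, and a perfect model scores 1, which is why the measure is convenient for comparing models on the same dataset.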

artificial intelligence, life expectancy, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2510.00542

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.85)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.55)

Mondrian Forests: Efficient Online Random Forests

Neural Information Processing Systems, Sep-30-2025, 08:34:20 GMT

Ensembles of randomized decision trees, usually referred to as random forests, are widely used for classification and regression tasks in machine learning and statistics. Random forests achieve competitive predictive performance and are computationally efficient to train and test, making them excellent candidates for real-world prediction tasks. The most popular random forest variants (such as Breiman's random forest and extremely randomized trees) operate on batches of training data. Online methods are now in greater demand. Existing online random forests, however, require more training data than their batch counterpart to achieve comparable predictive performance. In this work, we use Mondrian processes (Roy and Teh, 2009) to construct ensembles of random decision trees we call Mondrian forests. Mondrian forests can be grown in an incremental/online fashion and remarkably, the distribution of online Mondrian forests is the same as that of batch Mondrian forests. Mondrian forests achieve competitive predictive performance comparable with existing online random forests and periodically re-trained batch random forests, while being more than an order of magnitude faster, thus representing a better computation vs accuracy tradeoff.

efficient online random forest, mondrian forest, name change, (3 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Localized Uncertainty Quantification in Random Forests via Proximities

Rhodes, Jake S., Brown, Scott D., Wilkinson, J. Riley

arXiv.org Machine Learning, Sep-30-2025

In machine learning, uncertainty quantification helps assess the reliability of model predictions, which is important in high-stakes scenarios. Traditional approaches often emphasize predictive accuracy, but there is a growing focus on incorporating uncertainty measures. While current methods often rely on quantile regression or Monte Carlo techniques, we propose a new approach using naturally occurring test sets and similarity measures (proximities) typically viewed as byproducts of random forests. Specifically, we form localized distributions of OOB errors around nearby points, defined using the proximities, to create prediction intervals for regression and trust scores for classification. By varying the number of nearby points, our intervals can be adjusted to achieve the desired coverage while retaining the flexibility that reflects the certainty of individual predictions. For classification, excluding points identified as unclassifiable by our method generally enhances the accuracy of the model and provides higher accuracy-rejection AUC scores than competing methods. Although traditional machine learning models usually provide point estimates, there is growing recognition of the need to incorporate uncertainty to support more informed decisions [1]. By quantifying uncertainty, users can assess the reliability of model outputs and better interpret results, especially for out-of-distribution samples through calibrated confidence estimates.
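The regression idea described in this abstract, building an interval from the out-of-bag (OOB) errors of nearby training points, can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the function `local_interval`, its parameters, the use of signed residuals, and the simple rank-based quantile rule are all hypothetical choices, and the proximities and OOB errors are assumed to have been computed beforehand by a random forest.

```python
import math


def local_interval(prediction, proximities, oob_errors, k, coverage=0.9):
    """Interval around `prediction` from the OOB errors of the k most proximate points.

    proximities[i]: similarity between the query point and training point i
    oob_errors[i]:  signed OOB residual (observed minus OOB prediction) of point i
    """
    # Indices of the k training points most similar to the query point.
    order = sorted(range(len(proximities)), key=lambda i: proximities[i], reverse=True)
    local = sorted(oob_errors[i] for i in order[:k])

    # Rank-based empirical quantiles of the local error distribution
    # (an assumed rule; other quantile definitions would also be reasonable).
    alpha = 1.0 - coverage
    lo_idx = math.floor((alpha / 2) * (len(local) - 1))
    hi_idx = math.ceil((1 - alpha / 2) * (len(local) - 1))
    return prediction + local[lo_idx], prediction + local[hi_idx]


if __name__ == "__main__":
    # Made-up values for illustration only.
    prox = [0.9, 0.1, 0.8, 0.7, 0.0]
    errs = [-1.0, 50.0, 0.5, 2.0, -40.0]
    print(local_interval(10.0, prox, errs, k=3))  # uses only the three closest points
    print(local_interval(10.0, prox, errs, k=5))  # uses all points, so it is wider
```

In this sketch, increasing `k` pulls in less similar points and their errors, which is one way to read the abstract's remark that varying the number of nearby points adjusts the coverage of the interval.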

prediction, prediction interval, proximity, (17 more...)

arXiv.org Machine Learning

2509.22928

Country:

North America > United States > Utah > Utah County > Provo (0.04)
North America > United States > Texas > Brazos County > College Station (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

SHAPoint: Task-Agnostic, Efficient, and Interpretable Point-Based Risk Scoring via Shapley Values

Meirman, Tomer D., Shapira, Bracha, Dagan, Noa, Rokach, Lior S.

arXiv.org Artificial Intelligence, Sep-30-2025

Interpretable risk scores play a vital role in clinical decision support, yet traditional methods for deriving such scores often rely on manual preprocessing, task-specific modeling, and simplified assumptions that limit their flexibility and predictive power. We present SHAPoint, a novel, task-agnostic framework that integrates the predictive accuracy of gradient boosted trees with the interpretability of point-based risk scores. SHAPoint supports classification, regression, and survival tasks, while also inheriting valuable properties from tree-based models, such as native handling of missing data and support for monotonic constraints. Compared to existing frameworks, SHAPoint offers superior flexibility, reduced reliance on manual preprocessing, and faster runtime performance. Empirical results show that SHAPoint produces compact and interpretable scores with predictive performance comparable to state-of-the-art methods, but at a fraction of the runtime, making it a powerful tool for transparent and scalable risk stratification.

artificial intelligence, decision tree learning, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2509.23756

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(3 more...)
