 Decision Tree Learning


Zero-Shot Decision Tree Construction via Large Language Models

arXiv.org Artificial Intelligence

This paper introduces a novel algorithm for constructing decision trees using large language models (LLMs) in a zero-shot manner based on Classification and Regression Trees (CART) principles. Traditional decision tree induction methods rely heavily on labeled data to recursively partition data using criteria such as information gain or the Gini index. In contrast, we propose a method that uses the pre-trained knowledge embedded in LLMs to build decision trees without requiring training data. Our approach leverages LLMs to perform operations essential for decision tree construction, including attribute discretization, probability calculation, and Gini index computation based on the probabilities. We show that these zero-shot decision trees can outperform baseline zero-shot methods and achieve competitive performance compared to supervised data-driven decision trees on tabular datasets. The decision trees constructed via this method provide transparent and interpretable models, addressing data scarcity while preserving interpretability. This work establishes a new baseline in low-data machine learning, offering a principled, knowledge-driven alternative to data-driven tree construction.
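To make the "Gini index computation based on the probabilities" step concrete, here is a minimal sketch of how a CART-style split score could be computed from class probabilities rather than from counted training rows. It is not the paper's implementation: the function names, the input layout, and the assumption that the probabilities are already available as plain numbers (for example, elicited from an LLM) are all illustrative.

```python
from typing import Sequence, Tuple


def gini_impurity(class_probs: Sequence[float]) -> float:
    """Gini impurity 1 - sum_k p_k^2 of a class-probability vector.

    The vector is renormalized first, since estimated probabilities
    (hypothetically, from an LLM) may not sum exactly to 1.
    """
    total = sum(class_probs)
    if total <= 0:
        raise ValueError("class probabilities must have a positive sum")
    normalized = [p / total for p in class_probs]
    return 1.0 - sum(p * p for p in normalized)


def weighted_split_gini(branches: Sequence[Tuple[float, Sequence[float]]]) -> float:
    """Score a candidate split as the branch-weighted average impurity.

    Each branch is (branch_probability, class_probs_within_branch).
    Lower is better, as in standard CART.
    """
    total_weight = sum(weight for weight, _ in branches)
    if total_weight <= 0:
        raise ValueError("branch probabilities must have a positive sum")
    return sum(
        (weight / total_weight) * gini_impurity(probs)
        for weight, probs in branches
    )


if __name__ == "__main__":
    # Hypothetical numbers: a binary attribute splitting a binary target.
    candidate = [(0.4, [0.9, 0.1]), (0.6, [0.2, 0.8])]
    print(weighted_split_gini(candidate))
```

In a CART-style procedure the attribute with the lowest weighted score would be chosen at each node. The only change from the data-driven setting is where the probabilities come from.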


Reviews: Optimal Sparse Decision Trees

Neural Information Processing Systems

Originality: Training of optimal decision trees is clearly a problem that has seen a lot of prior work. A distinguishing feature of this submission is that it focuses on optimal *sparse* decision trees for binary variables, and that the approach seems to be feasible in practice, which is achieved by a combination of analytical bounds that reduce the search space as well as efficient implementation techniques. The work builds upon the CORELS algorithm and its approach to creating optimal decision lists. However, the authors extend this approach to decision trees in a non-trivial manner that adds substantial novelty. Quality: The claims of the paper are very well supported by theoretical analysis as well as experiments.


Reviews: Optimal Sparse Decision Trees

Neural Information Processing Systems

Reviewers are very positive about the paper. The contribution is clear and significant. The paper should clearly be accepted. The authors should take into account all reviewers' comments when preparing the final version of their paper, as promised in their response, in particular the improvements suggested by reviewer 1 (as I agree that the paper is heavy on notation and not totally self-contained).


Review for NeurIPS paper: Joints in Random Forests

Neural Information Processing Systems

While the approach is presented as a general generative model based on DT and RF, the paper fails to show its practical interest beyond handling missing values at test time. The possibility of using the approach for outlier detection is potentially interesting but the experiment in the paper is restricted to a single dataset and does not include any comparison with competitors except Gaussian KDE. Overall, the properties of GeDT and GeRF as general purpose density estimators are not really studied. My feeling is that because the tree partitioning is unchanged with respect to standard discriminative DT and RF, GeDT and GeRF are probably only appropriate in the context of tasks related to target predictions. In other tasks, I don't see why they would perform better than pure PC models or other methods mentioned in the related work section.


Review for NeurIPS paper: Joints in Random Forests

Neural Information Processing Systems

Overall, reviewers found the contribution significantly novel: the authors connect two disjoint domains (decision trees and probabilistic circuits) and demonstrate the effectiveness of their approach on datasets with missing values. Two main concerns remain, even after the rebuttal: (i) it is unclear what advantages the proposed approach has over existing alternatives, and (ii) the effect of the hyper-parameters remains unclear. Consensus after the discussion period was to accept.


Review for NeurIPS paper: Smooth And Consistent Probabilistic Regression Trees

Neural Information Processing Systems

Strengths: In general, I enjoyed reading this paper. The proposed method capitalizes on employing soft (a.k.a. probabilistic) decision trees. However, the PDE of each leaf (or region) is computed using a non-parametric approach. I liked the idea of directly assigning a probability for the point reaching a particular leaf (region) rather than computing it along the path at each internal node by applying a sigmoid-like function, which is commonly done in soft trees. Moreover, the authors did a great job of investigating various aspects of their method.
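For readers unfamiliar with the baseline the reviewer contrasts against, the following is a minimal sketch of the common soft-tree formulation, in which the probability of reaching a leaf is the product of sigmoid gates along the root-to-leaf path. This illustrates the conventional approach only, not the method of the paper under review. The path encoding and all names are assumptions made for the example.

```python
import math
from typing import List, Sequence, Tuple

# One internal node on a path: (weights, bias, go_right).
# This encoding is hypothetical and chosen only for illustration.
PathNode = Tuple[Sequence[float], float, bool]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def path_probability(x: Sequence[float], path: List[PathNode]) -> float:
    """Probability that x reaches the leaf at the end of `path`.

    Each internal node routes x to the right with probability
    sigmoid(w . x + b) and to the left with the complement; the
    leaf probability is the product of the gates along the path.
    """
    prob = 1.0
    for weights, bias, go_right in path:
        p_right = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
        prob *= p_right if go_right else 1.0 - p_right
    return prob


if __name__ == "__main__":
    x = [0.3, -1.2]
    root = ([1.0, 0.5], 0.1)
    left_leaf = [(root[0], root[1], False)]
    right_leaf = [(root[0], root[1], True)]
    # The two leaves of a depth-1 soft tree split the unit of probability.
    print(path_probability(x, left_leaf) + path_probability(x, right_leaf))
```

Assigning the leaf probability directly, as the reviewer describes, would replace this per-node product with a single quantity per leaf.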


Review for NeurIPS paper: Universal guarantees for decision tree induction via a higher-order splitting criterion

Neural Information Processing Systems

Summary and Contributions: This paper considers the problem of learning decision trees. You are given samples from a function f on the Boolean cube that is known to be computed by a size-s decision tree. The goal is to produce a hypothesis h that is also a small decision tree and is close to f. It was known that simply looking at correlations is not a good idea: simple functions like the parity of a few variables would defeat this algorithm. Indeed, I don't think there was any known algorithm that was guaranteed to return a "decision tree" of small size. This paper presents an algorithm of this type.
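The remark about parity can be checked directly. The sketch below, which is illustrative and not taken from the paper, enumerates the Boolean cube and computes the correlation between each single variable and the label under the uniform distribution. For the parity of two or more variables every such correlation is zero, so a splitting rule that only looks at single-variable correlations has nothing to go on. The function names are assumptions made for this example.

```python
from itertools import product
from typing import Callable, Sequence


def parity(bits: Sequence[int]) -> int:
    """XOR of all input bits."""
    return sum(bits) % 2


def correlation_with_label(
    n_vars: int, var_index: int, label_fn: Callable[[Sequence[int]], int]
) -> float:
    """E[(-1)^{x_i} * (-1)^{f(x)}] over the uniform Boolean cube.

    A value of 0 means variable i alone carries no information
    about the label in the correlation sense.
    """
    total = 0
    for bits in product((0, 1), repeat=n_vars):
        total += (1 - 2 * bits[var_index]) * (1 - 2 * label_fn(bits))
    return total / 2 ** n_vars


if __name__ == "__main__":
    n = 3
    for i in range(n):
        print(i, correlation_with_label(n, i, parity))
```

By contrast, a label that simply copies one variable has correlation 1 with that variable, which is the easy case where a correlation-based split succeeds.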


Review for NeurIPS paper: Universal guarantees for decision tree induction via a higher-order splitting criterion

Neural Information Processing Systems

The three reviews agree that the paper develops strong theoretical results regarding an important topic. The techniques are also interesting, and the paper is well written. The main negative aspect in the reviews concerns the practical applicability of the results. Although the authors address this in their reply, the reviewers, after discussion, are not really convinced about the potential for bridging the gap between theory and practice. Regardless of this, the reviewers are clear in their assessment that the work deserves publication purely on the strength of the theoretical contribution.


Review for NeurIPS paper: Towards Convergence Rate Analysis of Random Forests for Classification

Neural Information Processing Systems

Weaknesses:
- The studied algorithms remain quite far from real random forests (no bootstrap sampling, split choices are fully independent of the data, trees are pruned, etc.).
- As in other results in the literature, convergence rates for forests are a by-product of the convergence rate of individual trees (using Lemma 1). The results therefore do not really show the benefit of using forests instead of trees in terms of convergence rate. This should be discussed in the paper, I think.
No real conclusion is drawn from the theoretical results that would help better understand standard RF or suggest modifications to these methods. I think this kind of very technical contribution would be more appropriate for a journal submission than for a conference (given the limited time allotted for reviewing).


Review for NeurIPS paper: Towards Convergence Rate Analysis of Random Forests for Classification

Neural Information Processing Systems

The paper provides finite-sample convergence rates for two simplified variants of random forests. Overall, the contribution is purely theoretical. I personally think that this work sheds new and interesting light on the behavior of a learning algorithm that is used intensively worldwide. This work clearly deserves acceptance as a poster at NeurIPS.