AITopics | Diagnosis

Collaborating Authors

Diagnosis

News Overviews Instructional Materials AI-Alerts Classics

Reviews: Yggdrasil: An Optimized System for Training Deep Decision Trees at Scale

Neural Information Processing SystemsJan-20-2025, 16:46:32 GMT

The paper is well written, and its structure is adapted to the content. Upon reading the paper, one might think that the contribution resides in the vertical splitting of the data over the workers, but the state of the art study presented later on shows that this idea by itself is not new. The novelty comes from associating it with data also distributed vertically, sparse bit vectors for inter-node communications, feature compression with custom data structures and training on compressed data. The paper shows formally and experimentally how the proposed heuristics significantly improve the communication between the nodes and speed up training. The remark that using run-length encoding for the features allows them to hold in the L3 cache, thus decreasing the number of DRAM accesses, doesn't seem to always be true. The paper should explain in which conditions this is true (size of the cache, size of the data, number and type of features, etc.).

optimized system, training deep decision tree, yggdrasil, (4 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.40)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.40)

Add feedback

Reviews: A Communication-Efficient Parallel Algorithm for Decision Tree

Neural Information Processing SystemsJan-20-2025, 06:18:01 GMT

Given the popularity of decision trees, proposing an efficient parallel implementation of this method is of course very relevant. The proposed parallelization is original with respect to existing methods and it should indeed lead to less communications than other methods. The theoretical analysis is sound and I like the discussion of the impact of the main problem and method parameters that follows from the lower bound provided in theorem 4.1. Experiments are conducted on two very large problems, where, in the limit of the tested settings (see below), PV-tree is clearly shown to outperform other parallel implementations, in terms of both computing times to reach a given accuracy level and communication costs. I nevertheless have two major concerns with the proposed parallelization.

communication-efficient parallel algorithm, parallelization, training data, (12 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.61)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.61)

Add feedback

Learning Causal Models under Independent Changes

Neural Information Processing SystemsJan-20-2025, 01:55:07 GMT

In many scientific applications, we observe a system in different conditions in which its components may change, rather than in isolation. In our work, we are interested in explaining the generating process of such a multi-context system using a finite mixture of causal mechanisms. Recent work shows that this causal model is identifiable from data, but is limited to settings where the sparse mechanism shift hypothesis holds and only a subset of the causal conditionals change. As this assumption is not easily verifiable in practice, we study the more general principle that mechanism shifts are independent, which we formalize using the algorithmic notion of independence. We introduce an approach for causal discovery beyond partially directed graphs using Gaussian Process models, and give conditions under which we provably identify the correct causal model.

independent change, learning causal model

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.40)

Add feedback

Feature Learning for Interpretable, Performant Decision Trees

Neural Information Processing SystemsJan-19-2025, 23:12:41 GMT

Decision trees are regarded for high interpretability arising from their hierarchical partitioning structure built on simple decision rules. However, in practice, this is not realized because axis-aligned partitioning of realistic data results in deep trees, and because ensemble methods are used to mitigate overfitting. Even then, model complexity and performance remain sensitive to transformation of the input, and extensive expert crafting of features from the raw data is common. We propose the first system to alternate sparse feature learning with differentiable decision tree construction to produce small, interpretable trees with good performance. We benchmark against conventional tree-based models and demonstrate several notions of interpretation of a model and its predictions.

feature learning, interpretable, performant decision tree

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.95)

Add feedback

High Precision Causal Model Evaluation with Conditional Randomization

Neural Information Processing SystemsJan-19-2025, 18:11:36 GMT

The gold standard for causal model evaluation involves comparing model predictions with true effects estimated from randomized controlled trials (RCT). However, RCTs are not always feasible or ethical to perform. In contrast, conditionally randomized experiments based on inverse probability weighting (IPW) offer a more realistic approach but may suffer from high estimation variance. To tackle this challenge and enhance causal model evaluation in real-world conditional randomization settings, we introduce a novel low-variance estimator for causal error, dubbed as the pairs estimator. By applying the same IPW estimator to both the model and true experimental effects, our estimator effectively cancels out the variance due to IPW and achieves a smaller asymptotic variance. Empirical studies demonstrate the improved of our estimator, highlighting its potential on achieving near-RCT performance.

conditional randomization, estimator, high precision causal model evaluation, (2 more...)

Neural Information Processing Systems

Genre:

Research Report > Strength High (1.00)
Research Report > Experimental Study (1.00)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.91)

Add feedback

Coresets for Decision Trees of Signals

Neural Information Processing SystemsJan-19-2025, 15:28:26 GMT

A k -decision tree t (or k -tree) is a recursive partition of a matrix (2D-signal) into k\geq 1 block matrices (axis-parallel rectangles, leaves) where each rectangle is assigned a real label. Its regression or classification loss to a given matrix D of N entries (labels) is the sum of squared differences over every label in D and its assigned label by t .Given an error parameter \varepsilon\in(0,1), a (k,\varepsilon) -coreset C of D is a small summarization that provably approximates this loss to \emph{every} such tree, up to a multiplicative factor of 1\pm\varepsilon . In particular, the optimal k -tree of C is a (1 \varepsilon) -approximation to the optimal k -tree of D .We provide the first algorithm that outputs such a (k,\varepsilon) -coreset for \emph{every} such matrix D . The size C of the coreset is polynomial in k\log(N)/\varepsilon, and its construction takes O(Nk) time.This is by forging a link between decision trees from machine learning -- to partition trees in computational geometry. Experimental results on \texttt{sklearn} and \texttt{lightGBM} show that applying our coresets on real-world data-sets boosts the computation time of random forests and their parameter tuning by up to x 10, while keeping similar accuracy. Full open source code is provided.

coreset, decision tree, varepsilon, (4 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.88)

Add feedback

Decision Tree for Locally Private Estimation with Public Data

Neural Information Processing SystemsJan-19-2025, 13:09:44 GMT

We propose conducting locally differentially private (LDP) estimation with the aid of a small amount of public data to enhance the performance of private estimation. Specifically, we introduce an efficient algorithm called Locally differentially Private Decision Tree (LPDT) for LDP regression. We first use the public data to grow a decision tree partition and then fit an estimator according to the partition privately. From a theoretical perspective, we show that LPDT is \varepsilon -LDP and has a mini-max optimal convergence rate under a mild assumption of similarity between public and private data, whereas the lower bound of the convergence rate of LPDT without public data is strictly slower, which implies that the public data helps to improve the convergence rates of LDP estimation. We conduct experiments on both synthetic and real-world data to demonstrate the superior performance of LPDT compared with other state-of-the-art LDP regression methods. Moreover, we show that LPDT remains effective despite considerable disparities between public and private data.

decision tree, private estimation, public data, (3 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.91)

Add feedback

Fair and Optimal Decision Trees: A Dynamic Programming Approach

Neural Information Processing SystemsJan-19-2025, 08:38:44 GMT

Interpretable and fair machine learning models are required for many applications, such as credit assessment and in criminal justice. Decision trees offer this interpretability, especially when they are small. Optimal decision trees are of particular interest because they offer the best performance possible for a given size. However, state-of-the-art algorithms for fair and optimal decision trees have scalability issues, often requiring several hours to find such trees even for small datasets. Previous research has shown that dynamic programming (DP) performs well for optimizing decision trees because it can exploit the tree structure.

constraint, dynamic programming approach, fair and optimal decision tree, (2 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Add feedback

Identification of Partially Observed Linear Causal Models: Graphical Conditions for the Non-Gaussian and Heterogeneous Cases

Neural Information Processing SystemsJan-18-2025, 23:57:39 GMT

In causal discovery, linear non-Gaussian acyclic models (LiNGAMs) have been studied extensively. While the causally sufficient case is well understood, in many real problems the observed variables are not causally related. Rather, they are generated by latent variables, such as confounders and mediators, which may themselves be causally related. Existing results on the identification of the causal structure among the latent variables often require very strong graphical assumptions. In this paper, we consider partially observed linear models with either non-Gaussian or heterogeneous errors.

graphical condition, non-gaussian and heterogeneous case, observed linear causal model, (4 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.45)
Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.40)

Add feedback

On Computing Probabilistic Explanations for Decision Trees

Neural Information Processing SystemsJan-18-2025, 16:09:57 GMT

Formal XAI (explainable AI) is a growing area that focuses on computing explanations with mathematical guarantees for the decisions made by ML models. Inside formal XAI, one of the most studied cases is that of explaining the choices taken by decision trees, as they are traditionally deemed as one of the most interpretable classes of models. Recent work has focused on studying the computation of sufficient reasons, a kind of explanation in which given a decision tree T and an instance x, one explains the decision T(x) by providing a subset y of the features of x such that for any other instance z compatible with y, it holds that T(z) T(x), intuitively meaning that the features in y are already enough to fully justify the classification of x by T . It has been argued, however, that sufficient reasons constitute a restrictive notion of explanation. For such a reason, the community has started to study their probabilistic counterpart, in which one requires that the probability of T(z) T(x) must be at least some value \delta \in (0, 1], where z is a random instance that is compatible with y .

computing probabilistic explanation, decision tree, delta -sufficient-reason, (1 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.94)

Add feedback