optimal tree
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- North America > United States > Illinois (0.04)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
The Double-Edged Nature of the Rashomon Set for Trustworthy Machine Learning
Hsu, Ethan, Chen, Harry, Zhong, Chudi, Semenova, Lesia
Real-world machine learning (ML) pipelines rarely produce a single model; instead, they produce a Rashomon set of many near-optimal ones. We show that this multiplicity reshapes key aspects of trustworthiness. At the individual-model level, sparse interpretable models tend to preserve privacy but are fragile to adversarial attacks. In contrast, the diversity within a large Rashomon set enables reactive robustness: even when an attack breaks one model, others often remain accurate. Rashomon sets are also stable under small distribution shifts. However, this same diversity increases information leakage, as disclosing more near-optimal models provides an attacker with progressively richer views of the training data. Through theoretical analysis and empirical studies of sparse decision trees and linear models, we characterize this robustness-privacy trade-off and highlight the dual role of Rashomon sets as both a resource and a risk for trustworthy ML.
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- North America > United States > Illinois (0.04)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- North America > Canada > British Columbia (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Experiments with Optimal Model Trees
Roselli, Sabino Francesco, Frank, Eibe
Model trees provide an appealing way to perform interpretable machine learning for both classification and regression problems. In contrast to ``classic'' decision trees with constant values in their leaves, model trees can use linear combinations of predictor variables in their leaf nodes to form predictions, which can help achieve higher accuracy and smaller trees. Typical algorithms for learning model trees from training data work in a greedy fashion, growing the tree in a top-down manner by recursively splitting the data into smaller and smaller subsets. Crucially, the selected splits are only locally optimal, potentially rendering the tree overly complex and less accurate than a tree whose structure is globally optimal for the training data. In this paper, we empirically investigate the effect of constructing globally optimal model trees for classification and regression with linear support vector machines at the leaf nodes. To this end, we present mixed-integer linear programming formulations to learn optimal trees, compute such trees for a large collection of benchmark data sets, and compare their performance against greedily grown model trees in terms of interpretability and accuracy. We also compare to classic optimal and greedily grown decision trees, random forests, and support vector machines. Our results show that optimal model trees can achieve competitive accuracy with very small trees. We also investigate the effect on the accuracy of replacing axis-parallel splits with multivariate ones, foregoing interpretability while potentially obtaining greater accuracy.
- Europe > Germany (0.04)
- Oceania > New Zealand > North Island > Waikato > Hamilton (0.04)
- North America > United States > Wisconsin > Dane County > Madison (0.04)
- (5 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
An objective function for order preserving hierarchical clustering
We present a theory and an objective function for similarity-based hierarchical clustering of probabilistic partial orders and directed acyclic graphs (DAGs). Specifically, given elements $x \le y$ in the partial order, and their respective clusters $[x]$ and $[y]$, the theory yields an order relation $\le'$ on the clusters such that $[x]\le'[y]$. The theory provides a concise definition of order-preserving hierarchical clustering, and offers a classification theorem identifying the order-preserving trees (dendrograms). To determine the optimal order-preserving trees, we develop an objective function that frames the problem as a bi-objective optimisation, aiming to satisfy both the order relation and the similarity measure. We prove that the optimal trees under the objective are both order-preserving and exhibit high-quality hierarchical clustering. Since finding an optimal solution is NP-hard, we introduce a polynomial-time approximation algorithm and demonstrate that the method outperforms existing methods for order-preserving hierarchical clustering by a significant margin.
- Europe > Norway > Eastern Norway > Oslo (0.04)
- Asia > Afghanistan > Parwan Province > Charikar (0.04)
- North America > United States > Utah (0.04)
- (8 more...)
Hierarchical Clustering via Local Search
In this paper, we introduce a local search algorithm for hierarchical clustering. For the local step, we consider a tree re-arrangement operation, known as the {\em interchange}, which involves swapping two closely positioned sub-trees within a tree hierarchy. The interchange operation has been previously used in the context of phylogenetic trees. As the objective function for evaluating the resulting hierarchies, we utilize the revenue function proposed by Moseley and Wang (NIPS 2017.) In our main result, we show that any locally optimal tree guarantees a revenue of at least $\frac{n-2}{3}\sum_{i < j}w(i,j)$ where is $n$ the number of objects and $w: [n] \times [n] \rightarrow \mathbb{R}^+$ is the associated similarity function. This finding echoes the previously established bound for the average link algorithm as analyzed by Moseley and Wang. We demonstrate that this alignment is not coincidental, as the average link trees enjoy the property of being locally optimal with respect to the interchange operation. Consequently, our study provides an alternative insight into the average link algorithm and reveals the existence of a broader range of hierarchies with relatively high revenue achievable through a straightforward local search algorithm. Furthermore, we present an implementation of the local search framework, where each local step requires $O(n)$ computation time. Our empirical results indicate that the proposed method, used as post-processing step, can effectively generate a hierarchical clustering with substantial revenue.
- North America > United States > California > Los Angeles County > Long Beach (0.14)
- Asia > Afghanistan > Parwan Province > Charikar (0.04)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- (10 more...)
Optimal Sparse Survival Trees
Zhang, Rui, Xin, Rui, Seltzer, Margo, Rudin, Cynthia
Interpretability is crucial for doctors, hospitals, pharmaceutical companies and biotechnology corporations to analyze and make decisions for high stakes problems that involve human health. Tree-based methods have been widely adopted for \textit{survival analysis} due to their appealing interpretablility and their ability to capture complex relationships. However, most existing methods to produce survival trees rely on heuristic (or greedy) algorithms, which risk producing sub-optimal models. We present a dynamic-programming-with-bounds approach that finds provably-optimal sparse survival tree models, frequently in only a few seconds.
- North America > United States > Massachusetts (0.04)
- North America > Canada > British Columbia (0.04)
- Europe > Spain > Valencian Community > Valencia Province > Valencia (0.04)
- (2 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.92)
- Health & Medicine > Therapeutic Area > Oncology (1.00)
- Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.88)
- (2 more...)
Bayesian CART models for insurance claims frequency
Zhang, Yaojun, Ji, Lanpeng, Aivaliotis, Georgios, Taylor, Charles
Accuracy and interpretability of a (non-life) insurance pricing model are essential qualities to ensure fair and transparent premiums for policy-holders, that reflect their risk. In recent years, the classification and regression trees (CARTs) and their ensembles have gained popularity in the actuarial literature, since they offer good prediction performance and are relatively easily interpretable. In this paper, we introduce Bayesian CART models for insurance pricing, with a particular focus on claims frequency modelling. Additionally to the common Poisson and negative binomial (NB) distributions used for claims frequency, we implement Bayesian CART for the zero-inflated Poisson (ZIP) distribution to address the difficulty arising from the imbalanced insurance claims data. To this end, we introduce a general MCMC algorithm using data augmentation methods for posterior tree exploration. We also introduce the deviance information criterion (DIC) for the tree model selection. The proposed models are able to identify trees which can better classify the policy-holders into risk groups. Some simulations and real insurance data will be discussed to illustrate the applicability of these models.
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)
Optimal Sparse Regression Trees
Zhang, Rui, Xin, Rui, Seltzer, Margo, Rudin, Cynthia
Regression trees are one of the oldest forms of AI models, and their predictions can be made without a calculator, which makes them broadly useful, particularly for high-stakes applications. Within the large literature on regression trees, there has been little effort towards full provable optimization, mainly due to the computational hardness of the problem. This work proposes a dynamic-programming-with-bounds approach to the construction of provably-optimal sparse regression trees. We leverage a novel lower bound based on an optimal solution to the k-Means clustering algorithm in 1-dimension over the set of labels. We are often able to find optimal sparse trees in seconds, even for challenging datasets that involve large numbers of samples and highly-correlated features.
- Asia > South Korea > Seoul > Seoul (0.05)
- North America > United States > Massachusetts (0.04)
- North America > Canada > British Columbia (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.86)