Goto

Collaborating Authors

 hhcart


CART-ELC: Oblique Decision Tree Induction via Exhaustive Search

Laack, Andrew D.

arXiv.org Artificial Intelligence

Oblique decision trees have attracted attention due to their potential for improved classification performance over traditional axis-aligned decision trees. However, methods that rely on exhaustive search to find oblique splits face computational challenges. As a result, they have not been widely explored. We introduce a novel algorithm, Classification and Regression Tree - Exhaustive Linear Combinations (CART-ELC), for inducing oblique decision trees that performs an exhaustive search on a restricted set of hyperplanes. We then investigate the algorithm's computational complexity and its predictive capabilities. Our results demonstrate that CART-ELC consistently achieves competitive performance on small datasets, often yielding statistically significant improvements in classification accuracy relative to existing decision tree induction algorithms, while frequently producing shallower, simpler, and thus more interpretable trees.


HHCART: An Oblique Decision Tree

Wickramarachchi, D. C., Robertson, B. L., Reale, M., Price, C. J., Brown, J.

arXiv.org Machine Learning

Decision trees are a popular technique in statistical data classification. They recursively partition the feature space into disjoint sub-regions until each sub-region becomes homogeneous with respect to a particular class. The basic Classification and Regression Tree (CART) algorithm partitions the feature space using axis parallel splits. When the true decision boundaries are not aligned with the feature axes, this approach can produce a complicated boundary structure. Oblique decision trees use oblique decision boundaries to potentially simplify the boundary structure. The major limitation of this approach is that the tree induction algorithm is computationally expensive. In this article we present a new decision tree algorithm, called HHCART. The method utilizes a series of Householder matrices to reflect the training data at each node during the tree construction. Each reflection is based on the directions of the eigenvectors from each classes' covariance matrix. Considering axis parallel splits in the reflected training data provides an efficient way of finding oblique splits in the unreflected training data. Experimental results show that the accuracy and size of the HHCART trees are comparable with some benchmark methods in the literature. The appealing feature of HHCART is that it can handle both qualitative and quantitative features in the same oblique split.