ExKMC: Expanding Explainable $k$-Means Clustering

Frost, Nave, Moshkovitz, Michal, Rashtchian, Cyrus

Jul-1-2020–arXiv.org Machine Learning

Despite the popularity of explainable AI, there is limited work on effective methods for unsupervised learning. We study algorithms for $k$-means clustering, focusing on a trade-off between explainability and accuracy. Following prior work, we use a small decision tree to partition a dataset into $k$ clusters. This enables us to explain each cluster assignment by a short sequence of single-feature thresholds. While larger trees produce more accurate clusterings, they also require more complex explanations. To allow flexibility, we develop a new explainable $k$-means clustering algorithm, ExKMC, that takes an additional parameter $k' \geq k$ and outputs a decision tree with $k'$ leaves. We use a new surrogate cost to efficiently expand the tree and to label the leaves with one of $k$ clusters. We prove that as $k'$ increases, the surrogate cost is non-increasing, and hence, we trade explainability for accuracy. Empirically, we validate that ExKMC produces a low cost clustering, outperforming both standard decision tree methods and other algorithms for explainable clustering. Implementation of ExKMC available at https://github.com/navefr/ExKMC.

algorithm, artificial intelligence, machine learning, (18 more...)

arXiv.org Machine Learning

Jul-1-2020

arXiv.org PDF

Add feedback

Country:
- North America > United States
  - Pennsylvania > Allegheny County
    - Pittsburgh (0.04)
  - California > San Diego County
    - San Diego (0.04)
- Asia
  - Middle East > Israel
    - Tel Aviv District > Tel Aviv (0.04)
  - Afghanistan > Parwan Province
    - Charikar (0.04)

Genre:
- Workflow (0.68)
- Research Report (0.64)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning (1.00)
  - Machine Learning
    - Statistical Learning > Clustering (1.00)
    - Decision Tree Learning (1.00)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found