Inducing a hierarchy for multi-class classification problems
Helm, Hayden S., Yang, Weiwei, Bharadwaj, Sujeeth, Lytvynets, Kate, Riva, Oriana, White, Christopher, Geisa, Ali, Priebe, Carey E.
In applications where categorical labels follow a natural hierarchy, classification methods that exploit the label structure often outperform those that do not. Unfortunately, most classification datasets do not come pre-equipped with a hierarchical structure, so classical "flat" classifiers must be employed. In this paper, we investigate a class of methods that induce a hierarchy that can similarly improve classification performance over flat classifiers. Each method in the class first clusters the class-conditional distributions and then uses a hierarchical classifier with the induced hierarchy. We demonstrate the effectiveness of the class of methods, both for discovering a latent hierarchy and for improving accuracy, in principled simulation settings and three real data applications.

Machine learning practitioners are often challenged with the task of classifying an object as one of tens or hundreds of classes. To address these problems, algorithms originally designed for binary or small multi-class problems are naively applied and deployed. In some instances the large set of labels comes pre-equipped with a hierarchical structure; that is, some labels are known to be mutually semantically similar to various degrees.
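The two-step recipe described in the abstract (cluster the conditional distributions, then classify with the induced hierarchy) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: each class is summarized by its sample mean, classes are grouped by single-linkage agglomerative clustering, and both the coarse and fine stages use a nearest-centroid rule. The function names `class_means`, `induce_hierarchy`, and `predict` are hypothetical.

```python
import math
from collections import defaultdict


def class_means(X, y):
    """Summarize each class-conditional distribution by its sample mean (an assumption)."""
    rows_by_class = defaultdict(list)
    for x, label in zip(X, y):
        rows_by_class[label].append(x)
    return {
        label: [sum(col) / len(rows) for col in zip(*rows)]
        for label, rows in rows_by_class.items()
    }


def induce_hierarchy(means, n_clusters):
    """Group classes by single-linkage agglomerative clustering of their means."""
    clusters = [[label] for label in sorted(means)]

    def linkage(a, b):
        # Single linkage: distance between the closest pair of class means.
        return min(math.dist(means[p], means[q]) for p in a for q in b)

    while len(clusters) > n_clusters:
        pairs = [(i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))]
        i, j = min(pairs, key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]))
        merged = clusters.pop(j)  # j > i, so index i is unaffected by the pop
        clusters[i] = clusters[i] + merged
    return clusters


def predict(x, means, clusters):
    """Two-stage prediction: choose a cluster first, then a class inside it."""
    def centroid(cluster):
        return [sum(col) / len(cluster) for col in zip(*(means[c] for c in cluster))]

    coarse = min(clusters, key=lambda cl: math.dist(x, centroid(cl)))
    return min(coarse, key=lambda label: math.dist(x, means[label]))


# Toy data (hypothetical): two "animal" classes and two "vehicle" classes.
X = [(0.0, 0.0), (0.0, 1.0), (2.0, 0.0), (2.0, 1.0),
     (10.0, 10.0), (10.0, 11.0), (12.0, 10.0), (12.0, 11.0)]
y = ["cat", "cat", "dog", "dog", "car", "car", "bus", "bus"]

means = class_means(X, y)
hierarchy = induce_hierarchy(means, n_clusters=2)
label = predict((0.2, 0.4), means, hierarchy)
```

On the toy data, the induced hierarchy groups the two nearby pairs of classes, and prediction first routes a point to a group before choosing a label within it. A real pipeline would likely replace the nearest-centroid stages with trained classifiers at each node of the hierarchy.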
Feb-20-2021
- Genre:
- Research Report > New Finding (0.46)
- Industry:
- Health & Medicine > Diagnostic Medicine > Imaging (0.46)
- Technology: