A no-regret generalization of hierarchical softmax to extreme multi-label classification

Marek Wydmuch, Kalina Jasinska, Mikhail Kuznetsov, Róbert Busa-Fekete, Krzysztof Dembczynski

May-26-2025, 08:08:07 GMT–Neural Information Processing Systems

Extreme multi-label classification (XMLC) is the problem of tagging an instance with a small subset of relevant labels chosen from an extremely large pool of possible labels. Large label spaces can be handled efficiently by organizing the labels as a tree, as in the hierarchical softmax (HSM) approach commonly used for multi-class problems. In this paper, we investigate probabilistic label trees (PLTs), which were recently devised for tackling XMLC problems. We show that PLTs are a no-regret multi-label generalization of HSM when precision@k is used as the model evaluation metric. Critically, we prove that the pick-one-label heuristic, a reduction technique from multi-label to multi-class that is routinely used along with HSM, is not consistent in general.
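To make the inconsistency claim concrete, the following is a minimal sketch of how the pick-one-label reduction can disagree with the ranking that is optimal for precision@k. It assumes that precision@k is optimized by ranking labels by their marginal probabilities, and that pick-one-label replaces each multi-label example with one of its positive labels chosen uniformly at random. The toy distribution `dist` and all function names are hypothetical, chosen only for illustration and not taken from the paper.

```python
from fractions import Fraction

# Hypothetical toy conditional distribution P(y | x) over three labels.
# Keys are label vectors, values are their probabilities (assumed values).
dist = {
    (1, 0, 0): Fraction(1, 10),
    (1, 1, 0): Fraction(5, 10),
    (0, 0, 1): Fraction(4, 10),
}

def marginals(dist):
    """Marginal probability of each label: sum of P(y | x) over y with y_j = 1."""
    m = len(next(iter(dist)))
    return [sum(p * y[j] for y, p in dist.items()) for j in range(m)]

def pick_one_label(dist):
    """Multi-class probabilities induced by picking one positive label
    uniformly at random from each label vector (assumed form of the heuristic)."""
    m = len(next(iter(dist)))
    return [
        sum(p * Fraction(y[j], sum(y)) for y, p in dist.items() if sum(y) > 0)
        for j in range(m)
    ]

def top_k(scores, k):
    """Indices of the k highest-scoring labels."""
    return sorted(range(len(scores)), key=lambda j: -scores[j])[:k]

def expected_precision_at_k(dist, predicted, k):
    """Expected fraction of the k predicted labels that are relevant."""
    return sum(p * Fraction(sum(y[j] for j in predicted), k) for y, p in dist.items())

by_marginals = top_k(marginals(dist), 1)
by_pick_one = top_k(pick_one_label(dist), 1)

print("marginals:", marginals(dist), "-> top-1:", by_marginals)
print("pick-one: ", pick_one_label(dist), "-> top-1:", by_pick_one)
print("precision@1 (marginals):", expected_precision_at_k(dist, by_marginals, 1))
print("precision@1 (pick-one): ", expected_precision_at_k(dist, by_pick_one, 1))
```

In this toy case the first label has the largest marginal (3/5), but because it often co-occurs with the second label, the pick-one-label reduction splits its probability mass and ends up ranking the third label first. Even a perfectly trained multi-class model would therefore incur nonzero precision@1 regret on this distribution.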

artificial intelligence, machine learning, proceedings

Country:
- Asia > Japan
  - Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
- Europe (1.00)
- North America
  - Canada (0.94)
  - United States > New York
    - New York County > New York City (0.14)
- Oceania > Australia
  - New South Wales > Sydney (0.14)

Genre:
- Research Report (0.68)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)