Exploring space efficiency in a tree-based linear model for extreme multi-label classification