On Learning Vector Representations in Hierarchical Label Spaces