Incrementally Learning the Hierarchical Softmax Function for Neural Language Models