A Bayesian LDA-based model for semi-supervised part-of-speech tagging