Joint POS Tagging and Text Normalization for Informal Text
Li, Chen (University of Texas at Dallas) | Liu, Yang (University of Texas at Dallas)
Text normalization and part-of-speech (POS) tagging for social media data have both been investigated recently, but prior work has treated them separately. In this paper, we propose a joint Viterbi decoding process that determines each token's POS tag and each non-standard token's correct form at the same time. To evaluate our approach, we create two new data sets annotated with POS tags and the correct forms of non-standard tokens. These are the first data sets with such annotation. The experimental results demonstrate the effect of non-standard words on POS tagging and show that our proposed methods outperform state-of-the-art systems in both POS tagging and normalization.
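To make the idea of joint decoding concrete, here is a minimal Python sketch. It is not the paper's model: it assumes a simple HMM-style score in which each state is a (candidate form, POS tag) pair and the score sums a normalization log-probability, an emission log-probability, and a tag transition log-probability. The function name `joint_viterbi`, the tag set, and every probability below are hypothetical toy values chosen only for illustration.

```python
import math

def joint_viterbi(tokens, candidates, tags, trans, emit, floor=1e-6):
    """Choose one (normalized form, POS tag) pair per token, maximizing the
    summed log score over the whole sequence (toy HMM-style scoring)."""
    if not tokens:
        return []

    def lp(p):
        # Floor unseen events so log() is always defined.
        return math.log(max(p, floor))

    # columns[i][(cand, tag)] = (best score ending in this state, previous state)
    columns = []
    for i, tok in enumerate(tokens):
        column = {}
        for cand, p_norm in candidates[tok].items():
            for tag in tags:
                local = lp(p_norm) + lp(emit.get((tag, cand), 0.0))
                if i == 0:
                    score = local + lp(trans.get(("<s>", tag), 0.0))
                    column[(cand, tag)] = (score, None)
                else:
                    # Best predecessor, scored by its own score plus the tag transition.
                    prev_state, prev_score = max(
                        ((s, sc + lp(trans.get((s[1], tag), 0.0)))
                         for s, (sc, _) in columns[i - 1].items()),
                        key=lambda item: item[1],
                    )
                    column[(cand, tag)] = (prev_score + local, prev_state)
        columns.append(column)

    # Backtrace from the best final state.
    state = max(columns[-1], key=lambda s: columns[-1][s][0])
    path = []
    for i in range(len(tokens) - 1, -1, -1):
        path.append(state)
        state = columns[i][state][1]
    path.reverse()
    return path

# Hypothetical toy model: every number here is invented for the example.
tags = ["V", "N", "PRP"]
candidates = {"luv": {"love": 0.9, "luv": 0.1}, "u": {"you": 0.8, "u": 0.2}}
trans = {("<s>", "V"): 0.3, ("<s>", "N"): 0.4, ("<s>", "PRP"): 0.3,
         ("V", "PRP"): 0.6, ("N", "PRP"): 0.1, ("V", "N"): 0.3, ("N", "N"): 0.3}
emit = {("V", "love"): 0.5, ("N", "love"): 0.3, ("PRP", "you"): 0.9,
        ("N", "luv"): 0.01, ("N", "u"): 0.01}

result = joint_viterbi(["luv", "u"], candidates, tags, trans, emit)
```

On these toy numbers, `result` pairs "luv" with the form "love" tagged as a verb and "u" with "you" tagged as a pronoun. Because both decisions are made in a single pass, the tag context can influence which candidate form is chosen and vice versa, which is the interaction that separate pipelines cannot capture.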
Jul-15-2015