Improving part-of-speech tagging via multi-task learning and character-level word representations

Daniil Anastasyev, Ilya Gusev, Eugene Indenbom

arXiv.org Machine Learning 

In this paper, we explore ways to improve POS-tagging using various types of auxiliary losses and different word representations. As a baseline, we utilized a BiLSTM tagger, which is able to achieve state-of-the-art results on sequence labelling tasks. We developed a new method for character-level word representation using a feedforward neural network. This representation gave us better results in terms of both the speed and the performance of the model. We also applied a novel technique of pretraining such word representations with existing word vectors. Finally, we designed a new variant of auxiliary loss for sequence labelling tasks: an additional prediction of the neighbour labels. This loss forces the model to learn the dependencies within a sequence of labels and accelerates training. We test these methods on English and Russian.
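The abstract's two central ideas can be illustrated with a short sketch: a character-level word encoder built from a feedforward network over character embeddings, and a BiLSTM tagger with auxiliary heads that predict the previous and next labels alongside the main tag. The sketch below is a minimal illustration in PyTorch under assumed module names, dimensions, and loss weighting; it is not the authors' implementation and omits the pretraining of character representations from existing word vectors.

```python
# Minimal sketch (assumed names and sizes) of a feedforward character-level
# word encoder and a BiLSTM tagger with an auxiliary neighbour-label loss.
import torch
import torch.nn as nn


class CharFFNWordEncoder(nn.Module):
    """Builds a word vector from its characters with a feedforward network."""

    def __init__(self, n_chars, char_dim=24, max_word_len=20, word_dim=100):
        super().__init__()
        self.char_emb = nn.Embedding(n_chars, char_dim, padding_idx=0)
        # Concatenate the padded/truncated character embeddings and project
        # them with a single feedforward layer.
        self.ffn = nn.Sequential(
            nn.Linear(max_word_len * char_dim, word_dim),
            nn.ReLU(),
        )

    def forward(self, char_ids):
        # char_ids: (batch, seq_len, max_word_len)
        b, t, l = char_ids.shape
        chars = self.char_emb(char_ids)          # (b, t, l, char_dim)
        return self.ffn(chars.view(b, t, -1))    # (b, t, word_dim)


class MultiTaskTagger(nn.Module):
    """BiLSTM tagger with auxiliary prediction of the neighbour labels."""

    def __init__(self, n_chars, n_tags, word_dim=100, hidden=128):
        super().__init__()
        self.encoder = CharFFNWordEncoder(n_chars, word_dim=word_dim)
        self.bilstm = nn.LSTM(word_dim, hidden, batch_first=True,
                              bidirectional=True)
        self.main_head = nn.Linear(2 * hidden, n_tags)   # tag of word i
        self.prev_head = nn.Linear(2 * hidden, n_tags)   # tag of word i-1
        self.next_head = nn.Linear(2 * hidden, n_tags)   # tag of word i+1

    def forward(self, char_ids):
        states, _ = self.bilstm(self.encoder(char_ids))
        return (self.main_head(states),
                self.prev_head(states),
                self.next_head(states))


def multitask_loss(model, char_ids, tags, pad_tag=-100, aux_weight=0.3):
    """Main tagging loss plus auxiliary losses on shifted (neighbour) labels."""
    ce = nn.CrossEntropyLoss(ignore_index=pad_tag)
    main, prev, nxt = model(char_ids)
    # Shift the gold tags to obtain neighbour targets; positions without a
    # neighbour are masked with pad_tag.
    prev_tags = torch.full_like(tags, pad_tag)
    prev_tags[:, 1:] = tags[:, :-1]
    next_tags = torch.full_like(tags, pad_tag)
    next_tags[:, :-1] = tags[:, 1:]
    loss = ce(main.flatten(0, 1), tags.flatten())
    loss = loss + aux_weight * ce(prev.flatten(0, 1), prev_tags.flatten())
    loss = loss + aux_weight * ce(nxt.flatten(0, 1), next_tags.flatten())
    return loss
```

The auxiliary weight and the choice to predict both neighbours are assumptions for illustration; the point is that the shared BiLSTM states must encode information about adjacent labels, which is the dependency-learning effect the abstract describes.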
