How Translation Alters Sentiment
Mohammad, Saif M., Salameh, Mohammad, Kiritchenko, Svetlana
Journal of Artificial Intelligence Research
Sentiment analysis research has focused predominantly on English texts. Consequently, many sentiment resources exist for English, but far fewer for other languages. Approaches to improving sentiment analysis in a resource-poor focus language include: (a) translating the focus-language text into a resource-rich language such as English and applying a powerful English sentiment analysis system to the translation, and (b) translating resources such as sentiment-labeled corpora and sentiment lexicons from English into the focus language and using them as additional resources in the focus-language sentiment analysis system. In this paper we systematically examine both options. We use Arabic social media posts as a stand-in for the focus-language text. We show that sentiment analysis of English translations of Arabic texts produces results competitive with Arabic sentiment analysis. We further show that Arabic sentiment analysis systems benefit from the use of automatically translated English sentiment lexicons. We also conduct manual annotation studies to examine why the sentiment of a translation differs from the sentiment of the source word or text. This is especially relevant for building better automatic translation systems. In the process, we create a state-of-the-art Arabic sentiment analysis system, a new dialectal Arabic sentiment lexicon, and the first Arabic-English parallel corpus that is independently annotated for sentiment by Arabic and English speakers.
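The two options, (a) and (b), can be contrasted in a minimal sketch. Everything here is an illustrative assumption rather than the authors' system: the tiny English lexicon, the placeholder focus-language tokens (`w_good`, `w_bad`, `w_film`), the word-level dictionary standing in for a machine translation system, and the function names are all hypothetical.

```python
# Toy illustration only: the lexicon, the placeholder focus-language tokens,
# and the word-level "translator" are hypothetical, not resources from the paper.

# Hypothetical English sentiment lexicon (word -> polarity score).
ENGLISH_LEXICON = {"good": 1.0, "happy": 1.0, "bad": -1.0, "sad": -1.0}

# Hypothetical focus-language -> English dictionary standing in for an MT system.
FOCUS_TO_ENGLISH = {"w_good": "good", "w_bad": "bad", "w_film": "film"}


def lexicon_score(tokens, lexicon):
    """Sum the polarity of each token; words missing from the lexicon count as 0."""
    return sum(lexicon.get(token, 0.0) for token in tokens)


def label(score):
    """Map a numeric score to a coarse sentiment label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def approach_a(focus_tokens):
    """(a) Translate the focus-language text into English, then score it in English."""
    english_tokens = [FOCUS_TO_ENGLISH.get(token, token) for token in focus_tokens]
    return label(lexicon_score(english_tokens, ENGLISH_LEXICON))


def translate_lexicon(english_lexicon, focus_to_english):
    """(b) Project the English lexicon into the focus language via the dictionary."""
    return {
        focus_word: english_lexicon[english_word]
        for focus_word, english_word in focus_to_english.items()
        if english_word in english_lexicon
    }


def approach_b(focus_tokens):
    """(b) Score the focus-language text directly with the translated lexicon."""
    focus_lexicon = translate_lexicon(ENGLISH_LEXICON, FOCUS_TO_ENGLISH)
    return label(lexicon_score(focus_tokens, focus_lexicon))


if __name__ == "__main__":
    post = ["w_film", "w_good"]
    print(approach_a(post), approach_b(post))
```

In this toy setting the two routes agree by construction, because the same dictionary drives both. With real translation systems and real lexicons they can diverge, which is the kind of difference the manual annotation studies described above are meant to examine.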
Jan-20-2016
- Country:
- North America
- United States
- Oregon > Multnomah County
- Portland (0.14)
- New York > New York County
- New York City (0.04)
- Georgia > Fulton County
- Atlanta (0.04)
- Colorado > Denver County
- Denver (0.04)
- California
- Los Angeles County > Los Angeles (0.14)
- Santa Clara County > Stanford (0.04)
- Orange County > Anaheim (0.04)
- Canada
- Europe
- Sweden > Vaestra Goetaland
- Gothenburg (0.04)
- Middle East > Malta
- Port Region > Southern Harbour District > Valletta (0.04)
- Ireland > Leinster
- County Dublin > Dublin (0.04)
- Iceland > Capital Region
- Reykjavik (0.04)
- Asia > Middle East
- Africa > Middle East
- Egypt > Cairo Governorate > Cairo (0.04)
- Genre:
- Research Report > New Finding (0.93)
- Overview (0.93)
- Industry:
- Information Technology (0.46)
- Technology: