You Shall Know a Tool by the Traces it Leaves: The Predictability of Sentiment Analysis Tools
Baumartz, Daniel; Bagci, Mevlüt; Henlein, Alexander; Konca, Maxim; Lücking, Andy; Mehler, Alexander
arXiv.org Artificial Intelligence
If sentiment analysis tools were valid classifiers, one would expect them to provide comparable results for sentiment classification on different kinds of corpora and for different languages. In line with the results of previous studies, we show that sentiment analysis tools disagree on the same dataset. Going beyond previous studies, we show that the tool used for sentiment annotation can even be predicted from its output, revealing an algorithmic bias of sentiment analysis. Using Twitter, Wikipedia, and various news corpora in English, German, and French, our classifiers separate sentiment tools with an averaged F1-score of 0.89 (for the English corpora). We therefore warn against taking sentiment annotations at face value and argue for the need for more, and more systematic, NLP evaluation studies.
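To make the reported metric concrete, here is a minimal sketch of how an averaged F1-score for tool prediction could be computed. It assumes macro-averaging (an unweighted mean of per-tool F1-scores); the abstract does not state which averaging scheme the authors used. The function name `macro_f1`, the tool labels, and the toy data are hypothetical and are not taken from the paper.

```python
from collections import Counter


def macro_f1(true_tools, predicted_tools):
    """Macro-averaged F1 over all tool labels seen in either list.

    Assumption: macro-averaging; the paper's abstract only says "averaged".
    """
    labels = sorted(set(true_tools) | set(predicted_tools))
    if not labels:
        raise ValueError("need at least one labelled example")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, guess in zip(true_tools, predicted_tools):
        if truth == guess:
            tp[truth] += 1   # tool correctly identified
        else:
            fp[guess] += 1   # guessed tool was not the real one
            fn[truth] += 1   # real tool was missed

    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy data: which tool actually produced each annotation,
# and which tool a classifier guessed from the annotation alone.
true_tools = ["tool_a", "tool_a", "tool_b", "tool_b", "tool_c", "tool_c"]
predicted_tools = ["tool_a", "tool_b", "tool_b", "tool_b", "tool_c", "tool_a"]
score = macro_f1(true_tools, predicted_tools)
```

A score near 1.0 would mean the annotations carry a strong tool-specific signature; a score near chance level would mean the tools are indistinguishable from their output.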
Oct-18-2024