Learning about Spanish dialects through Twitter
Bruno Gonçalves, David Sánchez
This paper maps large-scale variation in the Spanish language using a corpus of geographically tagged Twitter messages. Lexical dialects are extracted by analyzing the variants of tens of concepts. The resulting maps show linguistic variation on an unprecedented scale across the globe. We discuss the properties of the main dialects using a machine learning approach and find that varieties spoken in urban areas have an international character, in contrast to rural areas, where dialects show a more regional uniformity.
Feb-5-2017
- Country:
- Asia > Japan
- Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.05)
- Atlantic Ocean > South Atlantic Ocean
- Río De La Plata (0.05)
- Europe
- North Macedonia (0.04)
- Spain (0.05)
- United Kingdom > England
- Cambridgeshire > Cambridge (0.05)
- North America
- Central America (0.05)
- Cuba (0.04)
- Mexico (0.14)
- United States
- California > Santa Clara County
- Palo Alto (0.04)
- Massachusetts > Middlesex County
- Cambridge (0.04)
- New York (0.04)
- South America
- Argentina > Pampas
- Buenos Aires F.D. > Buenos Aires (0.04)
- Chile (0.04)
- Colombia (0.05)
- Paraguay (0.04)
- Uruguay (0.04)
- Genre:
- Research Report > New Finding (0.47)
- Industry:
- Technology: