MasakhaNER: Named Entity Recognition for African Languages
Adelani, David Ifeoluwa, Abbott, Jade, Neubig, Graham, D'souza, Daniel, Kreutzer, Julia, Lignos, Constantine, Palen-Michel, Chester, Buzaaba, Happy, Rijhwani, Shruti, Ruder, Sebastian, Mayhew, Stephen, Azime, Israel Abebe, Muhammad, Shamsuddeen, Emezue, Chris Chinenye, Nakatumba-Nabende, Joyce, Ogayo, Perez, Aremu, Anuoluwapo, Gitau, Catherine, Mbaye, Derguene, Alabi, Jesujoba, Yimam, Seid Muhie, Gwadabe, Tajuddeen, Ezeani, Ignatius, Niyongabo, Rubungo Andre, Mukiibi, Jonathan, Otiende, Verrah, Orife, Iroro, David, Davis, Ngom, Samba, Adewumi, Tosin, Rayson, Paul, Adeyemi, Mofetoluwa, Muriuki, Gerald, Anebi, Emmanuel, Chukwuneke, Chiamaka, Odu, Nkiruka, Wairagala, Eric Peter, Oyerinde, Samuel, Siro, Clemencia, Bateesa, Tobius Saul, Oloyede, Temilola, Wambui, Yvonne, Akinode, Victor, Nabagereka, Deborah, Katusiime, Maurice, Awokoya, Ayodele, MBOUP, Mouhamadane, Gebreyohannes, Dibora, Tilaye, Henok, Nwaike, Kelechi, Wolde, Degaga, Faye, Abdoulaye, Sibanda, Blessing, Ahia, Orevaoghene, Dossou, Bonaventure F. P., Ogueji, Kelechi, DIOP, Thierno Ibrahima, Diallo, Abdoulaye, Akinfaderin, Adewale, Marengereke, Tendai, Osei, Salomey
arXiv.org Artificial Intelligence
We take a step towards addressing the under-representation of the African continent in NLP research by creating the first large publicly available high-quality dataset for named entity recognition (NER) in ten African languages, bringing together a variety of stakeholders. We detail characteristics of the languages to help researchers understand the challenges that these languages pose for NER. We analyze our datasets and conduct an extensive empirical evaluation of state-of-the-art methods across both supervised and transfer learning settings. We release the data, code, and models in order to inspire future research on African NLP.
Mar-22-2021