Interview with Henok Biadglign Ademtew: Creating an Amharic, Ge'ez and English parallel dataset

Jun-4-2024, 07:46:43 GMT - AIHub

African languages are not well represented in natural language processing (NLP), in large part because of a lack of resources for training models: most of the languages in Africa are very low-resourced, and not much text data is available. Henok Biadglign Ademtew and Mikiyas Girma Birbo have created an Amharic, Ge'ez, and English parallel dataset to help advance research into low-resource languages. We spoke to Henok about this project, the creation of the dataset, and some of the challenges faced.

artificial intelligence, dataset, natural language

Country:
- Africa > Ethiopia (0.06)
- Europe (0.05)

Technology:
- Information Technology > Artificial Intelligence > Natural Language (0.90)