Interview with Henok Biadglign Ademtew: Creating an Amharic, Ge'ez and English parallel dataset
African languages are not well represented in natural language processing (NLP), in large part because of a lack of resources for training models. Henok Biadglign Ademtew and Mikiyas Girma Birbo have created an Amharic, Ge'ez, and English parallel dataset to help advance research into low-resource languages. We spoke to Henok about this project, the creation of the dataset, and some of the challenges faced. Most African languages are very low-resourced, and little text data is available for them.
Jun-4-2024, 07:46:43 GMT