Collaborating Authors

 Mohamed, Shafie Abdi


AfriMTE and AfriCOMET: Empowering COMET to Embrace Under-resourced African Languages

arXiv.org Artificial Intelligence

Despite the progress recorded in scaling multilingual machine translation (MT) models and evaluation data to several under-resourced African languages, it is difficult to measure accurately the progress made on these languages because evaluation is often performed with n-gram matching metrics such as BLEU, which tend to correlate poorly with human judgments. Embedding-based metrics such as COMET correlate better; however, the lack of evaluation data with human ratings for under-resourced languages, the complexity of annotation guidelines like Multidimensional Quality Metrics (MQM), and the limited language coverage of multilingual encoders have hampered their applicability to African languages. In this paper, we address these challenges by creating high-quality human evaluation data with a simplified MQM guideline for error-span annotation and direct assessment (DA) scoring for 13 typologically diverse African languages. Furthermore, we develop AfriCOMET, a COMET evaluation metric for African languages, by leveraging DA training data from high-resource languages and an African-centric multilingual encoder (AfroXLM-Roberta), creating the state-of-the-art evaluation metric for African-language MT with respect to Spearman-rank correlation with human judgments (+0.406).
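As a concrete illustration of the reported evaluation criterion, the following minimal Python sketch computes the Spearman-rank correlation between automatic metric scores and human direct assessment (DA) ratings for the same set of MT outputs. The score lists are hypothetical placeholders, not data from the paper.

    # Minimal sketch: metric quality measured as Spearman-rank correlation
    # with human judgments (the criterion behind the +0.406 figure above).
    from scipy.stats import spearmanr

    # Hypothetical segment-level scores for the same MT outputs.
    human_da_scores = [78.0, 42.5, 90.0, 61.0, 55.5]   # human DA ratings
    metric_scores   = [0.81, 0.47, 0.93, 0.58, 0.60]   # automatic metric scores

    rho, p_value = spearmanr(human_da_scores, metric_scores)
    print(f"Spearman rho = {rho:.3f} (p = {p_value:.3f})")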


Lexicon and Rule-based Word Lemmatization Approach for the Somali Language

arXiv.org Artificial Intelligence

The lemmatization summary statistics for the Example 3 sentence are also provided in Table 1. In this case, the percentage of words that were normalized reached 100%, meaning that all content words (excluding stop words and special characters) were lemmatized. This may be because the document is short: a single sentence of 8 words. Unlike the statistics for this example, a proportion of words in any typical text document (i.e., longer than a sentence) will normally remain unresolved, that is, words that the algorithm fails to lemmatize in both stages. Overall, and as part of evaluating the proposed method, we have tested the algorithm on 120 documents of various lengths, including general news articles and social media posts. For the news articles, we used extracts (i.e., title and first 1-2 paragraphs) as well as the full articles to assess the effect of document length. The results for these different document categories are summarized in Table 2. The notations #Docs, Avg Doc Len, and Avg Acc. in the table respectively denote the number of documents, the average document length in words, and the average lemmatization accuracy. As shown, the results demonstrate that the algorithm achieves a relatively good accuracy of 57% for moderately long documents (e.g.
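To make the two-stage design concrete, the following minimal Python sketch shows a lexicon lookup (stage 1) with a fall-back to simple suffix-stripping rules (stage 2), and an accuracy measure taken as the share of words the lemmatizer resolves, in the spirit of the "percentage of words normalized" described above. The lexicon entries and suffix rules are hypothetical illustrations, not the paper's actual Somali resources.

    # Hypothetical word -> lemma entries (stage 1 resource).
    LEXICON = {
        "buugaagta": "buug",
        "ardayda": "arday",
    }

    # Hypothetical (suffix, replacement) rules, longest first (stage 2 resource).
    SUFFIX_RULES = [
        ("aagta", ""),
        ("yada", ""),
        ("ta", ""),
    ]

    def lemmatize(word: str) -> tuple[str, bool]:
        """Return (lemma, resolved); resolved is False when both stages fail."""
        # Stage 1: direct lexicon lookup.
        if word in LEXICON:
            return LEXICON[word], True
        # Stage 2: rule-based suffix stripping.
        for suffix, repl in SUFFIX_RULES:
            if word.endswith(suffix) and len(word) > len(suffix) + 1:
                return word[: -len(suffix)] + repl, True
        return word, False  # unresolved: word is left unchanged

    def accuracy(words: list[str]) -> float:
        """Share of words resolved by either stage (cf. Avg Acc. in Table 2)."""
        resolved = sum(1 for w in words if lemmatize(w)[1])
        return resolved / len(words) if words else 0.0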