AITopics | tunisian dialect

How Well Do LLMs Understand Tunisian Arabic?

Mahdi, Mohamed

arXiv.org Artificial Intelligence, Nov-24-2025

Large Language Models (LLMs) are the engines driving today's AI agents. The better these models understand human languages, the more natural and user-friendly the interaction with AI becomes, from everyday devices like computers and smartwatches to any tool that can act intelligently. Yet, the ability of industrial-scale LLMs to comprehend low-resource languages, such as Tunisian Arabic (Tunizi), is often overlooked. This neglect risks excluding millions of Tunisians from fully interacting with AI in their own language, pushing them toward French or English. Such a shift not only threatens the preservation of the Tunisian dialect but may also create challenges for literacy and influence younger generations to favor foreign languages. In this study, we introduce a novel dataset containing parallel Tunizi, standard Tunisian Arabic, and English translations, along with sentiment labels. We benchmark several popular LLMs on three tasks: transliteration, translation, and sentiment analysis. Our results reveal significant differences between models, highlighting both their strengths and limitations in understanding and processing Tunisian dialects. By quantifying these gaps, this work underscores the importance of including low-resource languages in the next generation of AI systems, ensuring technology remains accessible, inclusive, and culturally grounded.

large language model, machine learning, natural language

2511.16683

Country: Africa > Middle East > Tunisia (0.15)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.51)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

TEDxTN: A Three-way Speech Translation Corpus for Code-Switched Tunisian Arabic - English

Bougares, Fethi, Mdhaffar, Salima, Elleuch, Haroun, Estève, Yannick

arXiv.org Artificial Intelligence, Nov-17-2025

In this paper, we introduce TEDxTN, the first publicly available Tunisian Arabic to English speech translation dataset. This work is in line with the ongoing effort to mitigate the data scarcity obstacle for a number of Arabic dialects. We collected, segmented, transcribed, and translated 108 TEDx talks following our internally developed annotation guidelines. The collected talks represent 25 hours of speech with code-switching and cover speakers with various accents from over 11 different regions of Tunisia. We make the annotation guidelines and corpus publicly available. This will enable the extension of TEDxTN to new talks as they become available. We also report results for strong baseline systems of Speech Recognition and Speech Translation using multiple pre-trained and fine-tuned end-to-end models. This corpus is the first open-source and publicly available speech translation corpus of code-switched Tunisian dialect. We believe that this is a valuable resource that can motivate and facilitate further research on the natural language processing of the Tunisian dialect.

artificial intelligence, corpus, natural language

2511.1078

Country:

Europe (1.00)
Africa > Middle East > Tunisia (0.25)

Genre: Research Report (0.64)

Industry: Education > Educational Setting (0.94)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Machine Translation: Tunisian Dialect -- English is it possible?

#artificialintelligence, Sep-19-2021, 22:30:29 GMT

In the context of a final Specialty project, we were given a month to develop an idea that demonstrates our newly acquired knowledge and techniques. Choosing Machine Learning among all the available specialties was a risky step that we embarked on and, mostly, enjoyed. Having covered its mathematical and statistical concepts, supervised learning, unsupervised learning, and reinforcement learning, we had a wide range of sub-fields to work on, but we considered two factors: what time allowed us to do and what area we felt most able to work on. That is how our choice was directed primarily to Natural Language Processing; then we had to decide what exactly to do. Last year we worked on a bedtime stories application that collected Tunisian folkloric stories, and one member of the presentation jury asked us whether it would be possible to implement a text-to-speech feature.

application, machine translation, tunisian dialect

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.52)
