proverb
Beyond Understanding: Evaluating the Pragmatic Gap in LLMs' Cultural Processing of Figurative Language
Attia, Mena, Muhamed, Aashiq, Alkhamissi, Mai, Solorio, Thamar, Diab, Mona
We present a comprehensive evaluation of the ability of large language models (LLMs) to process culturally grounded language, specifically to understand and pragmatically use figurative expressions that encode local knowledge and cultural nuance. Using figurative language as a proxy for cultural nuance and local knowledge, we design evaluation tasks for contextual understanding, pragmatic use, and connotation interpretation in Arabic and English. We evaluate 22 open- and closed-source LLMs on Egyptian Arabic idioms, multidialectal Arabic proverbs, and English proverbs. Our results show a consistent hierarchy: the average accuracy for Arabic proverbs is 4.29% lower than for English proverbs, and performance for Egyptian idioms is 10.28% lower than for Arabic proverbs. For the pragmatic use task, accuracy drops by 14.07% relative to understanding, though providing contextual idiomatic sentences improves accuracy by 10.66%. Models also struggle with connotative meaning, reaching at most 85.58% agreement with human annotators on idioms with 100% inter-annotator agreement. These findings demonstrate that figurative language serves as an effective diagnostic for cultural reasoning: while LLMs can often interpret figurative meaning, they face challenges in using it appropriately. To support future research, we release Kinayat, the first dataset of Egyptian Arabic idioms designed for both figurative understanding and pragmatic use evaluation.
- Europe > Austria > Vienna (0.14)
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
- North America > Canada > Ontario > Toronto (0.04)
- (16 more...)
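A minimal sketch of how the understanding-versus-use comparison described in the Kinayat abstract above could be scored. The `query_model` callable, prompt wording, and field names (`idiom`, `context_sentence`, `options`, `gold_label`) are illustrative assumptions, not the released dataset schema.

```python
# Hypothetical scoring of a pragmatic-use item with and without a contextual
# idiomatic sentence, in the spirit of the Kinayat evaluation described above.
from typing import Callable, Dict, List

def build_prompt(item: Dict, with_context: bool) -> str:
    """Ask the model to pick the situation in which the idiom is used appropriately."""
    lines = [f"Idiom: {item['idiom']}"]
    if with_context and item.get("context_sentence"):
        lines.append(f"Example usage: {item['context_sentence']}")
    lines.append("Which situation fits this idiom best?")
    for label, option in zip("ABCD", item["options"]):
        lines.append(f"{label}. {option}")
    lines.append("Answer with a single letter.")
    return "\n".join(lines)

def accuracy(items: List[Dict],
             query_model: Callable[[str], str],
             with_context: bool) -> float:
    correct = 0
    for item in items:
        answer = query_model(build_prompt(item, with_context)).strip().upper()[:1]
        correct += int(answer == item["gold_label"])
    return correct / len(items)

# Usage idea: the difference between the two conditions gives the kind of
# with-context vs. without-context gap the abstract reports.
# gap = accuracy(items, query_model, True) - accuracy(items, query_model, False)
```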
BRoverbs -- Measuring how much LLMs understand Portuguese proverbs
Almeida, Thales Sales, Bonás, Giovana Kerche, Santos, João Guilherme Alves
Large Language Models (LLMs) exhibit significant performance variations depending on the linguistic and cultural context in which they are applied. This disparity signals the necessity of mature evaluation frameworks that can assess their capabilities in specific regional settings. In the case of Portuguese, existing evaluations remain limited, often relying on translated datasets that may not fully capture linguistic nuances or cultural references. Meanwhile, native Portuguese-language datasets predominantly focus on structured national exams or sentiment analysis of social media interactions, leaving gaps in evaluating broader linguistic understanding. To address this limitation, we introduce BRoverbs, a dataset specifically designed to assess LLM performance through Brazilian proverbs. Proverbs serve as a rich linguistic resource, encapsulating cultural wisdom, figurative expressions, and complex syntactic structures that challenge models' comprehension of regional expressions. BRoverbs aims to provide a new evaluation tool for Portuguese-language LLMs and to advance regionally informed benchmarking. The benchmark is available at https://huggingface.co/datasets/Tropic-AI/BRoverbs.
- South America > Brazil > São Paulo > Campinas (0.04)
- South America > Colombia > Meta Department > Villavicencio (0.04)
- North America > Central America (0.04)
- (4 more...)
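Since the BRoverbs benchmark is published on the Hugging Face Hub at the URL given above, a minimal sketch of loading it with the `datasets` library is shown below; the split and column layout are not reproduced here, so the code only inspects what the Hub returns.

```python
# Minimal sketch: loading BRoverbs from the Hugging Face Hub.
# The dataset ID comes from the URL in the abstract; split and column names
# are not assumed, so inspect the loaded object before writing evaluation code.
from datasets import load_dataset

broverbs = load_dataset("Tropic-AI/BRoverbs")
print(broverbs)                       # shows available splits and columns
first_split = next(iter(broverbs.values()))
print(first_split[0])                 # look at one example record
```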
Dialect Normalization using Large Language Models and Morphological Rules
Dimakis, Antonios, Pavlopoulos, John, Anastasopoulos, Antonios
Natural language understanding systems struggle with low-resource languages, including many dialects of high-resource ones. Dialect-to-standard normalization attempts to tackle this issue by transforming dialectal text so that it can be used by standard-language tools downstream. In this study, we address this task by introducing a new normalization method that combines rule-based, linguistically informed transformations with large language models (LLMs) guided by targeted few-shot prompting, without requiring any parallel data. We implement our method for Greek dialects and apply it on a dataset of regional proverbs, evaluating the outputs using human annotators. We then use this dataset to conduct downstream experiments, finding that previous results regarding these proverbs relied solely on superficial linguistic information, including orthographic artifacts, while new observations can still be made through the remaining semantics.
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
- Europe > Middle East > Cyprus (0.05)
- Europe > North Macedonia (0.05)
- (13 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.69)
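A rough sketch of the rule-plus-LLM normalization idea described in the entry above: apply simple, linguistically informed string rewrites first, then let an LLM finish the normalization via a few-shot prompt. The rewrite rules, few-shot examples, and `query_model` callable below are placeholders for illustration, not the paper's actual Greek morphological rules.

```python
# Illustrative dialect-to-standard normalization pipeline: deterministic
# rewrites followed by an LLM few-shot prompt, with no parallel data required.
import re
from typing import Callable, List, Tuple

RULES: List[Tuple[str, str]] = [
    # (dialectal pattern, standard replacement) -- purely illustrative examples
    (r"\bτση\b", "της"),
    (r"\bείντα\b", "τι"),
]

def apply_rules(text: str) -> str:
    for pattern, replacement in RULES:
        text = re.sub(pattern, replacement, text)
    return text

FEW_SHOT = [
    ("dialectal sentence 1", "standard sentence 1"),
    ("dialectal sentence 2", "standard sentence 2"),
]

def build_normalization_prompt(text: str) -> str:
    shots = "\n".join(f"Dialect: {d}\nStandard: {s}" for d, s in FEW_SHOT)
    return (
        "Rewrite the dialectal Greek sentence in Standard Modern Greek, "
        "changing only what is necessary.\n"
        f"{shots}\nDialect: {text}\nStandard:"
    )

def normalize(text: str, query_model: Callable[[str], str]) -> str:
    return query_model(build_normalization_prompt(apply_rules(text))).strip()
```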
Culturally-Grounded Chain-of-Thought (CG-CoT): Enhancing LLM Performance on Culturally-Specific Tasks in Low-Resource Languages
Addressing this gap is crucial for equitable AI deployment. We introduce Culturally-Grounded Chain-of-Thought (CG-CoT), a novel prompting strategy that combines dense vector retrieval of cultural context with explicit reasoning sequences. Our extensive experiments on Yoruba proverb interpretation demonstrate that CG-CoT provides significantly higher culturally-aligned accuracy and depth than traditional prompting methods, validated through both automated metrics and LLM-based evaluations. Notably, we uncover stark disparities between token-level translation metrics like BLEU and human-judged cultural relevance, suggesting a rethinking of evaluation approaches for low-resource NLP.
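A rough sketch of the retrieve-then-reason idea behind CG-CoT as described above: embed short cultural-context notes, retrieve the closest ones for a proverb, and prepend them to a step-by-step interpretation prompt. The embedding model, note texts, and prompt wording are stand-ins, not the authors' configuration.

```python
# Illustrative CG-CoT-style prompt construction: dense retrieval of cultural
# context followed by an explicit reasoning request.
import numpy as np
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")  # placeholder model

cultural_notes = [
    "Note on Yoruba greetings and respect for elders.",
    "Note on proverbs used in conflict mediation.",
    "Note on farming imagery in Yoruba oral tradition.",
]
note_vectors = encoder.encode(cultural_notes, normalize_embeddings=True)

def retrieve(proverb: str, k: int = 2):
    query = encoder.encode([proverb], normalize_embeddings=True)[0]
    scores = note_vectors @ query           # cosine similarity; vectors are normalized
    top = np.argsort(-scores)[:k]
    return [cultural_notes[i] for i in top]

def cg_cot_prompt(proverb: str) -> str:
    context = "\n".join(f"- {note}" for note in retrieve(proverb))
    return (
        f"Cultural context:\n{context}\n\n"
        f"Proverb: {proverb}\n"
        "Explain, step by step, what the proverb means and when it would be used."
    )
```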
Jawaher: A Multidialectal Dataset of Arabic Proverbs for LLM Benchmarking
Magdy, Samar M., Kwon, Sang Yun, Alwajih, Fakhraddin, Abdelfadil, Safaa, Shehata, Shady, Abdul-Mageed, Muhammad
Recent advancements in instruction fine-tuning, alignment methods such as reinforcement learning from human feedback (RLHF), and optimization techniques like direct preference optimization (DPO) have significantly enhanced the adaptability of large language models (LLMs) to user preferences. However, despite these innovations, many LLMs continue to exhibit biases toward Western, Anglo-centric, or American cultures, with performance on English data consistently surpassing that of other languages. This reveals a persistent cultural gap in LLMs, which complicates their ability to accurately process culturally rich and diverse figurative language such as proverbs. To address this, we introduce Jawaher, a benchmark designed to assess LLMs' capacity to comprehend and interpret Arabic proverbs. Jawaher includes proverbs from various Arabic dialects, along with idiomatic translations and explanations. Through extensive evaluations of both open- and closed-source models, we find that while LLMs can generate idiomatically accurate translations, they struggle with producing culturally nuanced and contextually relevant explanations. These findings highlight the need for ongoing model refinement and dataset expansion to bridge the cultural gap in figurative language processing.
- Africa > Sudan (0.14)
- Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
- Africa > Middle East > Libya (0.04)
- (21 more...)
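Based only on the abstract's description of Jawaher (dialectal proverbs with idiomatic translations and explanations), a hypothetical record layout and an explanation-generation prompt are sketched below; the field names are illustrative, not the released schema.

```python
# Hypothetical Jawaher-style entry and the explanation task the abstract says
# models struggle with (culturally nuanced explanation, not a literal gloss).
from dataclasses import dataclass

@dataclass
class ProverbEntry:
    proverb: str                 # proverb in its original Arabic dialect
    dialect: str                 # e.g. "Egyptian", "Levantine"
    idiomatic_translation: str   # idiomatic English rendering
    explanation: str             # culturally grounded explanation of meaning/usage

def explanation_prompt(entry: ProverbEntry) -> str:
    return (
        f"Proverb ({entry.dialect} Arabic): {entry.proverb}\n"
        "Explain the proverb's meaning and a typical situation in which it is said."
    )
```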
Proverbs Run in Pairs: Evaluating Proverb Translation Capability of Large Language Model
Wang, Minghan, Pham, Viet-Thanh, Moghimifar, Farhad, Vu, Thuy-Trang
Despite the remarkable overall performance of modern systems, machine translation (MT) research remains underexplored when it comes to translating cultural elements of language such as idioms, proverbs, and colloquial expressions. This paper investigates the capability of state-of-the-art neural machine translation (NMT) models and large language models (LLMs) to translate proverbs, which are deeply rooted in cultural contexts. We construct a translation dataset of standalone proverbs and proverbs in conversation for four language pairs. Our experiments show that the studied models can achieve good translation quality between languages with similar cultural backgrounds, and that LLMs generally outperform NMT models in proverb translation. Furthermore, we find that current automatic evaluation metrics such as BLEU, CHRF++ and COMET are inadequate for reliably assessing the quality of proverb translation, highlighting the need for more culturally aware evaluation metrics.
- North America > Canada > Ontario > Toronto (0.04)
- Europe > Western Europe (0.04)
- Europe > Denmark > Capital Region > Copenhagen (0.04)
- (15 more...)
- Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.50)
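The entry above argues that BLEU, chrF++ and COMET under-measure proverb translation quality. A minimal sketch of the two surface metrics with `sacrebleu` is below; the example sentences are invented and COMET is omitted for brevity.

```python
# Surface-metric scoring of a single proverb translation with sacrebleu.
import sacrebleu

hypotheses = ["A bird in the hand is worth two in the bush."]
references = [["Better a sparrow in the hand than a pigeon on the roof."]]

bleu = sacrebleu.corpus_bleu(hypotheses, references)
chrf_pp = sacrebleu.corpus_chrf(hypotheses, references, word_order=2)  # word_order=2 -> chrF++
print(f"BLEU: {bleu.score:.1f}  chrF++: {chrf_pp.score:.1f}")
# A fluent, culture-preserving rendering can still score low on these metrics,
# which is the kind of mismatch with human judgement the paper highlights.
```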
ProverbEval: Exploring LLM Evaluation Challenges for Low-resource Language Understanding
Azime, Israel Abebe, Tonja, Atnafu Lambebo, Belay, Tadesse Destaw, Chanie, Yonas, Balcha, Bontu Fufa, Abadi, Negasi Haile, Ademtew, Henok Biadglign, Nerea, Mulubrhan Abebe, Yadeta, Debela Desalegn, Geremew, Derartu Dagne, Tesfau, Assefa Atsbiha, Slusallek, Philipp, Solorio, Thamar, Klakow, Dietrich
With the rapid development of evaluation datasets to assess LLMs' understanding across a wide range of subjects and domains, identifying a suitable language understanding benchmark has become increasingly challenging. In this work, we explore LLM evaluation challenges for low-resource language understanding and introduce ProverbEval, an LLM evaluation benchmark for low-resource languages that uses proverbs to probe language understanding in culture-specific scenarios. We benchmark various LLMs and explore factors that create variability in the benchmarking process. We observed performance variations of up to 50%, depending on the order in which answer choices were presented in multiple-choice tasks. Native-language proverb descriptions significantly improve performance on tasks such as proverb generation. Additionally, monolingual evaluations consistently outperformed their cross-lingual counterparts. We argue that special attention must be given to the order of choices, choice of prompt language, task variability, and generation tasks when creating LLM evaluation benchmarks.
- Asia > Thailand > Bangkok > Bangkok (0.04)
- North America > United States > Texas > Dallas County > Dallas (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- (8 more...)
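The ProverbEval abstract above reports swings of up to 50% from answer-choice order alone. A hedged sketch of such an order-sensitivity check follows: score the same multiple-choice items under several orderings of the options and compare accuracies. The prompt wording, four-option assumption, and `query_model` callable are illustrative.

```python
# Illustrative answer-order sensitivity check for a multiple-choice benchmark.
from itertools import permutations
from typing import Callable, Dict, List

def score_ordering(items: List[Dict], order: tuple,
                   query_model: Callable[[str], str]) -> float:
    correct = 0
    for item in items:
        options = [item["options"][i] for i in order]
        gold_letter = "ABCD"[options.index(item["options"][item["gold_index"]])]
        prompt = (
            f"{item['question']}\n"
            + "\n".join(f"{letter}. {opt}" for letter, opt in zip("ABCD", options))
            + "\nAnswer with a single letter."
        )
        correct += int(query_model(prompt).strip().upper()[:1] == gold_letter)
    return correct / len(items)

def order_sensitivity(items: List[Dict],
                      query_model: Callable[[str], str],
                      n_orders: int = 4) -> float:
    scores = [score_ordering(items, order, query_model)
              for order in list(permutations(range(4)))[:n_orders]]
    return max(scores) - min(scores)   # spread in accuracy across orderings
```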
A Statistical Analysis of LLMs' Self-Evaluation Using Proverbs
Sonoda, Ryosuke, Srinivasan, Ramya
Large language models (LLMs) such as ChatGPT, GPT-4, Claude-3, and Llama are being integrated across a variety of industries. Despite this rapid proliferation, experts are calling for caution in the interpretation and adoption of LLMs, owing to numerous associated ethical concerns. Research has also uncovered shortcomings in LLMs' reasoning and logical abilities, raising questions about the potential of LLMs as evaluation tools. In this paper, we investigate LLMs' self-evaluation capabilities on a novel proverb reasoning task. We introduce a proverb database consisting of 300 proverb pairs that are similar in intent but different in wording, across topics spanning gender, wisdom, and society. We propose tests to evaluate textual as well as numerical consistency across similar proverbs, and demonstrate the effectiveness of our method and dataset in identifying failures in LLMs' self-evaluation, which in turn can highlight issues related to gender stereotypes and lack of cultural understanding in LLMs.
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
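A toy version of the paired-consistency idea from the entry above: two proverbs with the same intent but different wording should receive near-identical self-evaluation scores, so a large gap flags an inconsistency. The rating prompt, 1-5 scale, tolerance, and `query_model` callable are assumptions made for illustration.

```python
# Illustrative numerical-consistency check over a proverb pair.
from typing import Callable, Tuple

def rate(proverb: str, query_model: Callable[[str], str]) -> float:
    prompt = (
        f"Proverb: {proverb}\n"
        "On a scale from 1 (strongly disagree) to 5 (strongly agree), "
        "how much do you agree with the advice in this proverb? Reply with a number."
    )
    return float(query_model(prompt).strip().split()[0])

def numerically_consistent(pair: Tuple[str, str],
                           query_model: Callable[[str], str],
                           tolerance: float = 1.0) -> bool:
    a, b = (rate(p, query_model) for p in pair)
    return abs(a - b) <= tolerance   # similar-intent proverbs should score alike
```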
Guylingo: The Republic of Guyana Creole Corpora
Clarke, Christopher, Daynauth, Roland, Wilkinson, Charlene, Devonish, Hubert, Mars, Jason
While major languages often enjoy substantial attention and resources, the linguistic diversity across the globe encompasses a multitude of smaller, indigenous, and regional languages that lack the same level of computational support. One such region is the Caribbean. While commonly labeled as "English speaking", the ex-British Caribbean region consists of a myriad of Creole languages thriving alongside English. In this paper, we present Guylingo: a comprehensive corpus designed for advancing NLP research in the domain of Creolese (Guyanese English-lexicon Creole), the most widely spoken language in the culturally rich nation of Guyana. We first outline our framework for gathering and digitizing this diverse corpus, inclusive of colloquial expressions, idioms, and regional variations in a low-resource language. We then demonstrate the challenges of training and evaluating NLP models for machine translation in Creole. Lastly, we discuss the unique opportunities presented by recent NLP advancements for accelerating the formal adoption of Creole languages as official languages in the Caribbean.
- North America > United States > Michigan > Washtenaw County > Ann Arbor (0.14)
- Atlantic Ocean > South Atlantic Ocean > Gulf of Guinea (0.04)
- Africa > Gulf of Guinea (0.04)
- (10 more...)
- Education (0.68)
- Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.46)
- Health & Medicine > Therapeutic Area > Immunology (0.46)
- Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.70)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.50)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.50)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.31)
Are Multilingual LLMs Culturally-Diverse Reasoners? An Investigation into Multicultural Proverbs and Sayings
Liu, Chen Cecilia, Koto, Fajri, Baldwin, Timothy, Gurevych, Iryna
Large language models (LLMs) are highly adept at question answering and reasoning tasks, but when reasoning in situational context, human expectations vary depending on the relevant cultural common ground. As human languages are associated with diverse cultures, LLMs should also be culturally-diverse reasoners. In this paper, we study the ability of a wide range of state-of-the-art multilingual LLMs (mLLMs) to reason with proverbs and sayings in a conversational context. Our experiments reveal that: (1) mLLMs 'know' only a limited set of proverbs, and memorizing proverbs does not mean understanding them within a conversational context; (2) mLLMs struggle to reason with figurative proverbs and sayings, especially when asked to select the wrong answer instead of the correct one; and (3) there is a "culture gap" in mLLMs when reasoning about proverbs and sayings translated from other languages. We construct and release our evaluation dataset MAPS (MulticultrAl Proverbs and Sayings) for proverb understanding in conversational context across six languages.
- North America > Canada > Ontario > Toronto (0.05)
- Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
- Europe > Germany > Hesse > Darmstadt Region > Darmstadt (0.04)
- (10 more...)