AITopics | Abdaljalil, Samir

Collaborating Authors

Abdaljalil, Samir

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

HalluVerse25: Fine-grained Multilingual Benchmark Dataset for LLM Hallucinations

Abdaljalil, Samir, Kurban, Hasan, Serpedin, Erchin

arXiv.org Artificial IntelligenceMar-10-2025

Large Language Models (LLMs) are increasingly used in various contexts, yet remain prone to generating non-factual content, commonly referred to as "hallucinations". The literature categorizes hallucinations into several types, including entity-level, relation-level, and sentence-level hallucinations. However, existing hallucination datasets often fail to capture fine-grained hallucinations in multilingual settings. In this work, we introduce HalluVerse25, a multilingual LLM hallucination dataset that categorizes fine-grained hallucinations in English, Arabic, and Turkish. Our dataset construction pipeline uses an LLM to inject hallucinations into factual biographical sentences, followed by a rigorous human annotation process to ensure data quality. We evaluate several LLMs on HalluVerse25, providing valuable insights into how proprietary models perform in detecting LLM-generated hallucinations across different contexts.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2503.07833

Country:

Europe (1.00)
Asia (1.00)
North America > United States > Texas > Brazos County > College Station (0.14)

Genre: Personal > Honors (0.95)

Industry: Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Add feedback

SINdex: Semantic INconsistency Index for Hallucination Detection in LLMs

Abdaljalil, Samir, Kurban, Hasan, Sharma, Parichit, Serpedin, Erchin, Atat, Rachad

arXiv.org Artificial IntelligenceMar-7-2025

Large language models (LLMs) are increasingly deployed across diverse domains, yet they are prone to generating factually incorrect outputs - commonly known as "hallucinations." Among existing mitigation strategies, uncertainty-based methods are particularly attractive due to their ease of implementation, independence from external data, and compatibility with standard LLMs. In this work, we introduce a novel and scalable uncertainty-based semantic clustering framework for automated hallucination detection. Our approach leverages sentence embeddings and hierarchical clustering alongside a newly proposed inconsistency measure, SINdex, to yield more homogeneous clusters and more accurate detection of hallucination phenomena across various LLMs. Evaluations on prominent open- and closed-book QA datasets demonstrate that our method achieves AUROC improvements of up to 9.3% over state-of-the-art techniques. Extensive ablation studies further validate the effectiveness of each component in our framework.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2503.0598

Country:

Asia (0.68)
North America > United States > Indiana (0.28)
North America > United States > Texas > Brazos County > College Station (0.14)

Genre: Research Report > Promising Solution (0.34)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.96)

Add feedback

ArAIEval Shared Task: Persuasion Techniques and Disinformation Detection in Arabic Text

Hasanain, Maram, Alam, Firoj, Mubarak, Hamdy, Abdaljalil, Samir, Zaghouani, Wajdi, Nakov, Preslav, Martino, Giovanni Da San, Freihat, Abed Alhakim

arXiv.org Artificial IntelligenceNov-6-2023

We present an overview of the ArAIEval shared task, organized as part of the first ArabicNLP 2023 conference co-located with EMNLP 2023. ArAIEval offers two tasks over Arabic text: (i) persuasion technique detection, focusing on identifying persuasion techniques in tweets and news articles, and (ii) disinformation detection in binary and multiclass setups over tweets. A total of 20 teams participated in the final evaluation phase, with 14 and 16 teams participating in Tasks 1 and 2, respectively. Across both tasks, we observed that fine-tuning transformer models such as AraBERT was at the core of the majority of the participating systems. We provide a description of the task setup, including a description of the dataset construction and the evaluation setup. We further give a brief overview of the participating systems. All datasets and evaluation scripts from the shared task are released to the research community. (https://araieval.gitlab.io/) We hope this will enable further research on these important tasks in Arabic.

computational linguistic, large language model, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2311.03179

Country:

Europe > Italy (0.28)
Europe > Austria > Vienna (0.14)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)

Genre: Overview (1.00)

Industry: Media > News (1.00)

Technology:

Information Technology > Communications > Social Media (0.70)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

LLMeBench: A Flexible Framework for Accelerating LLMs Benchmarking

Dalvi, Fahim, Hasanain, Maram, Boughorbel, Sabri, Mousi, Basel, Abdaljalil, Samir, Nazar, Nizi, Abdelali, Ahmed, Chowdhury, Shammur Absar, Mubarak, Hamdy, Ali, Ahmed, Hawasly, Majd, Durrani, Nadir, Alam, Firoj

arXiv.org Artificial IntelligenceAug-9-2023

The recent development and success of Large Language Models (LLMs) necessitate an evaluation of their performance across diverse NLP tasks in different languages. Although several frameworks have been developed and made publicly available, their customization capabilities for specific tasks and datasets are often complex for different users. In this study, we introduce the LLMeBench framework. Initially developed to evaluate Arabic NLP tasks using OpenAI's GPT and BLOOM models; it can be seamlessly customized for any NLP task and model, regardless of language. The framework also features zero- and few-shot learning settings. A new custom dataset can be added in less than 10 minutes, and users can use their own model API keys to evaluate the task at hand. The developed framework has been already tested on 31 unique NLP tasks using 53 publicly available datasets within 90 experimental setups, involving approximately 296K data points. We plan to open-source the framework for the community (https://github.com/qcri/LLMeBench/). A video demonstrating the framework is available online (https://youtu.be/FkQn4UjYA0s).

artificial intelligence, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2308.04945

Country:

Asia (0.14)
North America > Canada (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)

Add feedback

Detecting and Reasoning of Deleted Tweets before they are Posted

Mubarak, Hamdy, Abdaljalil, Samir, Nassar, Azza, Alam, Firoj

arXiv.org Artificial IntelligenceMay-5-2023

Social media platforms empower us in several ways, from information dissemination to consumption. While these platforms are useful in promoting citizen journalism, public awareness etc., they have misuse potentials. Malicious users use them to disseminate hate-speech, offensive content, rumor etc. to gain social and political agendas or to harm individuals, entities and organizations. Often times, general users unconsciously share information without verifying it, or unintentionally post harmful messages. Some of such content often get deleted either by the platform due to the violation of terms and policies, or users themselves for different reasons, e.g., regrets. There is a wide range of studies in characterizing, understanding and predicting deleted content. However, studies which aims to identify the fine-grained reasons (e.g., posts are offensive, hate speech or no identifiable reason) behind deleted content, are limited. In this study we address this gap, by identifying deleted tweets, particularly within the Arabic context, and labeling them with a corresponding fine-grained disinformation category. We then develop models that can predict the potentiality of tweets getting deleted, as well as the potential reasons behind deletion. Such models can help in moderating social media posts before even posting.

machine learning, natural language, tweet, (19 more...)

arXiv.org Artificial Intelligence

2305.04927

Country:

North America > United States (0.46)
Asia > Middle East > Qatar (0.14)

Genre: Research Report > New Finding (1.00)

Industry:

Media > News (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
(3 more...)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback