Moosavi, Nafise Sadat
Exploring Gender Disparities in Automatic Speech Recognition Technology
ElGhazaly, Hend, Mirheidari, Bahman, Moosavi, Nafise Sadat, Christensen, Heidi
This study investigates factors influencing the fairness and performance of Automatic Speech Recognition (ASR) systems across genders, beyond the conventional examination of demographics. Using the LibriSpeech dataset and the Whisper small model, we analyze how performance varies across different gender representations in the training data. Our findings suggest a complex interplay between the gender ratio of the training data and ASR performance: optimal fairness occurs at specific gender distributions rather than at a simple 50-50 split. We also find that factors such as pitch variability can significantly affect ASR accuracy. This research contributes to a deeper understanding of biases in ASR systems, highlighting the importance of carefully curated training data in mitigating gender bias.
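To make the evaluation setup concrete, below is a minimal sketch of how per-gender error rates could be computed for Whisper-small transcriptions of LibriSpeech utterances. The sample list, its field layout, and the preprocessing are illustrative assumptions, not the study's actual pipeline.

```python
# Minimal sketch: per-gender word error rate for Whisper-small (illustrative only).
# `samples` is a hypothetical list of (audio_path, reference_text, gender) tuples.
from collections import defaultdict

import whisper          # openai-whisper
from jiwer import wer   # word error rate

model = whisper.load_model("small")

samples = [
    ("librispeech/1089-134686-0000.flac", "he hoped there would be stew for dinner", "M"),
    # ... more (audio, reference, gender) entries
]

refs, hyps = defaultdict(list), defaultdict(list)
for audio_path, reference, gender in samples:
    hypothesis = model.transcribe(audio_path)["text"].lower().strip()
    refs[gender].append(reference.lower())
    hyps[gender].append(hypothesis)

# Compare error rates across gender groups
for gender in refs:
    print(gender, wer(refs[gender], hyps[gender]))
```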
Transforming Science with Large Language Models: A Survey on AI-assisted Scientific Discovery, Experimentation, Content Generation, and Evaluation
Eger, Steffen, Cao, Yong, D'Souza, Jennifer, Geiger, Andreas, Greisinger, Christian, Gross, Stephanie, Hou, Yufang, Krenn, Brigitte, Lauscher, Anne, Li, Yizhi, Lin, Chenghua, Moosavi, Nafise Sadat, Zhao, Wei, Miller, Tristan
With the advent of large multimodal language models, science is now at the threshold of an AI-based technological transformation. Recently, a plethora of new AI models and tools has been proposed, promising to empower researchers and academics worldwide to conduct their research more effectively and efficiently. This includes all aspects of the research cycle, especially (1) searching for relevant literature; (2) generating research ideas and conducting experimentation; generating (3) text-based and (4) multimodal content (e.g., scientific figures and diagrams); and (5) AI-based automatic peer review. In this survey, we provide an in-depth overview of these exciting recent developments, which promise to fundamentally alter the scientific research process for good. Our survey covers the five aspects outlined above, indicating relevant datasets, methods, and results (including evaluation), as well as limitations and scope for future research. Ethical concerns regarding the shortcomings of these tools and their potential for misuse (fake science, plagiarism, harms to research integrity) take a particularly prominent place in our discussion. We hope that our survey will not only become a reference guide for newcomers to the field but also a catalyst for new AI-based initiatives in the area of "AI4Science".
Exploring the Influence of Label Aggregation on Minority Voices: Implications for Dataset Bias and Model Training
Pandya, Mugdha, Moosavi, Nafise Sadat, Maynard, Diana
Resolving disagreement in manual annotation typically involves removing unreliable annotators and applying a label aggregation strategy, such as majority vote or expert adjudication. These strategies may have the side-effect of silencing or under-representing minority but equally valid opinions. In this paper, we study the impact of standard label aggregation strategies on minority opinion representation in sexism detection. We investigate the quality and value of minority annotations, and then examine their effect on the class distributions of the gold labels, as well as on the behaviour of models trained on the resulting datasets. Finally, we discuss the potential biases introduced by each method and how they can be amplified by the models.
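As a concrete illustration of the phenomenon studied here, the following sketch shows how majority-vote aggregation can erase minority labels and shift the class distribution; the toy annotations and tie-breaking rule are assumptions for illustration only.

```python
# Minimal sketch: majority-vote aggregation silencing minority annotations (illustrative).
# Each item has several annotator labels for sexism (1) vs. not sexism (0).
from collections import Counter

annotations = {
    "item_1": [1, 0, 0, 0],   # one annotator flags sexism, three do not
    "item_2": [1, 1, 0, 1],
    "item_3": [1, 0, 1, 0],   # tie between the two classes
}

def majority_vote(labels, tie_break=0):
    counts = Counter(labels).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return tie_break      # ties resolved toward the negative class here
    return counts[0][0]

gold = {item: majority_vote(labels) for item, labels in annotations.items()}
print(gold)  # the dissenting vote on item_1 and the tie on item_3 disappear

# Class distribution before vs. after aggregation
raw = Counter(label for labels in annotations.values() for label in labels)
agg = Counter(gold.values())
print("raw label distribution:", raw, "aggregated:", agg)
```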
Rolling the DICE on Idiomaticity: How LLMs Fail to Grasp Context
Mi, Maggie, Villavicencio, Aline, Moosavi, Nafise Sadat
Human processing of idioms relies on understanding the contextual sentences in which idioms occur, as well as language-intrinsic features such as frequency and speaker-intrinsic factors such as familiarity. While LLMs have shown high performance on idiomaticity detection tasks, this success may be attributed to reasoning shortcuts in existing datasets. To investigate this, we construct a novel, controlled contrastive dataset designed to test whether LLMs can effectively use context to disambiguate idiomatic meaning. Additionally, we explore how collocational frequency and sentence probability influence model performance. Our findings reveal that LLMs often fail to resolve idiomaticity when doing so requires attending to the surrounding context, and that models perform better on sentences with higher likelihood. The collocational frequency of expressions also impacts performance. We make our code and dataset publicly available.
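One of the factors examined, sentence probability, can be estimated with an off-the-shelf causal language model. The sketch below is one illustrative way to score sentences; the model choice (GPT-2) and the example sentences are assumptions, not the paper's setup.

```python
# Minimal sketch: average token log-probability of a sentence under a causal LM (illustrative).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def sentence_log_prob(sentence: str) -> float:
    """Average token log-probability under the LM (higher = more likely)."""
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs, labels=inputs["input_ids"])
    return -out.loss.item()  # loss is the mean negative log-likelihood per token

literal = "After the meeting she spilled the beans all over the kitchen floor."
idiomatic = "Under pressure from reporters, she finally spilled the beans about the merger."
print(sentence_log_prob(literal), sentence_log_prob(idiomatic))
```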
How to Leverage Digit Embeddings to Represent Numbers?
Sivakumar, Jasivan Alex, Moosavi, Nafise Sadat
Beyond performing arithmetic operations, understanding numbers themselves remains a challenge for existing language models. Simple generalisations, such as solving 100+200 instead of 1+2, can substantially affect model performance (Sivakumar and Moosavi, 2023). Among various techniques, character-level embeddings of numbers have emerged as a promising approach to improving number representation. However, this method has limitations, as it leaves the task of aggregating digit representations to the model, which lacks direct supervision for this process. In this paper, we explore the use of mathematical priors to compute aggregated digit embeddings and explicitly incorporate these aggregates into transformer models. This can be achieved either by adding a special token to the input embeddings or by introducing an additional loss function to enhance correct predictions. We evaluate the effectiveness of incorporating this explicit aggregation, analysing its strengths and shortcomings, and discuss future directions to better benefit from this approach. Our methods, while simple, are compatible with any pretrained model and require only a few lines of code, which we have made publicly available.
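The following sketch illustrates one possible mathematical prior for aggregation, a place-value-weighted sum of per-digit vectors; the specific weighting and the way the aggregate is injected into the model are assumptions for illustration, not necessarily the paper's exact formulation.

```python
# Minimal sketch: aggregating digit embeddings with a place-value prior (illustrative).
import torch
import torch.nn as nn

embedding_dim = 16
digit_embeddings = nn.Embedding(10, embedding_dim)  # one vector per digit 0-9

def aggregate_number(number: str) -> torch.Tensor:
    """Combine digit embeddings into a single vector using place-value weights."""
    digits = torch.tensor([int(d) for d in number])
    # weight each digit by its normalised place value, e.g. "203" -> [100, 10, 1] / 111
    place_values = torch.tensor([10.0 ** i for i in range(len(number) - 1, -1, -1)])
    weights = place_values / place_values.sum()
    return (weights.unsqueeze(1) * digit_embeddings(digits)).sum(dim=0)

# The aggregate could then be added as an extra "number token" embedding in the
# transformer input, or used as a target in an auxiliary loss.
vec = aggregate_number("203")
print(vec.shape)  # torch.Size([16])
```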
Decoding News Narratives: A Critical Analysis of Large Language Models in Framing Detection
Pastorino, Valeria, Sivakumar, Jasivan A., Moosavi, Nafise Sadat
Previous studies on framing have relied on manual analysis or on fine-tuning models with limited annotated datasets. However, pre-trained models, with their diverse training backgrounds, offer a promising alternative. This paper presents a comprehensive analysis of GPT-4, GPT-3.5 Turbo, and FLAN-T5 in detecting framing in news headlines. We evaluated these models in various scenarios: zero-shot, few-shot with in-domain examples, few-shot with cross-domain examples, and settings where models explain their predictions. Our results show that explainable predictions lead to more reliable outcomes. GPT-4 performed exceptionally well in few-shot settings but often misinterpreted emotional language as framing, highlighting a significant challenge. Additionally, the results suggest that consistent predictions across multiple models could help identify potential annotation inaccuracies in datasets. Finally, we propose a new small dataset for real-world evaluation on headlines from a diverse set of topics.
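For readers unfamiliar with the setup, the sketch below shows what zero-shot framing detection with one of the evaluated models (FLAN-T5) might look like; the prompt wording and label set are illustrative assumptions rather than the prompts used in the study.

```python
# Minimal sketch: zero-shot framing detection on a headline with FLAN-T5 (illustrative prompt).
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")

headline = "Critics slam new policy as a reckless gamble with taxpayers' money"
prompt = (
    "Does the following news headline use framing, i.e. present the story from a "
    "particular perspective rather than neutrally reporting facts? "
    "Answer 'framed' or 'not framed' and briefly explain why.\n"
    f"Headline: {headline}"
)

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```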
Beyond Hate Speech: NLP's Challenges and Opportunities in Uncovering Dehumanizing Language
Zhang, Hezhao, Harris, Lasana, Moosavi, Nafise Sadat
Dehumanization, characterized as a subtle yet harmful manifestation of hate speech, involves denying individuals their human qualities and often results in violence against marginalized groups. Despite significant progress in Natural Language Processing across various domains, its application to detecting dehumanizing language remains limited, largely due to the scarcity of publicly available annotated data for this domain. This paper evaluates the performance of cutting-edge NLP models, including GPT-4, GPT-3.5, and LLAMA-2, in identifying dehumanizing language. Our findings reveal that while these models demonstrate potential, achieving 70% accuracy in distinguishing dehumanizing language from broader hate speech, they also display biases: they are over-sensitive, classifying other forms of hate speech as dehumanization for a specific subset of target groups, while more frequently failing to identify clear cases of dehumanization for other target groups. Moreover, leveraging one of the best-performing models, we automatically annotated a larger dataset for training more accessible models. However, our findings indicate that these models do not yet produce annotations of sufficiently high quality for this task.
LLMs as Narcissistic Evaluators: When Ego Inflates Evaluation Scores
Liu, Yiqi, Moosavi, Nafise Sadat, Lin, Chenghua
Automatic evaluation of generated textual content presents an ongoing challenge within the field of NLP. Given the impressive capabilities of modern language models (LMs) across diverse NLP tasks, there is a growing trend to employ these models in creating innovative evaluation metrics for the automated assessment of generation tasks. This paper investigates a pivotal question: do language-model-driven evaluation metrics inherently exhibit bias favoring texts generated by the same underlying language model? Specifically, we assess whether prominent LM-based evaluation metrics--namely, BARTScore, T5Score, and GPTScore--demonstrate a favorable bias toward their respective underlying LMs in the context of summarization tasks. Our findings unveil a latent bias, particularly pronounced when such evaluation metrics are used in a reference-free manner, without leveraging gold summaries. These results underscore that assessments provided by generative evaluation models can be influenced by factors beyond inherent text quality, highlighting the need for more dependable evaluation protocols in the future.
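As background on how such metrics work, the following is a rough, reference-free, BARTScore-style scorer: the summary is scored by the average log-likelihood a seq2seq model assigns to it conditioned on the source. It is a simplified sketch, not the official BARTScore implementation, and the example texts are invented.

```python
# Minimal sketch: reference-free, BARTScore-style scoring (illustrative).
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/bart-large-cnn")
model = AutoModelForSeq2SeqLM.from_pretrained("facebook/bart-large-cnn")
model.eval()

def lm_score(source: str, summary: str) -> float:
    """Average log-likelihood of the summary conditioned on the source (higher = better)."""
    enc = tokenizer(source, return_tensors="pt", truncation=True)
    labels = tokenizer(text_target=summary, return_tensors="pt", truncation=True)["input_ids"]
    with torch.no_grad():
        loss = model(**enc, labels=labels).loss  # mean negative log-likelihood per token
    return -loss.item()

# A summary generated by the same underlying LM may receive a higher score than a
# human-written one, which is the self-preference bias examined in the paper.
print(lm_score("The council approved the budget after a long debate over spending cuts.",
               "The council approved the budget."))
```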
Learning From Free-Text Human Feedback -- Collect New Datasets Or Extend Existing Ones?
Petrak, Dominic, Moosavi, Nafise Sadat, Tian, Ye, Rozanov, Nikolai, Gurevych, Iryna
Learning from free-text human feedback is essential for dialog systems, but annotated data is scarce and usually covers only a small fraction of the error types known in conversational AI. Instead of collecting and annotating new datasets from scratch, recent advances in synthetic dialog generation could be used to augment existing dialog datasets with the necessary annotations. However, to assess the feasibility of such an effort, it is important to know the types and frequency of free-text human feedback included in these datasets. In this work, we investigate this question for a variety of commonly used dialog datasets, including MultiWOZ, SGD, bAbI, PersonaChat, Wizard-of-Wikipedia, and the human-bot split of the Self-Feeding Chatbot. Using our observations, we derive new taxonomies for the annotation of free-text human feedback in dialogs and investigate the impact of including such data in response generation for three state-of-the-art language generation models: GPT-2, LLaMA, and Flan-T5. Our findings provide new insights into the composition of the datasets examined, including error types, user response types, and the relations between them.
Arithmetic-Based Pretraining -- Improving Numeracy of Pretrained Language Models
Petrak, Dominic, Moosavi, Nafise Sadat, Gurevych, Iryna
State-of-the-art pretrained language models tend to perform below their capabilities when applied out of the box to tasks that require understanding and working with numbers. Recent work suggests two main reasons for this: (1) popular tokenisation algorithms have limited expressiveness for numbers, and (2) common pretraining objectives do not target numeracy. Approaches that address these shortcomings usually require architectural changes or pretraining from scratch. In this paper, we propose a new extended pretraining approach, called Arithmetic-Based Pretraining, that jointly addresses both shortcomings in one extended pretraining step without requiring architectural changes or pretraining from scratch. Arithmetic-Based Pretraining combines contrastive learning, to improve number representations, with a novel extended pretraining objective, the Inferable Number Prediction Task, to improve numeracy. Our experiments show the effectiveness of Arithmetic-Based Pretraining on three tasks that require improved numeracy: reading comprehension on the DROP dataset, inference on tables on the InfoTabs dataset, and table-to-text generation on the WikiBio and SciGen datasets.
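The contrastive component can be illustrated with a short sketch: representations of the same value written in different surface forms are pulled together, while other values in the batch serve as negatives. The encoder, pairing strategy, and temperature below are assumptions for illustration, not the paper's configuration.

```python
# Minimal sketch: in-batch contrastive loss over number representations (illustrative).
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")

def embed(texts):
    enc = tokenizer(texts, return_tensors="pt", padding=True)
    return encoder(**enc).last_hidden_state[:, 0]  # [CLS] vectors

anchors   = embed(["2.5", "1200", "0.75"])
positives = embed(["2.50", "1,200", "3/4"])   # same values, different surface forms

def info_nce(anchors, positives, temperature=0.1):
    a = F.normalize(anchors, dim=-1)
    p = F.normalize(positives, dim=-1)
    logits = a @ p.T / temperature            # in-batch negatives off the diagonal
    targets = torch.arange(a.size(0))
    return F.cross_entropy(logits, targets)

loss = info_nce(anchors, positives)
loss.backward()  # in extended pretraining this would be combined with the
                 # Inferable Number Prediction objective
```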