 dall


"Draw me a curator" Examining the visual stereotyping of a cultural services profession by generative AI

Spennemann, Dirk HR

arXiv.org Artificial Intelligence

Based on 230 visualisations, this paper examines the depiction of museum curators by the popular generative Artificial Intelligence (AI) model, ChatGPT4o. While the AI-generated representations do not reiterate popular stereotypes of curators as nerdy, conservative in dress and stuck in time rummaging through collections, they contrast sharply with real-world demographics. AI-generated imagery extremely underrepresents women (3.5% vs 49% to 72% in reality) and disregards ethnic communities other than Caucasian (0% vs 18% to 36%). It not only over-represents young curators (79% vs approx. 27%) but also renders curators to resemble yuppie professionals or people featuring in fashion advertising. Stereotypical attributes are prevalent, with curators widely depicted as wearing beards and holding clipboards or digital tablets. The findings highlight biases in the generative AI image creation dataset, which is poised to shape an inaccurate portrayal of museum professionals if the images were to be taken uncritically at face value.


Prompt fidelity of ChatGPT4o / Dall-E3 text-to-image visualisations

Spennemann, Dirk HR

arXiv.org Artificial Intelligence

This study examines the prompt fidelity of ChatGPT4o / DALL-E3 text-to-image visualisations by analysing whether attributes explicitly specified in autogenously generated prompts are correctly rendered in the resulting images. Using two public-domain datasets comprising 200 visualisations of women working in the cultural and creative industries and 230 visualisations of museum curators, the study assessed accuracy across personal attributes (age, hair), appearance (attire, glasses), and paraphernalia (name tags, clipboards). While correctly rendered in most cases, DALL-E3 deviated from prompt specifications in 15.6% of all attributes (n=710). Errors were lowest for paraphernalia, moderate for personal appearance, and highest for depictions of the person themselves, particularly age. These findings demonstrate measurable prompt-to-image fidelity gaps with implications for bias detection and model evaluation.


Gender Stereotypes in Professional Roles Among Saudis: An Analytical Study of AI-Generated Images Using Language Models

AlKhalifah, Khaloud S., Mashaabi, Malak, Al-Khalifa, Hend

arXiv.org Artificial Intelligence

This study investigates the extent to which contemporary Text-to-Image artificial intelligence (AI) models perpetuate gender stereotypes and cultural inaccuracies when generating depictions of professionals in Saudi Arabia. We analyzed 1,006 images produced by ImageFX, DALL-E V3, and Grok for 56 diverse Saudi professions using neutral prompts. Two trained Saudi annotators evaluated each image on five dimensions: perceived gender, clothing and appearance, background and setting, activities and interactions, and age. A third senior researcher adjudicated whenever the two primary raters disagreed, yielding 10,100 individual judgements. The results reveal a strong gender imbalance, with ImageFX outputs being 85% male, Grok 86.6% male, and DALL-E V3 96% male, indicating that DALL-E V3 exhibited the strongest overall gender stereotyping. This imbalance was most evident in leadership and technical roles. Moreover, cultural inaccuracies in clothing, settings, and depicted activities were frequently observed across all three models. Counter-stereotypical images often arise from cultural misinterpretations rather than genuinely progressive portrayals. We conclude that current models mirror societal biases embedded in their training data, generated by humans, offering only a limited reflection of the Saudi labour market's gender dynamics and cultural nuances. These findings underscore the urgent need for more diverse training data, fairer algorithms, and culturally sensitive evaluation frameworks to ensure equitable and authentic visual outputs.
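
The annotation scheme described in this abstract (two primary raters, with a third senior rater deciding only on disagreement) can be illustrated with a minimal Python sketch. The function names and the toy labels below are hypothetical and are not taken from the study; the sketch only shows one plausible way to resolve labels and compute a per-model male share.

```python
from typing import List, Optional


def adjudicate(rater_a: str, rater_b: str, senior: Optional[str] = None) -> str:
    """Return the final label for one image on one dimension.

    If the two primary raters agree, their label stands; otherwise the
    senior rater's label is used (hypothetical resolution rule).
    """
    if rater_a == rater_b:
        return rater_a
    if senior is None:
        raise ValueError("raters disagree; a senior adjudication is required")
    return senior


def male_share(labels: List[str]) -> float:
    """Fraction of final labels equal to 'male'."""
    if not labels:
        raise ValueError("no labels supplied")
    return sum(1 for label in labels if label == "male") / len(labels)


# Toy data for illustration only (not the study's annotations).
final_labels = [
    adjudicate("male", "male"),
    adjudicate("male", "female", senior="female"),
    adjudicate("male", "male"),
    adjudicate("female", "male", senior="male"),
]
print(male_share(final_labels))
```

In this sketch the senior label is consulted only when the primary raters differ, mirroring the adjudication step the abstract describes.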


HRS-Bench: Holistic, Reliable and Scalable Benchmark for Text-to-Image Models

Bakr, Eslam Mohamed, Sun, Pengzhan, Shen, Xiaoqian, Khan, Faizan Farooq, Li, Li Erran, Elhoseiny, Mohamed

arXiv.org Artificial Intelligence

In recent years, Text-to-Image (T2I) models have been extensively studied, especially with the emergence of diffusion models that achieve state-of-the-art results on T2I synthesis tasks. However, existing benchmarks heavily rely on subjective human evaluation, limiting their ability to holistically assess the model's capabilities. Furthermore, there is a significant gap between efforts in developing new T2I architectures and those in evaluation. To address this, we introduce HRS-Bench, a concrete evaluation benchmark for T2I models that is Holistic, Reliable, and Scalable. Unlike existing benchmarks that focus on limited aspects, HRS-Bench measures 13 skills that can be categorized into five major categories: accuracy, robustness, generalization, fairness, and bias. In addition, HRS-Bench covers 50 scenarios, including fashion, animals, transportation, food, and clothes. We evaluate nine recent large-scale T2I models using metrics that cover a wide range of skills. A human evaluation aligned with 95% of our evaluations on average was conducted to probe the effectiveness of HRS-Bench. Our experiments demonstrate that existing models often struggle to generate images with the desired count of objects, visual text, or grounded emotions. We hope that our benchmark helps ease future text-to-image generation research. The code and data are available at https://eslambakr.github.io/hrsbench.github.io


Neural networks for learning personality traits from natural language

Adorni, Giorgia

arXiv.org Artificial Intelligence

Personality is considered one of the most influential research topics in psychology, as it predicts many consequential outcomes such as mental and physical health and explains human behaviour. With the widespread use of social networks as a means of communication, it is becoming increasingly important to develop models that can automatically and accurately read the essence of individuals based solely on their writing. In particular, the convergence of social and computer sciences has led researchers to develop automatic approaches for extracting and studying "hidden" information in textual data on the internet. The nature of this thesis project is highly experimental, and the motivation behind this work is to present detailed analyses on the topic, as currently there are no significant investigations of this kind. The objective is to identify an adequate semantic space that allows for defining the personality of the object to which a certain text refers. The starting point is a dictionary of adjectives that psychological literature defines as markers of the five major personality traits, or Big Five. In this work, we started with the implementation of fully-connected neural networks as a basis for understanding how simple deep learning models can provide information on hidden personality characteristics. Finally, we use a class of distributional algorithms invented in 2013 by Tomas Mikolov, which consists of using a convolutional neural network that learns the contexts of words in an unsupervised way. In this way, we construct an embedding that contains the semantic information on the text, obtaining a kind of "geometry of meaning" in which concepts are translated into linear relationships. With this last experiment, we hypothesize that an individual writing style is largely coupled with their personality traits.


Key developments in data science, machine learning, AI, and analytics in 2022 (2/2)

#artificialintelligence

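1. Key developments in data science, machine learning, AI, and analytics in 2022 (2/2): Summary ・Adoption of low-code/no-code data science platforms is advancing ・Generative art has made it possible to create high-quality art using language ・Cha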


Generative AI (1/2): the new wave of AI is coming

#artificialintelligence

While everybody was focused on crypto and web3 during the last two years, something that might have an impact of the same magnitude on the web, and perhaps even more, was taking root behind the scenes: Generative AI. But it's during the past months that everything seemed to accelerate. It's like every hope we had for AI in the last 20 years has come 10x closer to reality in a matter of weeks. This article is the first of a series of two Medium posts regarding Generative AI. Today I'll focus on explaining what it is, how it works, how it emerged and what could be the underlying use cases.


Four AI trends to watch in 2023

#artificialintelligence

The launch of ChatGPT and GPT 3.5 (Generative Pre-trained Transformer 3.5), which many claim will herald a new era in dialogue-based conversational AI, has ended the year on a high for conversational AI. People are using ChatGPT for tasks ranging from correcting code errors to rewriting the Bohemian Rhapsody, and the number of ChatGPT users surpassed the million mark in less than a week last month. While 2022 was about newer and more advanced tools and models, commercial use cases, regulation, and standardisation of AI are expected to define 2023 for this domain. Here's what to expect from the AI industry in 2023. Generative AI, which is artificial intelligence that can create text, images, videos etc. without supervision, set the tone for this year and the trend will spill over into 2023 as well.


Futuristic cars according to artificial intelligence

#artificialintelligence

These are the craziest, most futuristic cars created by artificial intelligence tool, DALL.E. DALL.E is a new AI system that can artificially draw some of the most realistic images based on whatever keywords you feed it. We plugged in some crazy search terms, like 'armored Ferrari', 'off-road Bugatti', and 'futuristic flying car', and this is what it came up with. While some are super realistic, some are just out-of-this-world bizarre. First, we asked DALL.E to show us what it believes the future of flying cars will look like.


Can Text-to-Image AI Learn Ethics --or Is the Future Doomed?

#artificialintelligence

Text-to-image AI generation tools have entered their wild wild west phase. The sweeping trend which OpenAI's DALL.E 2 started with great caution has drastically turned into a world where anything goes. Last week, London and Los Altos-based startup Stability.ai Comparable in quality to DALL.E 2 and Midjourney, the implications of the step taken by Stability.ai Moreover, Stable Diffusion, unlike its predecessors, has next to no restrictions barring users from generating images with inappropriate content or prominent personalities.