Beyond Top Activations: Efficient and Reliable Crowdsourced Evaluation of Automated Interpretability

Oikarinen, Tuomas, Yan, Ge, Kulkarni, Akshay, Weng, Tsui-Wei

arXiv.org Artificial Intelligence

Interpreting individual neurons or directions in activation space is an important topic in mechanistic interpretability. Numerous automated interpretability methods have been proposed to generate such explanations, but it remains unclear how reliable these explanations are, and which methods produce the most accurate descriptions. While crowdsourced evaluations are commonly used, existing pipelines are noisy, costly, and typically assess only the highest-activating inputs, leading to unreliable results. In this paper, we introduce two techniques to enable cost-effective and accurate crowdsourced evaluation of automated interpretability methods beyond top-activating inputs. First, we propose Model-Guided Importance Sampling (MG-IS) to select the most informative inputs to show human raters. In our experiments, we show this reduces the number of inputs needed to reach the same evaluation accuracy by ~13x. Second, we address label noise in crowdsourced ratings through Bayesian Rating Aggregation (BRAgg), which allows us to reduce the number of ratings per input required to overcome noise by ~3x. Together, these techniques reduce the evaluation cost by ~40x, making large-scale evaluation feasible. Finally, we use our methods to conduct a large-scale crowdsourced study comparing recent automated interpretability methods for vision networks.
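The abstract names two components: importance sampling to choose which inputs raters see, and Bayesian aggregation to denoise crowd ratings. As a rough illustration only (not the paper's actual MG-IS or BRAgg, whose details are not given in this abstract), generic versions of both ideas might look like:

```python
import random

def sample_informative_inputs(inputs, informativeness, k, rng=random):
    # Sample k inputs with probability proportional to a model-derived
    # informativeness score. How MG-IS scores informativeness is not
    # specified in the abstract; any nonnegative weights work here.
    total = sum(informativeness)
    weights = [w / total for w in informativeness]
    return rng.choices(inputs, weights=weights, k=k)

def bayesian_aggregate(ratings, prior_pos=1.0, prior_neg=1.0):
    # Beta-Binomial posterior mean over binary agree/disagree ratings:
    # a simple stand-in for rating aggregation under label noise.
    pos = sum(ratings)
    neg = len(ratings) - pos
    return (pos + prior_pos) / (pos + neg + prior_pos + prior_neg)
```

With a uniform Beta(1, 1) prior, three positive ratings and one negative give a posterior mean of 4/6 ≈ 0.67 rather than the raw 0.75, illustrating how a prior tempers conclusions drawn from few noisy ratings.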


Stable diffusion models reveal a persisting human and AI gap in visual creativity

Rondini, Silvia, Alvarez-Martin, Claudia, Angermair-Barkai, Paula, Penacchio, Olivier, Paz, M., Pelowski, Matthew, Dediu, Dan, Rodriguez-Fornells, Antoni, Cerda-Company, Xim

arXiv.org Artificial Intelligence

While recent research suggests Large Language Models match human creative performance in divergent thinking tasks, visual creativity remains underexplored. This study compared image generation by human participants (Visual Artists and Non-Artists) and by an image-generation AI model (two prompting conditions with varying human input: high for Human-Inspired, low for Self-Guided). Human raters (N=255) and GPT-4o evaluated the creativity of the resulting images. We found a clear creativity gradient, with Visual Artists being the most creative, followed by Non-Artists, then Human-Inspired generative AI, and finally Self-Guided generative AI. Increased human guidance strongly improved GenAI's creative output, bringing its productions close to those of Non-Artists. Notably, human and AI raters also showed vastly different creativity judgment patterns. These results suggest that, in contrast to language-centered tasks, GenAI models may face unique challenges in visual domains, where creativity depends on perceptual nuance and contextual sensitivity, distinctly human capacities that may not be readily transferable from language models.


Meet the AI workers who tell their friends and family to stay away from AI

The Guardian

AI workers said they distrust the models they work on because of a consistent emphasis on rapid turnaround time at the expense of quality. Krista Pawloski remembers the single defining moment that shaped her opinion on the ethics of artificial intelligence. As an AI worker on Amazon Mechanical Turk - a marketplace that allows companies to hire workers to perform tasks like entering data or matching an AI prompt with its output - Pawloski spends her time moderating and assessing the quality of AI-generated text, images and videos, as well as some fact-checking. Roughly two years ago, while working from home at her dining room table, she took up a job designating tweets as racist or not. When she was presented with a tweet that read "Listen to that mooncricket sing", she almost clicked on the "no" button before deciding to check the meaning of the word "mooncricket", which, to her surprise, was a racial slur against Black Americans.