I Teach Middle Schoolers. I'm Seeing Something in the Kids That's Getting Worse Every Year.
Good Job is Slate's advice column on work. I have been an eighth-grade teacher for seven years now and am beginning to think I made a terrible mistake in choosing my profession. The kids I teach are rude and feral. They refuse to read or to treat others with the slightest bit of decency, give up at the first sign of difficulty, and have the attention span of goldfish.
Investigating Gender Bias in Language Models Using Causal Mediation Analysis
Vig, Jesse, Gehrmann, Sebastian, Qian, Sharon
Many interpretation methods for neural models in natural language processing investigate how information is encoded inside hidden representations. However, these methods can only measure whether the information exists, not whether it is actually used by the model. We propose a methodology grounded in the theory of causal mediation analysis for interpreting which parts of a model are causally implicated in its behavior. The approach enables us to analyze the mechanisms that facilitate the flow of information from input to output through various model components, known as mediators. As a case study, we apply this methodology to analyzing gender bias in pre-trained Transformer language models. We study the role of individual neurons and attention heads in mediating gender bias across three datasets designed to gauge a model's sensitivity to gender bias. Our mediation analysis reveals that gender bias effects are concentrated in specific components of the model that may exhibit highly specialized behavior.
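The total/direct/indirect decomposition that this methodology rests on can be illustrated on a toy model. This is a generic sketch of causal mediation analysis, not the paper's code: `f` (the mediator, standing in for a neuron or attention head), `g` (the output), and the input values are all invented for illustration.

```python
# Toy causal mediation analysis: y = g(x, m) with mediator m = f(x).
# All functions and values here are hypothetical illustrations.

def f(x):
    # mediator: some internal representation computed from the input
    return 2.0 * x

def g(x, m):
    # output depends on the input directly and via the mediator
    return x + 3.0 * m

def total_effect(x, x_cf):
    # change the input outright and compare outputs
    return g(x_cf, f(x_cf)) - g(x, f(x))

def indirect_effect(x, x_cf):
    # hold the input at x, but set the mediator to its counterfactual value
    return g(x, f(x_cf)) - g(x, f(x))

def direct_effect(x, x_cf):
    # change the input, but freeze the mediator at its original value
    return g(x_cf, f(x)) - g(x, f(x))

x, x_cf = 1.0, 2.0
te = total_effect(x, x_cf)     # 7.0
ie = indirect_effect(x, x_cf)  # 6.0: the part flowing through the mediator
de = direct_effect(x, x_cf)    # 1.0: the part bypassing the mediator
assert abs(te - (ie + de)) < 1e-9  # exact decomposition in this linear toy
```

In the actual setting, `x_cf` would be an intervention on the prompt (e.g. swapping a stereotypical occupation word), and "setting the mediator" would mean overwriting a neuron's or attention head's activation with its value from the counterfactual run.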
Teacher quits profession after viral rant on how AI is 'ruining' education
Hannah, a former teacher, joins 'Fox & Friends' to explain why she left the classroom, saying AI tools are making it difficult to teach. A former high school English teacher went viral this week after posting a candid video on social media announcing she was quitting the teaching profession because of how technology was "ruining" education. In her video, which reached over 1 million views on TikTok, Hannah explained how AI tools have made teaching more difficult: students rely on technology to do the work for them and are unmotivated to put in effort themselves. She said that kids do not know how to read because of read-aloud tools, and that they have short attention spans because of the "high stimulation" of social media. "They want to use [technology] for entertainment. They don't want to use it for education," she said.
From Structured Prompts to Open Narratives: Measuring Gender Bias in LLMs Through Open-Ended Storytelling
Chen, Evan, Zhan, Run-Jun, Lin, Yan-Bai, Chen, Hung-Hsuan
Large Language Models (LLMs) have revolutionized natural language processing, yet concerns persist regarding their tendency to reflect or amplify social biases present in their training data. This study introduces a novel evaluation framework to uncover gender biases in LLMs, focusing on their occupational narratives. Unlike previous methods relying on structured scenarios or carefully crafted prompts, our approach leverages free-form storytelling to reveal biases embedded in the models. Systematic analyses show an overrepresentation of female characters across occupations in six widely used LLMs. Additionally, our findings reveal that LLM-generated occupational gender rankings align more closely with human stereotypes than actual labor statistics. These insights underscore the need for balanced mitigation strategies to ensure fairness while avoiding the reinforcement of new stereotypes.
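The storytelling-based measurement could be sketched as follows: generate free-form stories per occupation and tally gendered pronouns. This is a hypothetical illustration, not the authors' pipeline: `generate_story` is a stub standing in for a real LLM call, and pronoun counting is only one possible gender-attribution heuristic.

```python
import re
from collections import Counter

# Pronoun sets used as a simple gender-attribution heuristic (an assumption,
# not the paper's exact method).
FEMALE = {"she", "her", "hers", "herself"}
MALE = {"he", "him", "his", "himself"}

def generate_story(occupation):
    # Stub for an open-ended LLM prompt such as
    # "Write a short story about a {occupation}."
    return "She checked the chart before her shift ended."

def pronoun_counts(text):
    # Tokenize to lowercase words and count female/male pronouns.
    tokens = re.findall(r"[a-z']+", text.lower())
    c = Counter(tokens)
    return (sum(c[w] for w in FEMALE), sum(c[w] for w in MALE))

def female_share(occupation, n_stories=3):
    # Aggregate pronoun counts over several generations; None if no pronouns.
    f = m = 0
    for _ in range(n_stories):
        df, dm = pronoun_counts(generate_story(occupation))
        f, m = f + df, m + dm
    return f / (f + m) if (f + m) else None

print(female_share("nurse"))  # 1.0 with the stub story above
```

Ranking occupations by this share, and correlating the ranking against human stereotype ratings and labor statistics, would reproduce the kind of comparison the abstract describes.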
EmpathyAgent: Can Embodied Agents Conduct Empathetic Actions?
Chen, Xinyan, Ge, Jiaxin, Dai, Hongming, Zhou, Qiang, Feng, Qiuxuan, Hu, Jingtong, Wang, Yizhou, Liu, Jiaming, Zhang, Shanghang
Empathy is fundamental to human interactions, yet it remains unclear whether embodied agents can provide human-like empathetic support. Existing works have studied agents' task-solving and social-interaction abilities, but whether agents can understand empathetic needs and carry out empathetic behaviors remains overlooked. To address this, we introduce EmpathyAgent, the first benchmark to evaluate and enhance agents' empathetic actions across diverse scenarios. EmpathyAgent contains 10,000 multimodal samples with corresponding empathetic task plans and three different challenges. To systematically evaluate the agents' empathetic actions, we propose an empathy-specific evaluation suite that assesses the agents' empathy process. We benchmark current models and find that exhibiting empathetic actions remains a significant challenge. Meanwhile, we train Llama3-8B using EmpathyAgent and find that it can potentially enhance empathetic behavior. By establishing a standard benchmark for evaluating empathetic actions, we hope to advance research in empathetic embodied agents. Our code and data are publicly available at https://github.com/xinyan-cxy/EmpathyAgent.
No LLM is Free From Bias: A Comprehensive Study of Bias Evaluation in Large Language models
Kumar, Charaka Vinayak, Urlana, Ashok, Kanumolu, Gopichand, Garlapati, Bala Mallikarjunarao, Mishra, Pruthwik
Advancements in Large Language Models (LLMs) have improved performance on a range of natural language understanding and generation tasks. Although LLMs have achieved state-of-the-art performance on various tasks, they often reflect different forms of bias present in their training data. In light of this limitation, we provide a unified evaluation across benchmarks using a set of representative LLMs, covering forms of bias ranging from physical characteristics to socio-economic categories. Moreover, we propose five prompting approaches to carry out the bias detection task across different aspects of bias. Further, we formulate three research questions to gain insight into detecting biases in LLMs using different approaches and evaluation metrics across benchmarks. The results indicate that each of the selected LLMs suffers from one form of bias or another, with the LLaMA3.1-8B model being the least biased. Finally, we conclude with the identification of key challenges and possible future directions.