Analyzing Nobel Prize Literature with Large Language Models
Zhenyuan Yang, Zhengliang Liu, Jing Zhang, Cen Lu, Jiaxin Tai, Tianyang Zhong, Yiwei Li, Siyan Zhao, Teng Yao, Qing Liu, Jinlin Yang, Qixin Liu, Zhaowei Li, Kexin Wang, Longjun Ma, Dajiang Zhu, Yudan Ren, Bao Ge, Wei Zhang, Ning Qiang, Tuo Zhang, Tianming Liu
This study examines the capabilities of advanced Large Language Models (LLMs), particularly the o1 model, in the context of literary analysis. The outputs of these models are compared directly to those produced by graduate-level human participants. By focusing on two short stories by Nobel laureates in Literature, 'Nine Chapters' by Han Kang (the 2024 laureate) and 'Friendship' by Jon Fosse (the 2023 laureate), the research explores the extent to which AI can engage with complex literary elements such as thematic analysis, intertextuality, cultural and historical contexts, linguistic and structural innovations, and character development. Given the Nobel Prize's prestige and its emphasis on cultural, historical, and linguistic richness, applying LLMs to these works offers a deeper understanding of both human and AI approaches to interpretation. The study uses qualitative and quantitative evaluations of coherence, creativity, and fidelity to the text, revealing the strengths and limitations of AI in tasks typically reserved for human expertise. While LLMs demonstrate strong analytical capabilities, particularly in structured tasks, they often fall short in emotional nuance and coherence, areas where human interpretation excels. This research underscores the potential for human-AI collaboration in the humanities, opening new opportunities in literary studies and beyond.
- North America > United States > Georgia > Clarke County > Athens (0.14)
- Asia > China > Shaanxi Province > Xi'an (0.05)
- Asia > Taiwan (0.04)
- (8 more...)
- Research Report (1.00)
- Personal > Honors > Award (0.34)
Measuring Psychological Depth in Language Models
Fabrice Harel-Canada, Hanyu Zhou, Sreya Mupalla, Zeynep Yildiz, Amit Sahai, Nanyun Peng
Evaluations of creative stories generated by large language models (LLMs) often focus on objective properties of the text, such as its style, coherence, and toxicity. While these metrics are indispensable, they do not speak to a story's subjective, psychological impact from a reader's perspective. We introduce the Psychological Depth Scale (PDS), a novel framework rooted in literary theory that measures an LLM's ability to produce authentic and narratively complex stories that provoke emotion, empathy, and engagement. We empirically validate our framework by showing that humans can consistently evaluate stories based on the PDS (0.72 Krippendorff's alpha). We also explore techniques for automating the PDS to easily scale future analyses. GPT-4o, combined with a novel Mixture-of-Personas (MoP) prompting strategy, achieves an average Spearman correlation of 0.51 with human judgment, while Llama-3-70B scores as high as 0.68 for empathy. Finally, we compare the depth of stories authored by humans and by LLMs. Surprisingly, GPT-4 stories either surpassed highly-rated human-written stories sourced from Reddit or were statistically indistinguishable from them. By shifting the focus from text to reader, the Psychological Depth Scale offers a validated, automated, and systematic means of measuring the capacity of LLMs to connect with humans through the stories they tell.
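The abstract's validation numbers rest on two standard statistics: Krippendorff's alpha for inter-rater agreement and Spearman's rank correlation between automated and human scores. As an illustrative sketch only (the ratings, the 1-5 scale, and the rater and story counts below are invented, not the authors' data or code), here is how those two quantities are commonly computed in Python with the scipy and krippendorff packages:

```python
# Illustrative computation of the two statistics cited in the abstract:
# inter-rater agreement (Krippendorff's alpha) and automated-vs-human
# correlation (Spearman's rho). All ratings below are hypothetical.
import numpy as np
from scipy.stats import spearmanr
import krippendorff  # pip install krippendorff

# Hypothetical 1-5 "empathy" ratings: 3 human raters x 8 stories.
human_ratings = np.array([
    [4, 2, 5, 3, 4, 1, 5, 3],
    [4, 3, 5, 2, 4, 2, 4, 3],
    [5, 2, 4, 3, 3, 1, 5, 4],
], dtype=float)

# Agreement among the human raters, treating the scale as ordinal.
alpha = krippendorff.alpha(reliability_data=human_ratings,
                           level_of_measurement="ordinal")
print(f"Krippendorff's alpha: {alpha:.2f}")

# Hypothetical scores from an LLM judge for the same 8 stories.
llm_scores = np.array([4.2, 2.1, 4.8, 2.6, 3.9, 1.4, 4.6, 3.2])

# Correlation between the LLM judge and the mean human rating.
rho, p_value = spearmanr(llm_scores, human_ratings.mean(axis=0))
print(f"Spearman rho: {rho:.2f} (p = {p_value:.3f})")
```

In this setup, alpha summarizes how consistently the human raters score the same stories, while rho measures how closely the automated judge tracks their consensus ranking.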
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- North America > United States > New York > New York County > New York City (0.04)
- (11 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Media > News (0.67)
- Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (0.46)