Media
ViSAGe: Video-to-Spatial Audio Generation
Kim, Jaeyeon, Yun, Heeseung, Kim, Gunhee
Spatial audio is essential for enhancing the immersiveness of audio-visual experiences, yet its production typically demands complex recording systems and specialized expertise. In this work, we address the novel problem of generating first-order ambisonics, a widely used spatial audio format, directly from silent videos. To support this task, we introduce YT-Ambigen, a dataset comprising 102K 5-second YouTube video clips paired with corresponding first-order ambisonics. We also propose new evaluation metrics to assess the spatial aspect of generated audio based on audio energy maps and saliency metrics. Furthermore, we present Video-to-Spatial Audio Generation (ViSAGe), an end-to-end framework that generates first-order ambisonics from silent video frames by leveraging CLIP visual features and autoregressive neural audio codec modeling with both directional and visual guidance. Experimental results demonstrate that ViSAGe produces plausible and coherent first-order ambisonics, outperforming two-stage approaches consisting of video-to-audio generation and audio spatialization. Qualitative examples further illustrate that ViSAGe generates temporally aligned, high-quality spatial audio that adapts to viewpoint changes.
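For readers unfamiliar with the format, the following is a minimal sketch of how a mono sample at a known direction maps onto the four first-order ambisonics channels. It assumes the common AmbiX convention (ACN channel order W, Y, Z, X with SN3D normalization, azimuth measured counterclockwise so that positive values point left); the abstract does not state which convention ViSAGe or YT-Ambigen uses, and the function name `encode_foa` is hypothetical.

```python
import math

def encode_foa(sample, azimuth_deg, elevation_deg):
    """Encode one mono sample into first-order channels (W, Y, Z, X).

    Assumes AmbiX (ACN order, SN3D normalization); this is an illustration
    of the format, not the method described in the abstract.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = sample                                 # omnidirectional component
    y = sample * math.sin(az) * math.cos(el)   # left-right axis
    z = sample * math.sin(el)                  # up-down axis
    x = sample * math.cos(az) * math.cos(el)   # front-back axis
    return (w, y, z, x)

front = encode_foa(1.0, 0.0, 0.0)   # source straight ahead
left = encode_foa(1.0, 90.0, 0.0)   # source directly to the left
```

A source straight ahead puts its directional energy in X, while a source to the left puts it in Y; W carries the signal regardless of direction.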
The Amazon Nova Family of Models: Technical Report and Model Card
AGI, Amazon, Langford, Aaron, Shah, Aayush, Gupta, Abhanshu, Bhatter, Abhimanyu, Goyal, Abhinav, Mathur, Abhinav, Mohanty, Abhinav, Kumar, Abhishek, Sethi, Abhishek, Komma, Abi, Pena, Abner, Jain, Achin, Kunysz, Adam, Opyrchal, Adam, Singh, Adarsh, Rawal, Aditya, Prasad, Adok Achar Budihal, de Gispert, Adrià, Kumar, Agnika, Aryamane, Aishwarya, Nair, Ajay, M, Akilan, Iyengar, Akshaya, Shanbhogue, Akshaya Vishnu Kudlu, He, Alan, Cervone, Alessandra, Loeb, Alex, Zhang, Alex, Fu, Alexander, Lisnichenko, Alexander, Zhipa, Alexander, Potamianos, Alexandros, Kebarighotbi, Ali, Daronkolaei, Aliakbar, Parmesh, Alok, Samra, Amanjot Kaur, Khan, Ameen, Rez, Amer, Saffari, Amir, Agarwalla, Amit, Jhindal, Amit, Mamidala, Amith, Asmro, Ammar, Ballakur, Amulya, Mishra, Anand, Sridharan, Anand, Dubinina, Anastasiia, Lenz, Andre, Doerr, Andreas, Keating, Andrew, Leaver, Andrew, Smith, Andrew, Wirth, Andrew, Davey, Andy, Rosenbaum, Andy, Sohn, Andy, Chan, Angela, Chakrabarti, Aniket, Ramakrishna, Anil, Roy, Anirban, Iyer, Anita, Narayan-Chen, Anjali, Yennu, Ankith, Dabrowska, Anna, Gawlowska, Anna, Rumshisky, Anna, Turek, Anna, Deoras, Anoop, Bezruchkin, Anton, Prasad, Anup, Dewan, Anupam, Kiran, Anwith, Gupta, Apoorv, Galstyan, Aram, Manoharan, Aravind, Biswas, Arijit, Mandal, Arindam, Gupta, Arpit, Pathan, Arsamkhan, Nagarajan, Arun, Rajasekaram, Arushan, Sundararajan, Arvind, Ganesan, Ashwin, Swaminathan, Ashwin, Mouchtaris, Athanasios, Champeau, Audrey, Ray, Avik, Jaiswal, Ayush, Sharma, Ayush, Keefer, Bailey, Muthiah, Balamurugan, Leon-Millan, Beatriz, Koopman, Ben, Li, Ben, Biggs, Benjamin, Ott, Benjamin, Vinzamuri, Bhanu, Venkatesh, Bharath, Ganesh, Bhavana, Vasani, Bhoomit, Byrne, Bill, Hsu, Bill, Wang, Bincheng, King, Blake, Gorny, Blazej, Feng, Bo, Zheng, Bo, Paul, Bodhisattwa, Sun, Bofan, Luo, Bofeng, Chen, Bowen, Xie, Bowen, Yu, Boya, Jugan, Brendan, Panosh, Brett, Collins, Brian, Thompson, Brian, Karakus, Can, Liu, Can, Lambrecht, Carl, Lin, Carly, Wang, Carolyn, Yuan, 
Carrie, Loyda, Casey, Walczak, Cezary, Choppa, Chalapathi, Prakash, Chandana Satya, Meas, Chankrisna Richy, Peris, Charith, Recaido, Charles, Xu, Charlie, Sharma, Charul, Kernan, Chase, Thanapirom, Chayut, Su, Chengwei, Xu, Chenhao, Yin, Chenhao, Ye, Chentao, Tao, Chenyang, Parameshwara, Chethan, Chang, Ching-Yun, Li, Chong, Hench, Chris, Tran, Chris, Dupuy, Christophe, Davis, Christopher, DiPersio, Christopher, Christodoulopoulos, Christos, Li, Christy, Chen, Chun, Bovi, Claudio Delli, Chung, Clement, Hawkins, Cole, Harris, Connor, Ropell, Corey, He, Cynthia, Joo, DK, Hwang, Dae Yon, Rosen, Dan, Elkind, Daniel, Pressel, Daniel, Zhang, Daniel, Kimball, Danielle, Sorokin, Daniil, Goodell, Dave, Modolo, Davide, Zhu, Dawei, Suresh, Deepikaa, Ragha, Deepti, Filimonov, Denis, Kune, Denis Foo, Rodriguez, Denis Romasanta, Hazarika, Devamanyu, Ram, Dhananjay, Parkar, Dhawal, Patel, Dhawal, Desai, Dhwanil, Rajput, Dinesh Singh, Sule, Disha, Singh, Diwakar, Genzel, Dmitriy, Goldenberg, Dolly, He, Dongyi, Hanciu, Dumitru, Tharmal, Dushan, Siankovich, Dzmitry, Cikovic, Edi, Abraham, Edwin, Sabir, Ekraam, Olson, Elliott, Steven, Emmett, Barut, Emre, Jackson, Eric, Wu, Ethan, Chen, Evelyn, Mahalingam, Ezhilan, Triefenbach, Fabian, Yang, Fan, Liu, Fangyu, Wu, Fanzi, Tavakoli, Faraz, Khozeimeh, Farhad, Niu, Feiyang, Hieber, Felix, Li, Feng, Elbey, Firat, Krebs, Florian, Saupe, Florian, Sprünken, Florian, Fan, Frank, Khan, Furqan, De Vincenzo, Gabriela, Kang, Gagandeep, Ding, George, He, George, Yeung, George, Qaddoumi, Ghada, Karamanolakis, Giannis, Huybrechts, Goeric, Maddali, Gokul, Iglesias, Gonzalo, McShane, Gordon, Sahin, Gozde, Huang, Guangtai, Kwon, Gukyeong, Sigurdsson, Gunnar A., Chadha, Gurpreet, Kosuru, Gururaj, Fuerstenau, Hagen, Hah, Hah, Maideen, Haja, Hosokawa, Hajime, Liu, Han, Hsu, Han-Kai, Wang, Hann, Li, Hao, Yang, Hao, Zhu, Haofeng, Fan, Haozheng, Singh, Harman, Kaluvala, Harshavardhan, Saeed, Hashim, Xie, He, Feng, Helian, Luo, Hendrix, Pei, Hengzhi, Nielsen, 
Henrik, Ilati, Hesam, Patel, Himanshu, Li, Hongshan, Lin, Hongzhou, Raza, Hussain, Cullinan, Ian, Kiss, Imre, Thangamani, Inbarasan, Fadnavis, Indrayani, Sorodoc, Ionut Teodor, Ertuerk, Irem, Yemialyanava, Iryna, Soni, Ishan, Jelal, Ismail, Tse, Ivan, FitzGerald, Jack, Zhao, Jack, Rothgeb, Jackson, Lee, Jacky, Jung, Jake, Debski, Jakub, Tomczak, Jakub, Jeun, James, Sanders, James, Crowley, Jason, Lee, Jay, Paidy, Jayakrishna Anvesh, Tiwari, Jayant, Farmer, Jean, Solinsky, Jeff, Lau, Jenna, Savareese, Jeremy, Zagorski, Jerzy, Dai, Ji, Jiacheng, null, Gu, null, Li, Jiahui, Jian, null, Zheng, null, Lu, Jianhua, Wang, Jianhua, Dai, Jiawei, Mo, Jiawei, Xu, Jiaxi, Liang, Jie, Yang, Jie, Logan, Jim, Majmudar, Jimit, Liu, Jing, Miao, Jinghong, Yi, Jingru, Jin, Jingyang, Kao, Jiun-Yu, Wang, Jixuan, Wang, Jiyang, Pemberton, Joe, Carlson, Joel, Blundell, Joey, Chin-Jew, John, He, John, Ho, Jonathan, Hueser, Jonathan, Lunt, Jonathan, Lee, Jooyoung, Tan, Joshua, Chatterjee, Joyjit, Gaspers, Judith, Wang, Jue, Fang, Jun, Tang, Jun, Wan, Jun, Wu, Jun, Wang, Junlei, Shi, Junyi, Chiu, Justin, Satriano, Justin, Yee, Justin, Dhamala, Jwala, Bansal, Jyoti, Zhen, Kai, Chang, Kai-Wei, Lin, Kaixiang, Raman, Kalyan, Sathyendra, Kanthashree Mysore, Moroe, Karabo, Bhandarkar, Karan, Kothari, Karan, Owczarzak, Karolina, Gopalswamy, Karthick, Ravi, Karthick, Ramakrishnan, Karthik, Arumugam, Karthika, Mehta, Kartik, Konczalska, Katarzyna, Ravikumar, Kavya, Tran, Ke, Qin, Kechen, Li, Kelin, Li, Kelvin, Kulkarni, Ketan, Rodrigues, Kevin Angelo, Patel, Keyur, Abboud, Khadige, Hajebi, Kiana, Reiter, Klaus, Schultz, Kris, Anisetty, Krishna, Kotnana, Krishna, Li, Kristen, Channamallikarjuna, Kruthi, Jakubczyk, Krzysztof, Pierewoj, Kuba, Pal, Kunal, Srivastav, Kunwar, Bannerman, Kyle, Poddar, Lahari, Prasad, Lakshmi, Tseng, Larry, Naik, Laxmikant, Vankadara, Leena Chennuru, Minorics, Lenon, Liu, Leo, Lausen, Leonard, Ribeiro, Leonardo F. 
R., Zhang, Li, Gehorsam, Lili, Qi, Ling, Bauer, Lisa, Knapp, Lori, Zeng, Lu, Tong, Lucas, Wong, Lulu, Chen, Luoxin, Rudnicki, Maciej, Namazifar, Mahdi, Jaliminche, Mahesh, Tanke, Maira Ladeira, Gupta, Manasi, Ahlawat, Mandeep, Khanuja, Mani, Sundaram, Mani, Leyk, Marcin, Momotko, Mariusz, Boese, Markus, Dreyer, Markus, Mueller, Markus, Fu, Mason, Górski, Mateusz, Mastalerczyk, Mateusz, Mora, Matias, Johnson, Matt, Scott, Matt, Wen, Matthew, Barysau, Max, Boumerdassi, Maya, Krishnan, Maya, Gupta, Mayank, Hirani, Mayank, Kulkarni, Mayank, Narayanasamy, Meganathan, Bradford, Melanie, Gens, Melanie, Burke, Melissa, Jin, Meng, Chen, Miao, Denkowski, Michael, Heymel, Michael, Krestyaninov, Michael, Obirek, Michal, Wichorowska, Michalina, Miotk, Michał, Watroba, Milosz, Hong, Mingyi, Yu, Mingzhi, Liu, Miranda, Gouda, Mohamed, El-Shabani, Mohammad, Ghavamzadeh, Mohammad, Bansal, Mohit, Ziyadi, Morteza, Xia, Nan, Susanj, Nathan, Bhasin, Nav, Goswami, Neha, Belgamwar, Nehal, Anastassacos, Nicolas, Bergeron, Nicolas, Jain, Nidhi, Jain, Nihal, Chopparapu, Niharika, Xu, Nik, Strom, Nikko, Malandrakis, Nikolaos, Mishra, Nimisha, Parkhi, Ninad, Mehrabi, Ninareh, Sant, Nishita, Gupta, Nishtha, Sekhar, Nitesh, Rajeev, Nithin, Chidambaram, Nithish Raja, Dhar, Nitish, Bhagwagar, Noor, Konforty, Noy, Babu, Omar, Razavi, Omid, Majumder, Orchid, Dar, Osama, Hsu, Oscar, Kvitca, Pablo, Pandey, Pallavi, Seegmiller, Parker, Lange, Patrick, Ferraro, Paul, Motwani, Payal, Kharazmi, Pegah, Wang, Pei, Liu, Pengfei, Bradtke, Peter, Götz, Peter, Zhou, Peter, Wang, Pichao, Poskart, Piotr, Sonawane, Pooja, Natarajan, Pradeep, Ramadorai, Pradyun, Shah, Pralam, Nirantar, Prasad, Chavali, Prasanthi, Wanigasekara, Prashan, Saraf, Prashant, Dey, Prashun, Pant, Pratyush, Pradhan, Prerak, Patel, Preyaa, Dadlani, Priyanka, Sadha, Prudhvee Narasimha, Dong, Qi, Hu, Qian, Qiaozi, null, Gao, null, Liu, Qing, Lam, Quinn, Do, Quynh, Manmatha, R., Willis, Rachel, Liu, Rafael, Ellert, Rafal, Kalinski, Rafal, 
Attrach, Rafi Al, Prasad, Ragha, Prasad, Ragini, Kunani, Raguvir, Gupta, Rahul, Sharma, Rahul, Tewari, Rahul, Baskaran, Rajaganesh, Singh, Rajan, Gupta, Rajiv, Reddy, Rajiv, Das, Rajshekhar, Chada, Rakesh, Mahesh, Rakesh Vaideeswaran, Chandrasekaran, Ram, Nallapati, Ramesh, Xue, Ran, Gangadharaiah, Rashmi, Rachakonda, Ravi, Zhang, Renxian, Blloshmi, Rexhina, Agrawal, Rishabh, Enyedi, Robert, Lowe, Robert, Shrestha, Robik, Piramuthu, Robinson, Asad, Rohail, Khanna, Rohan, Mukherjee, Rohan, Mittal, Rohit, Prasad, Rohit, Kumar, Rohith Mysore Vijaya, Diamant, Ron, Gupta, Ruchita, Li, Ruiwen, Li, Ruoying, Fegade, Rushabh, Zhang, Ruxu, Arbow, Ryan, Chen, Ryan, Gabbard, Ryan, Hoium, Ryan, King, Ryan, Iyer, Sabarishkumar, Malick, Sachal, Movaghati, Sahar, Balakavi, Sai, Jakka, Sai, Paruvelli, Sai Kashyap, Jayanthi, Sai Muralidhar, Mujumdar, Saicharan Shriram, Kapoor, Sainyam, Beygi, Sajjad, Dingliwal, Saket, Soltan, Saleh, Ricklin, Sam, Tucker, Sam, Sinha, Sameer, Choudhary, Samridhi, Tan, Samson, Broscheit, Samuel, Schulter, Samuel, Agarwal, Sanchit, Atluri, Sandeep, Valstar, Sander, Shankar, Sanjana, Sanyukta, Sanyukta, Khanna, Sarthak, Khetrapal, Sarvpriye, Janakiraman, Satish, Shah, Saumil, Akolkar, Saurabh, Giri, Saurabh, Khandelwal, Saurabh, Pawar, Saurabh, Sahu, Saurabh, Huang, Sean, Ra, Sejun, Gopal, Senthilkumar, Dobroshinsky, Sergei, Saba, Shadi, Roy, Shamik, Lal, Shamit, Ananthakrishnan, Shankar, Li, Sharon, Srijan, Shashwat, Bhide, Shekhar, Tang, Sheng Long, Zha, Sheng, Oraby, Shereen, Mostafa, Sherif, Li, Shiqi, Bharathi, Shishir, Prakash, Shivam, Huang, Shiyuan, Yembarwar, Shreya, Pansare, Shreyas, Subramanian, Shreyas, Joshi, Shrijeet, Liu, Shuai, Tang, Shuai, Chandak, Shubham, Garg, Shubham, Katiyar, Shubham, Mehta, Shubham, Srivastav, Shubham, Yang, Shuo, S, Siddalingesha D, Choudhary, Siddharth, Senger, Siddharth Singh, Babb, Simon, Moeini, Sina, Deng, Siqi, Loganathan, Siva, Domagala, Slawomir, Narkar, Sneha, Wadhwa, Sneha, Zhang, Songyang, Jiang, 
Songyao, Trenous, Sony, Sarkar, Soumajyoti, Saha, Soumya, Reddy, Sourabh, Dokania, Sourav, Sandiri, Spurthideepika, Matsoukas, Spyros, Bodapati, Sravan, Wdaru, Sri Harsha Reddy, Venkateshdatta, Sridevi Yagati, Ronanki, Srikanth, Veeravanallur, Srinivasan R, Venkatapathy, Sriram, Sankaraguru, Sriramprabhu, Gorantla, Sruthi, Karuturi, Sruthi, Schroedl, Stefan, Rongali, Subendhu, Kundu, Subhasis, Shakiah, Suhaila, Tiwari, Sukriti, Bharti, Sumit, Sami, Sumita, Mathew, Sumith, Yu, Sunny, Kim, Sunwoo, Malode, Suraj Bajirao, Riel, Susana Cumplido, Palod, Swapnil, Roy, Swastik, Furqhan, Syed, Chung, Tagyoung, Yoshitani, Takuma, Yang, Taojiannan, Chillakura, Tejaswi, Bajwa, Tejwant, Lajumoke, Temi, Tran, Thanh, Gueudre, Thomas, Jung, Thomas, Li, Tianhui, Seemman, Tim, Leffel, Timothy, Xiang, Tingting, Patel, Tirth, Domhan, Tobias, Falke, Tobias, Guo, Toby, Li, Tom, Horszczaruk, Tomasz, Jedynak, Tomasz, Kulkarni, Tushar, Marin, Tyst, Metrycki, Tytus, Wang, Tzu-Yen, Jain, Umang, Singh, Upendra, Chirimar, Utkarsh, Gupta, Vaibhav, Shah, Vanshil, Deshpande, Varad, Gunjal, Varad, Srikeshava, Varsha, Vivek, Varsha, Bharadwaj, Varun, Gangal, Varun, Kumar, Varun, Elango, Venkatesh, Ordonez, Vicente, Soto, Victor, Radhakrishnan, Vignesh, Patel, Vihang, Singh, Vikram, Kolanuvada, Vinay Varma, Kumar, Vinayshekhar Bannihatti, Auvray, Vincent, Cartillier, Vincent, Ponzo, Vincent, Peng, Violet, Khandelwal, Vishal, Naik, Vishal, Sahasrabudhe, Vishvesh, Korolev, Vitaliy, Gokuladas, Vivek, Madan, Vivek, Subramanian, Vivek, Cevher, Volkan, Gupta, Vrinda, Hamza, Wael, Zhang, Wei, Ruan, Weitong, Cheng, Weiwei, Zhang, Wen, Zhao, Wenbo, Yao, Wenyan, Ouyang, Wenzhuo, Dashner, Wesley, Campbell, William, Lin, William, Martin, Willian, Pearson, Wyatt, Jiang, Xiang, Lu, Xiangxing, Shi, Xiangyang, Peng, Xianwen, Gao, Xiaofeng, Jiang, Xiaoge, Fei, Xiaohan, Wang, Xiaohui, Zhou, Xiaozhou Joey, Feng, Xin, Zhao, Xinyan, Wang, Xinyao, Li, Xinyu, Zhang, Xu, Wang, Xuan, Fu, Xuandi, Yuan, Xueling, Wang, Xuning, 
Rao, Yadunandana, Tavizon, Yair, Rossiytsev, Yan, Chen, Yanbei, Liu, Yang, Zou, Yang, Park, Yangsook, Versley, Yannick, Zhang, Yanyan, Patel, Yash, Lu, Yen-Cheng, Pan, Yi, Yi-Hsiang, null, Lai, null, Hu, Yichen, Wang, Yida, Zhou, Yiheng, Xiang, Yilin, Shi, Ying, Wang, Ying, Galatzer, Yishai, Wang, Yongxin, Shen, Yorick, Sun, Yuchen, Purwatama, Yudi, Yue, null, Wu, null, Gu, Yue, Wang, Yuechun, Zeng, Yujun, Chen, Yuncong, Zhou, Yunke, Xie, Yusheng, Guy, Yvon, Ambrozinski, Zbigniew, Cai, Zhaowei, Zhang, Zhen, Wang, Zheng, Jin, Zhenghui, Zhao, Zhewei, Li, Zhiheng, Luo, Zhiheng, Zhang, Zhikang, Fang, Zhilin, Bu, Zhiqi, Wang, Zhiyuan, Li, Zhizhong, Wang, Zijian, Zimeng, null, Qiu, null, Li, Zishi
We present Amazon Nova, a new generation of state-of-the-art foundation models that deliver frontier intelligence and industry-leading price performance. Amazon Nova Pro is a highly capable multimodal model with the best combination of accuracy, speed, and cost for a wide range of tasks. Amazon Nova Lite is a low-cost multimodal model that is lightning fast for processing images, video, documents, and text. Amazon Nova Micro is a text-only model that delivers our lowest-latency responses at very low cost. Amazon Nova Canvas is an image generation model that creates professional-grade images with rich customization controls. Amazon Nova Reel is a video generation model offering high-quality outputs, customization, and motion control. Our models were built responsibly and with a commitment to customer trust, security, and reliability. We report benchmarking results for core capabilities, agentic performance, long context, functional adaptation, runtime performance, and human evaluation.
TuneGenie: Reasoning-based LLM agents for preferential music generation
Pandey, Amitesh, Arifdjanov, Jafarbek, Tiwari, Ansh
Recently, large language models (LLMs) have shown great promise across a diversity of tasks, ranging from generating images to reasoning spatially. Considering their remarkable (and growing) textual reasoning capabilities, we investigate LLMs' potency in analyzing an individual's music preferences (based on playlist metadata, personal write-ups, etc.) and producing effective prompts (based on these analyses) to be passed to Suno AI (a generative AI tool for music production). Our proposed LLM-based model for turning textual representations into music (which we call TuneGenie) and the various methods we develop to evaluate and benchmark similar models add to the increasing (and increasingly controversial) corpus of research on the use of AI in generating art.
Artificial Intelligence and Civil Discourse: How LLMs Moderate Climate Change Conversations
As Large Language Models (LLMs) become increasingly integrated into online platforms and digital communication spaces, their potential to influence public discourse, particularly in contentious domains like climate change, demands systematic investigation. This study examines how LLMs naturally moderate climate change conversations through their distinct communicative behaviors, offering insights into their role as facilitators of civil discourse. We conducted a comparative analysis of conversational patterns between LLMs and human participants in climate change discussions across social media platforms. Our investigation employed five state-of-the-art models: three open-source LLMs (Gemma, Llama 3, and Llama 3.3) and two commercial systems (GPT-4o by OpenAI and Claude 3.5 by Anthropic). Through sentiment analysis, we assessed the emotional characteristics and discourse patterns exhibited by both LLMs and human users. Our findings reveal two key mechanisms through which LLMs moderate climate change conversations. First, LLMs consistently demonstrate emotional neutrality, with their responses significantly dominated by neutral sentiment compared to human participants, who exhibit more polarized emotional expressions. Second, LLMs maintain notably lower emotional intensity across all interaction contexts, creating a stabilizing effect on conversational dynamics. These results suggest that LLMs possess inherent moderating capabilities that could enhance the quality of public discourse on controversial topics. By maintaining emotional equilibrium and reducing inflammatory rhetoric, LLMs may serve as valuable tools for fostering more constructive and civil climate change conversations online. This research contributes to our understanding of AI's potential role in improving digital discourse and offers implications for the design of AI-mediated communication platforms.
Do Music Preferences Reflect Cultural Values? A Cross-National Analysis Using Music Embedding and World Values Survey
This study explores the extent to which national music preferences reflect underlying cultural values. We collected long-term popular music data from YouTube Music Charts across 62 countries, encompassing both Western and non-Western regions, and extracted audio embeddings using the CLAP model. To complement these quantitative representations, we generated semantic captions for each track using LP-MusicCaps and GPT-based summarization. Countries were clustered based on contrastive embeddings that highlight deviations from global musical norms. The resulting clusters were projected into a two-dimensional space via t-SNE for visualization and evaluated against cultural zones defined by the World Values Survey (WVS). Statistical analyses, including MANOVA and chi-squared tests, confirmed that music-based clusters exhibit significant alignment with established cultural groupings. Furthermore, residual analysis revealed consistent patterns of overrepresentation, suggesting non-random associations between specific clusters and cultural zones. These findings indicate that national-level music preferences encode meaningful cultural signals and can serve as a proxy for understanding global cultural boundaries.
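As a rough illustration of two steps named in this abstract, the sketch below computes a "contrastive" embedding per country and a Pearson chi-squared statistic for a cluster-by-zone contingency table. It assumes that a contrastive embedding is the country's mean track embedding minus the global mean over all tracks; the abstract does not give the exact definition, so this is one plausible reading. The function names and the toy two-dimensional vectors are hypothetical stand-ins for CLAP embeddings.

```python
def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def contrastive_embeddings(tracks_by_country):
    """Country centroid minus the global centroid (assumed definition)."""
    all_tracks = [v for tracks in tracks_by_country.values() for v in tracks]
    global_mean = mean_vector(all_tracks)
    return {
        country: [c - g for c, g in zip(mean_vector(tracks), global_mean)]
        for country, tracks in tracks_by_country.items()
    }

def chi_squared(table):
    """Pearson chi-squared statistic for a clusters-by-zones count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat

# Toy data: two countries with two tracks each (not real embeddings).
toy = {"A": [[1.0, 0.0], [1.0, 0.0]], "B": [[0.0, 1.0], [0.0, 1.0]]}
deviations = contrastive_embeddings(toy)
stat = chi_squared([[10, 0], [0, 10]])
```

Subtracting the global mean removes what all national charts share, so clustering operates on what distinguishes each country. The residual analysis mentioned in the abstract would then inspect the per-cell differences between observed and expected counts.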
SPOT: Bridging Natural Language and Geospatial Search for Investigative Journalists
Khellaf, Lynn, Schlicht, Ipek Baris, Mirass, Tilman, Bayer, Julia, Wagner, Tilman, Bouwmeester, Ruben
OpenStreetMap (OSM) is a vital resource for investigative journalists doing geolocation verification. However, existing tools to query OSM data such as Overpass Turbo require familiarity with complex query languages, creating barriers for non-technical users. We present SPOT, an open source natural language interface that makes OSM's rich, tag-based geographic data more accessible through intuitive scene descriptions. SPOT interprets user inputs as structured representations of geospatial object configurations using fine-tuned Large Language Models (LLMs), with results being displayed in an interactive map interface. While more general geospatial search tasks are conceivable, SPOT is specifically designed for use in investigative journalism, addressing real-world challenges such as hallucinations in model output, inconsistencies in OSM tagging, and the noisy nature of user input. It combines a novel synthetic data pipeline with a semantic bundling system to enable robust, accurate query generation. To our knowledge, SPOT is the first system to achieve reliable natural language access to OSM data at this level of accuracy. By lowering the technical barrier to geolocation verification, SPOT contributes a practical tool to the broader efforts to support fact-checking and combat disinformation.
Multi-document Summarization through Multi-document Event Relation Graph Reasoning in LLMs: a case study in Framing Bias Mitigation
Media outlets are becoming more partisan and polarized. Most previous work has focused on detecting media bias. In this paper, we aim to mitigate media bias by generating a neutralized summary given multiple articles presenting different ideological views. Motivated by the critical role of events and event relations in media bias detection, we propose to increase awareness of bias in LLMs via multi-document event reasoning and use a multi-document event relation graph to guide the summarization process. This graph contains rich event information useful for revealing bias: four common types of in-doc event relations to reflect content framing bias, cross-doc event coreference relations to reveal content selection bias, and event-level moral opinions to highlight opinionated framing bias. We further develop two strategies to incorporate the multi-document event relation graph for neutralized summarization. First, we convert the graph into natural language descriptions and feed the textualized graph into LLMs as part of a hard text prompt. Second, we encode the graph with a graph attention network and insert the graph embedding into LLMs as a soft prompt. Both automatic and human evaluation confirm that our approach effectively mitigates both lexical and informational media bias while also improving content preservation.
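The first strategy, textualizing the graph for a hard prompt, can be pictured with a minimal sketch like the one below. The relation labels (`causal`, `temporal`, `coreference`), the sentence templates, and the function name `textualize_graph` are all hypothetical; the abstract does not name the four in-doc relation types or the wording the authors use.

```python
def textualize_graph(edges):
    """Render (source_event, relation, target_event) triples as sentences.

    Relation labels and templates here are illustrative assumptions.
    """
    templates = {
        "causal": 'The event "{a}" causes the event "{b}".',
        "temporal": 'The event "{a}" happens before the event "{b}".',
        "coreference": 'The events "{a}" and "{b}" refer to the same event.',
    }
    fallback = 'The event "{a}" is related to the event "{b}".'
    lines = []
    for a, relation, b in edges:
        template = templates.get(relation, fallback)
        lines.append(template.format(a=a, b=b))
    return "\n".join(lines)

edges = [("protest", "causal", "arrests"), ("vote", "coreference", "ballot")]
prompt = "Event relations:\n" + textualize_graph(edges) + "\nWrite a neutral summary."
```

The resulting string would be placed alongside the source articles in the prompt; the soft-prompt strategy instead replaces this text with a learned graph embedding.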
SurgBench: A Unified Large-Scale Benchmark for Surgical Video Analysis
Wei, Jianhui, Xiao, Zikai, Sun, Danyu, Gong, Luqi, Yang, Zongxin, Liu, Zuozhu, Wu, Jian
Surgical video understanding is pivotal for enabling automated intraoperative decision-making, skill assessment, and postoperative quality improvement. However, progress in developing surgical video foundation models (FMs) remains hindered by the scarcity of large-scale, diverse datasets for pretraining and systematic evaluation. In this paper, we introduce SurgBench, a unified surgical video benchmarking framework comprising a pretraining dataset, SurgBench-P, and an evaluation benchmark, SurgBench-E. SurgBench offers extensive coverage of diverse surgical scenarios, with SurgBench-P encompassing 53 million frames across 22 surgical procedures and 11 specialties, and SurgBench-E providing robust evaluation across six categories (phase classification, camera motion, tool recognition, disease diagnosis, action classification, and organ detection) spanning 72 fine-grained tasks. Extensive experiments reveal that existing video FMs struggle to generalize across varied surgical video analysis tasks, whereas pretraining on SurgBench-P yields substantial performance improvements and superior cross-domain generalization to unseen procedures and modalities. Our dataset and code are available upon request.
RealFactBench: A Benchmark for Evaluating Large Language Models in Real-World Fact-Checking
Yang, Shuo, Dai, Yuqin, Wang, Guoqing, Zheng, Xinran, Xu, Jinfeng, Li, Jinze, Ying, Zhenzhe, Wang, Weiqiang, Ngai, Edith C. H.
Large Language Models (LLMs) hold significant potential for advancing fact-checking by leveraging their capabilities in reasoning, evidence retrieval, and explanation generation. However, existing benchmarks fail to comprehensively evaluate LLMs and Multimodal Large Language Models (MLLMs) in realistic misinformation scenarios. To bridge this gap, we introduce RealFactBench, a comprehensive benchmark designed to assess the fact-checking capabilities of LLMs and MLLMs across diverse real-world tasks, including Knowledge Validation, Rumor Detection, and Event Verification. RealFactBench consists of 6K high-quality claims drawn from authoritative sources, encompassing multimodal content and diverse domains. Our evaluation framework further introduces the Unknown Rate (UnR) metric, enabling a more nuanced assessment of models' ability to handle uncertainty and balance between over-conservatism and over-confidence. Extensive experiments on 7 representative LLMs and 4 MLLMs reveal their limitations in real-world fact-checking and offer valuable insights for further research. RealFactBench is publicly available at https://github.com/kalendsyang/RealFactBench.git.
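To make the role of an "unknown" option concrete, here is a minimal sketch of how such a rate might be computed alongside accuracy. It assumes that the Unknown Rate is simply the share of claims on which the model declines to give a verdict; the abstract does not give the formula, so the benchmark's actual definition may differ. The label strings and function names are hypothetical.

```python
def unknown_rate(predictions, unknown_label="unknown"):
    """Share of predictions where the model abstains (assumed definition)."""
    if not predictions:
        return 0.0
    return sum(p == unknown_label for p in predictions) / len(predictions)

def accuracy_on_answered(predictions, gold, unknown_label="unknown"):
    """Accuracy restricted to claims the model actually answered."""
    answered = [(p, g) for p, g in zip(predictions, gold) if p != unknown_label]
    if not answered:
        return 0.0
    return sum(p == g for p, g in answered) / len(answered)

preds = ["true", "unknown", "false", "unknown"]
gold = ["true", "false", "true", "false"]
unr = unknown_rate(preds)
acc = accuracy_on_answered(preds, gold)
```

Reading the two numbers together shows the trade-off the abstract describes: a model that abstains too often is over-conservative, while one that never abstains but is often wrong is over-confident.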
Information Suppression in Large Language Models: Auditing, Quantifying, and Characterizing Censorship in DeepSeek
Qiu, Peiran, Zhou, Siyi, Ferrara, Emilio
This study examines information suppression mechanisms in DeepSeek, an open-source large language model (LLM) developed in China. We propose an auditing framework and use it to analyze the model's responses to 646 politically sensitive prompts by comparing its final output with intermediate chain-of-thought (CoT) reasoning. Our audit unveils evidence of semantic-level information suppression in DeepSeek: sensitive content often appears within the model's internal reasoning but is omitted or rephrased in the final output. Specifically, DeepSeek suppresses references to transparency, government accountability, and civic mobilization, while occasionally amplifying language aligned with state propaganda. This study underscores the need for systematic auditing of alignment, content moderation, information suppression, and censorship practices implemented in widely adopted AI models, to ensure transparency, accountability, and equitable access to unbiased information obtained through these systems.
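The core comparison, reasoning trace versus final output, can be approximated with a simple lexical check, sketched below. This is a deliberate simplification: the abstract describes semantic-level analysis, whereas this sketch only flags watchlist words that appear in the reasoning and are missing from the answer. The watchlist, the example texts, and the function names are hypothetical.

```python
import re

def tokens(text):
    """Lowercased alphabetic word set of a text."""
    return set(re.findall(r"[a-z]+", text.lower()))

def suppressed_terms(reasoning, final_output, watchlist):
    """Watchlist terms present in the reasoning but absent from the answer."""
    in_reasoning = tokens(reasoning) & set(watchlist)
    return sorted(in_reasoning - tokens(final_output))

# Invented example texts, not taken from the study's data.
cot = "The user asks about protests and government accountability."
answer = "I can share general information about public events."
missing = suppressed_terms(cot, answer, {"protests", "accountability", "events"})
```

A lexical check like this would miss rephrasing, which the abstract explicitly lists as a suppression pattern, so a fuller audit would need embedding-based or human comparison of the two texts.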