Media
Weekly quiz: What did Taylor Swift buy back?
This week saw Ukraine mount an audacious drone attack on Russian airfields, Donald Trump ban people in 12 countries from travelling to the US, while Billie Piper returned to Doctor Who. But how much attention did you pay to what else happened in the world? Try last week's quiz, or have a go at something from the archives.
Russia vows to repair planes damaged by Ukraine in massive drone attack, claims they were 'not destroyed'
Russia is vowing Thursday to repair the warplanes damaged by Ukraine in a massive drone attack earlier this week, with an official claiming they were "not destroyed but damaged." The comments from Russian Deputy Foreign Minister Sergey Ryabkov come after Ukraine said its forces destroyed 40 of Russia's most powerful bomber jets and surveillance planes in "Operation Spider's Web," a series of coordinated drone strikes Sunday penetrating deep into Russian territory. "As the defense ministry said, these aircraft were not destroyed but damaged. They will be repaired," Ryabkov was quoted telling Russia's state-run TASS news agency. However, satellite images of Russian airfields show extensive damage to the planes.
Amazon 'testing humanoid robots to deliver packages'
Amazon is reportedly developing software for humanoid robots that could perform the role of delivery workers and "spring out" of its vans. The $2tn (£1.47tn) technology company is building a "humanoid park" in the US to test the robots, said the tech news site the Information, citing a person who had been involved in the project. The Information reported that the robots could eventually take the jobs of delivery workers. It is developing the artificial intelligence software that would power the robots but will use hardware developed by other companies. The indoor obstacle course being used for the tests at an Amazon office in San Francisco is about the size of a coffee shop, the report said, with the company hoping the robots will be able to travel in Amazon's Rivian vans and make deliveries from them.
EdgeVidSum: Real-Time Personalized Video Summarization at the Edge
Mujtaba, Ghulam, Ryu, Eun-Seok
EdgeVidSum is a lightweight method that generates personalized, fast-forward summaries of long-form videos directly on edge devices. The proposed approach enables real-time video summarization while safeguarding user privacy through local data processing using innovative thumbnail-based techniques and efficient neural architectures. Unlike conventional methods that process entire videos frame by frame, the proposed method uses thumbnail containers to significantly reduce computational complexity without sacrificing semantic relevance. The framework employs a hierarchical analysis approach, where a lightweight 2D CNN model identifies user-preferred content from thumbnails and generates timestamps to create fast-forward summaries. Our interactive demo highlights the system's ability to create tailored video summaries for long-form videos, such as movies, sports events, and TV shows, based on individual user preferences. The entire computation occurs seamlessly on resource-constrained devices like Jetson Nano, demonstrating how EdgeVidSum addresses the critical challenges of computational efficiency, personalization, and privacy in modern video consumption environments.
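The step from per-thumbnail predictions to summary timestamps can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each thumbnail covers a fixed interval of video and that a classifier has already produced one preference score per thumbnail. The function name `scores_to_segments` and the parameters `seconds_per_thumbnail` and `threshold` are hypothetical.

```python
from typing import List, Optional, Tuple


def scores_to_segments(
    scores: List[float],
    seconds_per_thumbnail: float = 1.0,
    threshold: float = 0.5,
) -> List[Tuple[float, float]]:
    """Merge runs of consecutive thumbnails scoring at or above the
    threshold into (start_seconds, end_seconds) segments."""
    segments: List[Tuple[float, float]] = []
    start: Optional[int] = None  # index of the first thumbnail in the current run
    for i, score in enumerate(scores):
        if score >= threshold and start is None:
            start = i  # a preferred run begins here
        elif score < threshold and start is not None:
            # The run ended at the previous thumbnail; close the segment.
            segments.append((start * seconds_per_thumbnail, i * seconds_per_thumbnail))
            start = None
    if start is not None:
        # The video ended while a preferred run was still open.
        segments.append((start * seconds_per_thumbnail, len(scores) * seconds_per_thumbnail))
    return segments


if __name__ == "__main__":
    # Hypothetical scores for six thumbnails, each assumed to cover 2 seconds.
    demo_scores = [0.1, 0.8, 0.9, 0.2, 0.7, 0.6]
    print(scores_to_segments(demo_scores, seconds_per_thumbnail=2.0))
    # prints [(2.0, 6.0), (8.0, 12.0)]
```

A player could then jump between the returned segments to produce the fast-forward effect; how EdgeVidSum itself maps thumbnails to timestamps is not specified in the abstract.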
ScoreRAG: A Retrieval-Augmented Generation Framework with Consistency-Relevance Scoring and Structured Summarization for News Generation
This research introduces ScoreRAG, an approach to enhance the quality of automated news generation. Despite advancements in Natural Language Processing and large language models, current news generation methods often struggle with hallucinations, factual inconsistencies, and lack of domain-specific expertise when producing news articles. ScoreRAG addresses these challenges through a multi-stage framework combining retrieval-augmented generation, consistency relevance evaluation, and structured summarization. The system first retrieves relevant news documents from a vector database, maps them to complete news items, and assigns consistency relevance scores based on large language model evaluations. These documents are then reranked according to relevance, with low-quality items filtered out. The framework proceeds to generate graded summaries based on relevance scores, which guide the large language model in producing complete news articles following professional journalistic standards. Through this methodical approach, ScoreRAG aims to significantly improve the accuracy, coherence, informativeness, and professionalism of generated news articles while maintaining stability and consistency throughout the generation process. The code and demo are available at: https://github.com/peiyun2260/ScoreRAG.
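The rerank, filter, and graded-summary stages described above might look roughly like the sketch below. It is an illustration under stated assumptions, not the ScoreRAG code: scores are assumed to be floats in [0, 1] already assigned by a language model, and the names `ScoredDoc`, `rerank_and_filter`, `summary_grade`, and all threshold values are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ScoredDoc:
    doc_id: str
    text: str
    score: float  # assumed: consistency-relevance score in [0, 1] from an LLM evaluation


def rerank_and_filter(docs: List[ScoredDoc], min_score: float = 0.3) -> List[ScoredDoc]:
    """Drop documents below min_score, then sort the rest by descending score."""
    kept = [d for d in docs if d.score >= min_score]
    return sorted(kept, key=lambda d: d.score, reverse=True)


def summary_grade(score: float, high: float = 0.8, medium: float = 0.5) -> str:
    """Map a score to a summary depth; the three levels are illustrative only."""
    if score >= high:
        return "detailed"
    if score >= medium:
        return "brief"
    return "one-line"


def plan_summaries(docs: List[ScoredDoc], min_score: float = 0.3) -> Dict[str, str]:
    """Return {doc_id: summary grade} for the surviving documents, best first."""
    return {d.doc_id: summary_grade(d.score) for d in rerank_and_filter(docs, min_score)}


if __name__ == "__main__":
    docs = [
        ScoredDoc("a", "highly relevant article", 0.9),
        ScoredDoc("b", "off-topic article", 0.2),
        ScoredDoc("c", "partly relevant article", 0.6),
        ScoredDoc("d", "marginal article", 0.35),
    ]
    print(plan_summaries(docs))
    # prints {'a': 'detailed', 'c': 'brief', 'd': 'one-line'}
```

The resulting plan would then be passed, together with the document texts, to the generation step; the prompts and grading scale that ScoreRAG actually uses are in the linked repository rather than in this abstract.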
Enhancing Automatic PT Tagging for MEDLINE Citations Using Transformer-Based Models
This study addresses limitations in the current automated indexing process, which relies on legacy NLP algorithms. We evaluated monolithic multi-label classifiers and binary classifier ensembles to enhance the retrieval of biomedical literature. Results demonstrate the potential of Transformer models to significantly improve PT tagging accuracy, paving the way for scalable, efficient biomedical indexing. Keywords: MEDLINE, MeSH Publication Types, Pre-trained Foundation Models, Natural Language Processing, Machine Learning 1. Introduction The MEDLINE indexed subset of the National Library of Medicine's (NLM's) PubMed service is a cornerstone of biomedical knowledge, housing millions of citations from journals worldwide. Its significance lies not only in its vast scope but also in its ability to organize and provide efficient access to this wealth of information.
Object-centric 3D Motion Field for Robot Learning from Human Videos
Yin, Zhao-Heng, Yang, Sherry, Abbeel, Pieter
Learning robot control policies from human videos is a promising direction for scaling up robot learning. However, how to extract action knowledge (or action representations) from videos for policy learning remains a key challenge. Existing action representations such as video frames, pixel flow, and point cloud flow have inherent limitations such as modeling complexity or loss of information. In this paper, we propose to use an object-centric 3D motion field to represent actions for robot learning from human videos, and present a novel framework for extracting this representation from videos for zero-shot control. We introduce two novel components in its implementation. First, a novel training pipeline for training a "denoising" 3D motion field estimator to robustly extract fine object 3D motions from human videos with noisy depth. Second, a dense object-centric 3D motion field prediction architecture that favors both cross-embodiment transfer and policy generalization to background. We evaluate the system in real-world setups. Experiments show that our method reduces 3D motion estimation error by over 50% compared to the latest method, achieves a 55% average success rate in diverse tasks where prior approaches fail (≲ 10%), and can even acquire fine-grained manipulation skills like insertion.
Music Interpretation and Emotion Perception: A Computational and Neurophysiological Investigation
Lyberatos, Vassilis, Kantarelis, Spyridon, Zioga, Ioanna, Anagnostopoulou, Christina, Stamou, Giorgos, Georgaki, Anastasia
These authors contributed equally to this work. ABSTRACT This study investigates emotional expression and perception in music performance using computational and neurophysiological methods. The influence of different performance settings, such as repertoire, diatonic modal etudes, and improvisation, as well as levels of expressiveness, on performers' emotional communication and listeners' reactions is explored. Professional musicians performed various tasks, and emotional annotations were provided by both performers and the audience. Audio analysis revealed that expressive and improvisational performances exhibited unique acoustic features, while emotion analysis showed stronger emotional responses. Neurophysiological measurements indicated greater relaxation in improvisational performances. This multimodal study highlights the significance of expressivity in enhancing emotional communication and audience engagement. 1. INTRODUCTION In recent years, the study of music performance has become a prominent area of research. While traditional analysis of music often relied on the score, modern research highlights the importance of performance-specific features that distinguish one rendition from another.
The Arabic AI Fingerprint: Stylometric Analysis and Detection of Large Language Models Text
Al-Shaibani, Maged S., Ahmed, Moataz
Large Language Models (LLMs) have achieved unprecedented capabilities in generating human-like text, posing subtle yet significant challenges for information integrity across critical domains, including education, social media, and academia, enabling sophisticated misinformation campaigns, compromising healthcare guidance, and facilitating targeted propaganda. This challenge becomes particularly severe in under-explored and low-resource languages like Arabic. This paper presents a comprehensive investigation of Arabic machine-generated text, examining multiple generation strategies (generation from the title only, content-aware generation, and text refinement) across diverse model architectures (ALLaM, Jais, Llama, and GPT-4) in academic and social media domains. Our stylometric analysis reveals distinctive linguistic patterns differentiating human-written from machine-generated Arabic text across these varied contexts. Despite their human-like qualities, we demonstrate that LLMs produce detectable signatures in their Arabic outputs, with domain-specific characteristics that vary significantly between different contexts. Based on these insights, we developed BERT-based detection models that achieved exceptional performance in formal contexts (up to 99.9% F1-score) with strong precision across model architectures. Our cross-domain analysis confirms generalization challenges previously reported in the literature. To the best of our knowledge, this work represents the most comprehensive investigation of Arabic machine-generated text to date, uniquely combining multiple prompt generation methods, diverse model architectures, and in-depth stylometric analysis across varied textual domains, establishing a foundation for developing robust, linguistically-informed detection systems essential for preserving information integrity in Arabic-language contexts.
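To make the notion of a stylometric signal concrete, the sketch below computes three simple surface features of a text. The abstract does not say which features the authors used, so these (average word length, type-token ratio, average sentence length) are assumptions chosen for illustration, and the function name `stylometric_features` is hypothetical.

```python
import re
from typing import Dict


def stylometric_features(text: str) -> Dict[str, float]:
    """Compute a few illustrative surface features of a text.

    Assumption: these features are generic examples, not the feature set
    used in the paper.
    """
    # Split on sentence-ending punctuation, including the Arabic question mark (U+061F).
    sentences = [s for s in re.split(r"[.!?\u061F]+", text) if s.strip()]
    # \w matches Unicode letters and digits, so Arabic words are captured.
    # Combining diacritics are not matched by \w, so diacritized text
    # would likely need normalization before this step.
    words = re.findall(r"\w+", text)
    if not words:
        return {"avg_word_length": 0.0, "type_token_ratio": 0.0, "avg_sentence_length": 0.0}
    return {
        "avg_word_length": sum(len(w) for w in words) / len(words),
        "type_token_ratio": len(set(words)) / len(words),  # vocabulary diversity
        "avg_sentence_length": len(words) / max(len(sentences), 1),  # words per sentence
    }


if __name__ == "__main__":
    print(stylometric_features("the cat sat. the dog ran."))
```

Comparing such feature values between human-written and machine-generated samples is one simple way to look for the kind of distinctive patterns the abstract refers to; the paper's own detectors are BERT-based classifiers rather than hand-built features.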
MemeReaCon: Probing Contextual Meme Understanding in Large Vision-Language Models
Zhao, Zhengyi, Zhang, Shubo, Zhang, Yuxi, Zhao, Yanxi, Zhang, Yifan, Wang, Zezhong, Wang, Huimin, Zhao, Yutian, Liang, Bin, Zheng, Yefeng, Li, Binyang, Wong, Kam-Fai, Wu, Xian
Memes have emerged as a popular form of multimodal online communication, where their interpretation heavily depends on the specific context in which they appear. Current approaches predominantly focus on isolated meme analysis, either for harmful content detection or standalone interpretation, overlooking a fundamental challenge: the same meme can express different intents depending on its conversational context. This oversight creates an evaluation gap: although humans intuitively recognize how context shapes meme interpretation, Large Vision Language Models (LVLMs) can hardly understand context-dependent meme intent. To address this critical limitation, we introduce MemeReaCon, a novel benchmark specifically designed to evaluate how LVLMs understand memes in their original context. We collected memes from five different Reddit communities, keeping each meme's image, the post text, and user comments together. We carefully labeled how the text and meme work together, what the poster intended, how the meme is structured, and how the community responded. Our tests with leading LVLMs show a clear weakness: models either fail to interpret critical information in the contexts, or overly focus on visual details while overlooking communicative purpose. MemeReaCon thus serves both as a diagnostic tool exposing current limitations and as a challenging benchmark to drive development toward LVLMs with more sophisticated context-aware understanding.