Media
Watch: Winning moments from the 77th Emmy Awards
The 77th Primetime Emmy Awards have taken place in Los Angeles on Sunday night, with shows The Studio, The Pit and Adolescence dominating the awards. Owen Cooper became the youngest ever male Emmy winner at 15-years-old, for his breakout role in the Netflix miniseries Adolescence. Seth Rogan's comedy series The Studio scooped up four Emmys, while The Pitt beat out the likes of Severance and The White Lotus to win Best Drama. 'No doubt' Russia will cross Nato border if Ukraine falls, former US VP says Former US Vice-President Mike Pence calls for security guarantees in Ukraine to help deliver "just and lasting peace". The US House Oversight Committee has released new surveillance footage recorded hours before the convicted paedophile's death.
Google Pixel 10 Pro review: one of the very best smaller phones
The Pixel 10 Pro offers the best of Google's hardware without an enormous screen, making it a contender for the top smaller phone. The Pixel 10 Pro offers the best of Google's hardware without an enormous screen, making it a contender for the top smaller phone. Mon 15 Sep 2025 02.00 EDTLast modified on Mon 15 Sep 2025 02.03 EDT The Guardian's journalism is independent. We will earn a commission if you buy something through an affiliate link. The Pixel 10 Pro is Google's best phone that is still a pocketable, easy-to-handle size, taking the excellent Pixel 10 and beefing it up in the camera department.
Sony's 'Demon Slayer' sets anime record in North American box office debut
A statue of young swordsman Tanjiro Kamado, the title character in the Demon Slayer anime franchise, is displayed at a cinema in Tokyo's Ikebukuro district on Sept. 9. | AFP-JIJI Sony Pictures' Demon Slayer: Kimetsu no Yaiba The Movie: Infinity Castle has become the top movie in U.S. and Canadian theaters, setting an opening weekend record for the Japanese anime genre. The film, about a boy who joins an organization that hunts demons after his little sister is turned into one, brought in $70 million, the division of Sony said Sunday. That topped the previous $31 million record set by Pokemon: The First Movie -- Mewtwo Strikes Back in 1999. Box Office Pro had forecast weekend sales of $60 million to $75 million. The opening underscores the surging global appeal of Japanese anime.
Testing chatbots on the creation of encoders for audio conditioned image generation
Leรณn, Jorge E., Carrasco, Miguel
On one hand, recent advances in chatbots has led to a rising popularity in using these models for coding tasks. On the other hand, modern generative image models primarily rely on text encoders to translate semantic concepts into visual representations, even when there is clear evidence that audio can be employed as input as well. Given the previous, in this work, we explore whether state-of-the-art conversational agents can design effective audio encoders to replace the CLIP text encoder from Stable Diffusion 1.5, enabling image synthesis directly from sound. We prompted five publicly available chatbots to propose neural architectures to work as these audio encoders, with a set of well-explained shared conditions. Each valid suggested encoder was trained on over two million context related audio-image-text observations, and evaluated on held-out validation and test sets using various metrics, together with a qualitative analysis of their generated images. Although almost all chatbots generated valid model designs, none achieved satisfactory results, indicating that their audio embeddings failed to align reliably with those of the original text encoder. Among the proposals, the Gemini audio encoder showed the best quantitative metrics, while the Grok audio encoder produced more coherent images (particularly, when paired with the text encoder). Our findings reveal a shared architectural bias across chatbots and underscore the remaining coding gap that needs to be bridged in future versions of these models. We also created a public demo so everyone could study and try out these audio encoders. Finally, we propose research questions that should be tackled in the future, and encourage other researchers to perform more focused and highly specialized tasks like this one, so the respective chatbots cannot make use of well-known solutions and their creativity/reasoning is fully tested.
Personas within Parameters: Fine-Tuning Small Language Models with Low-Rank Adapters to Mimic User Behaviors
Thakur, Himanshu, Agrawal, Eshani, Mukund, Smruthi
A long-standing challenge in developing accurate recommendation models is simulating user behavior, mainly due to the complex and stochastic nature of user interactions. Towards this, one promising line of work has been the use of Large Language Models (LLMs) for simulating user behavior. However, aligning these general-purpose large pre-trained models with user preferences necessitates: (i) effectively and continously parsing large-scale tabular user-item interaction data, (ii) overcoming pre-training-induced inductive biases to accurately learn user specific knowledge, and (iii) achieving the former two at scale for millions of users. While most previous works have focused on complex methods to prompt an LLM or fine-tune it on tabular interaction datasets, our approach shifts the focus to extracting robust textual user representations using a frozen LLM and simulating cost-effective, resource-efficient user agents powered by fine-tuned Small Language Models (SLMs). Further, we showcase a method for training multiple low-rank adapters for groups of users or \textit{persona}, striking an optimal balance between scalability and performance of user behavior agents. Our experiments provide compelling empirical evidence of the efficacy of our methods, demonstrating that user agents developed using our approach have the potential to bridge the gap between offline metrics and real-world performance of recommender systems.
Generative Engine Optimization: How to Dominate AI Search
Chen, Mahe, Wang, Xiaoxuan, Chen, Kaiwen, Koudas, Nick
The rapid adoption of generative AI-powered search engines like ChatGPT, Perplexity, and Gemini is fundamentally reshaping information retrieval, moving from traditional ranked lists to synthesized, citation-backed answers. This shift challenges established Search Engine Optimization (SEO) practices and necessitates a new paradigm, which we term Generative Engine Optimization (GEO). This paper presents a comprehensive comparative analysis of AI Search and traditional web search (Google). Through a series of large-scale, controlled experiments across multiple verticals, languages, and query paraphrases, we quantify critical differences in how these systems source information. Our key findings reveal that AI Search exhibit a systematic and overwhelming bias towards Earned media (third-party, authoritative sources) over Brand-owned and Social content, a stark contrast to Google's more balanced mix. We further demonstrate that AI Search services differ significantly from each other in their domain diversity, freshness, cross-language stability, and sensitivity to phrasing. Based on these empirical results, we formulate a strategic GEO agenda. We provide actionable guidance for practitioners, emphasizing the critical need to: (1) engineer content for machine scannability and justification, (2) dominate earned media to build AI-perceived authority, (3) adopt engine-specific and language-aware strategies, and (4) overcome the inherent "big brand bias" for niche players. Our work provides the foundational empirical analysis and a strategic framework for achieving visibility in the new generative search landscape.
The power of text similarity in identifying AI-LLM paraphrased documents: The case of BBC news articles and ChatGPT
Xylogiannopoulos, Konstantinos, Xanthopoulos, Petros, Karampelas, Panagiotis, Bakamitsos, Georgios
Generative AI paraphrased text can be used for copyright infringement and the AI paraphrased content can deprive substantial revenue from original content creators. Despite this recent surge of malicious use of generative AI, there are few academic publications that research this threat. In this article, we demonstrate the ability of pattern-based similarity detection for AI paraphrased news recognition. We propose an algorithmic scheme, which is not limited to detect whether an article is an AI paraphrase, but, more importantly, to identify that the source of infringement is the ChatGPT. The proposed method is tested with a benchmark dataset specifically created for this task that incorporates real articles from BBC, incorporating a total of 2,224 articles across five different news categories, as well as 2,224 paraphrased articles created with ChatGPT. Results show that our pattern similarity-based method, that makes no use of deep learning, can detect ChatGPT assisted paraphrased articles at percentages 96.23% for accuracy, 96.25% for precision, 96.21% for sensitivity, 96.25% for specificity and 96.23% for F1 score.
Model-agnostic post-hoc explainability for recommender systems
Arรฉvalo, Irina, Salmeron, Jose L
Recommender systems often benefit from complex feature embeddings and deep learning algorithms, which deliver sophisticated recommendations that enhance user experience, engagement, and revenue. However, these methods frequently reduce the interpretability and transparency of the system. In this research, we develop a systematic application, adaptation, and evaluation of deletion diagnostics in the recommender setting. The method compares the performance of a model to that of a similar model trained without a specific user or item, allowing us to quantify how that observation influences the recommender, either positively or negatively. To demonstrate its model-agnostic nature, the proposal is applied to both Neural Collaborative Filtering (NCF), a widely used deep learning-based recommender, and Singular Value Decomposition (SVD), a classical collaborative filtering technique. Experiments on the MovieLens and Amazon Reviews datasets provide insights into model behavior and highlight the generality of the approach across different recommendation paradigms.
Multimodal Mathematical Reasoning Embedded in Aerial Vehicle Imagery: Benchmarking, Analysis, and Exploration
Zhou, Yue, Feng, Litong, Lan, Mengcheng, Yang, Xue, Li, Qingyun, Ke, Yiping, Jiang, Xue, Zhang, Wayne
Mathematical reasoning is critical for tasks such as precise distance and area computations, trajectory estimations, and spatial analysis in unmanned aerial vehicle (UAV) based remote sensing, yet current vision-language models (VLMs) have not been adequately tested in this domain. To address this gap, we introduce AVI-Math, the first benchmark to rigorously evaluate multimodal mathematical reasoning in aerial vehicle imagery, moving beyond simple counting tasks to include domain-specific knowledge in areas such as geometry, logic, and algebra. The dataset comprises 3,773 high-quality vehicle-related questions captured from UAV views, covering 6 mathematical subjects and 20 topics. The data, collected at varying altitudes and from multiple UAV angles, reflects real-world UAV scenarios, ensuring the diversity and complexity of the constructed mathematical problems. In this paper, we benchmark 14 prominent VLMs through a comprehensive evaluation and demonstrate that, despite their success on previous multimodal benchmarks, these models struggle with the reasoning tasks in AVI-Math. Our detailed analysis highlights significant limitations in the mathematical reasoning capabilities of current VLMs and suggests avenues for future research. Furthermore, we explore the use of Chain-of-Thought prompting and fine-tuning techniques, which show promise in addressing the reasoning challenges in AVI-Math. Our findings not only expose the limitations of VLMs in mathematical reasoning but also offer valuable insights for advancing UAV-based trustworthy VLMs in real-world applications. The code, and datasets will be released at https://github.com/VisionXLab/avi-math
Pragmatic Frames Evoked by Gestures: A FrameNet Brasil Approach to Multimodality in Turn Organization
Abreu, Helen de Andrade, Torrent, Tiago Timponi, Matos, Ely Edison da Silva
This paper proposes a framework for modeling multimodal conversational turn organization via the proposition of correlations between language and interactive gestures, based on analysis as to how pragmatic frames are conceptualized and evoked by communicators. As a means to provide evidence for the analysis, we developed an annotation methodology to enrich a multimodal dataset (annotated for semantic frames) with pragmatic frames modeling conversational turn organization. Although conversational turn organization has been studied by researchers from diverse fields, the specific strategies, especially gestures used by communicators, had not yet been encoded in a dataset that can be used for machine learning. To fill this gap, we enriched the Frame2 dataset with annotations of gestures used for turn organization. The Frame2 dataset features 10 episodes from the Brazilian TV series Pedro Pelo Mundo annotated for semantic frames evoked in both video and text. This dataset allowed us to closely observe how communicators use interactive gestures outside a laboratory, in settings, to our knowledge, not previously recorded in related literature. Our results have confirmed that communicators involved in face-to-face conversation make use of gestures as a tool for passing, taking and keeping conversational turns, and also revealed variations of some gestures that had not been documented before. We propose that the use of these gestures arises from the conceptualization of pragmatic frames, involving mental spaces, blending and conceptual metaphors. In addition, our data demonstrate that the annotation of pragmatic frames contributes to a deeper understanding of human cognition and language.