Media
Meta shifts some metaverse investments to AI smart glasses
Meta is shifting some of its investments in the metaverse to AI glasses and wearables, hoping to capitalise on the momentum in that segment, a company spokesperson has said. Over the last decade, Meta has poured billions of dollars to build the metaverse, which lets people to interact in a virtual reality. However, the tech giant has struggled to convince investors of the viability of the nascent technology. Bloomberg first reported on Thursday that Meta would cut its metaverse investment by as much as 30%. Its shares climbed more than 3.4% following the news.
'It was about degrading someone completely': the story of Mr DeepFakes – the world's most notorious AI porn site
'It was about degrading someone completely': the story of Mr DeepFakes - the world's most notorious AI porn site The hobbyists who helped build this site created technology that has been used to humiliate countless women. Why didn't governments step in and stop them? For Patrizia Schlosser, it started with an apologetic call from a colleague. "I'm sorry but I found this. Are you aware of it?"
We would sell books by AI, says Waterstones boss
Waterstones would stock books created using artificial intelligence, the company's boss has said, as long as they were clearly labelled, and if customers wanted them. However, James Daunt, a veteran of the bookselling industry, said he personally did not expect that to happen. There's a huge proliferation of AI generated content and most of it are not books that we should be selling, he said. But it would be up to the reader. An explosion in the use of artificial intelligence, or AI, has prompted heated debate in the publishing industry, with writers concerned about the impact on their livelihoods.
Reflection Removal through Efficient Adaptation of Diffusion Transformers
Zakarin, Daniyar, Wandel, Thiemo, Obukhov, Anton, Dai, Dengxin
We introduce a diffusion-transformer (DiT) framework for single-image reflection removal that leverages the generalization strengths of foundation diffusion models in the restoration setting. Rather than relying on task-specific architectures, we repurpose a pre-trained DiT-based foundation model by conditioning it on reflection-contaminated inputs and guiding it toward clean transmission layers. We systematically analyze existing reflection removal data sources for diversity, scalability, and photorealism. To address the shortage of suitable data, we construct a physically based rendering (PBR) pipeline in Blender, built around the Principled BSDF, to synthesize realistic glass materials and reflection effects. Efficient LoRA-based adaptation of the foundation model, combined with the proposed synthetic data, achieves state-of-the-art performance on in-domain and zero-shot benchmarks. These results demonstrate that pretrained diffusion transformers, when paired with physically grounded data synthesis and efficient adaptation, offer a scalable and high-fidelity solution for reflection removal. Project page: https://hf.co/spaces/huawei-bayerlab/windowseat-reflection-removal-web
YingMusic-Singer: Zero-shot Singing Voice Synthesis and Editing with Annotation-free Melody Guidance
Zheng, Junjie, Hao, Chunbo, Ma, Guobin, Zhang, Xiaoyu, Chen, Gongyu, Ding, Chaofan, Chen, Zihao, Xie, Lei
Singing Voice Synthesis (SVS) remains constrained in practical deployment due to its strong dependence on accurate phoneme-level alignment and manually annotated melody contours, requirements that are resource-intensive and hinder scalability. To overcome these limitations, we propose a melody-driven SVS framework capable of synthesizing arbitrary lyrics following any reference melody, without relying on phoneme-level alignment. Our method builds on a Diffusion Transformer (DiT) architecture, enhanced with a dedicated melody extraction module that derives melody representations directly from reference audio. To ensure robust melody encoding, we employ a teacher model to guide the optimization of the melody extractor, alongside an implicit alignment mechanism that enforces similarity distribution constraints for improved melodic stability and coherence. Additionally, we refine duration modeling using weakly annotated song data and introduce a Flow-GRPO reinforcement learning strategy with a multi-objective reward function to jointly enhance pronunciation clarity and melodic fidelity. Experiments show that our model achieves superior performance over existing approaches in both objective measures and subjective listening tests, especially in zero-shot and lyric adaptation settings, while maintaining high audio quality without manual annotation. This work offers a practical and scalable solution for advancing data-efficient singing voice synthesis. To support reproducibility, we release our inference code and model checkpoints.
Challenging the Abilities of Large Language Models in Italian: a Community Initiative
Nissim, Malvina, Croce, Danilo, Patti, Viviana, Basile, Pierpaolo, Attanasio, Giuseppe, Musacchio, Elio, Rinaldi, Matteo, Borazio, Federico, Francis, Maria, Gili, Jacopo, Scalena, Daniel, Altuna, Begoña, Azurmendi, Ekhi, Basile, Valerio, Bentivogli, Luisa, Bisazza, Arianna, Bolognesi, Marianna, Brunato, Dominique, Caselli, Tommaso, Casola, Silvia, Cassese, Maria, Cettolo, Mauro, Collacciani, Claudia, De Cosmo, Leonardo, Di Buono, Maria Pia, Esuli, Andrea, Etxaniz, Julen, Ferrando, Chiara, Fidelangeli, Alessia, Frenda, Simona, Fusco, Achille, Gaido, Marco, Galassi, Andrea, Galli, Federico, Giordano, Luca, Goffetti, Mattia, Gonzalez-Dios, Itziar, Gregori, Lorenzo, Grundler, Giulia, Iannaccone, Sandro, Jiang, Chunyang, La Quatra, Moreno, Lagioia, Francesca, Lo, Soda Marem, Madeddu, Marco, Magnini, Bernardo, Manna, Raffaele, Mercorio, Fabio, Merlo, Paola, Muti, Arianna, Nastase, Vivi, Negri, Matteo, Onorati, Dario, Palmieri, Elena, Papi, Sara, Passaro, Lucia, Pensa, Giulia, Piergentili, Andrea, Potertì, Daniele, Puccetti, Giovanni, Ranaldi, Federico, Ranaldi, Leonardo, Ravelli, Andrea Amelio, Rosola, Martina, Ruzzetti, Elena Sofia, Samo, Giuseppe, Santilli, Andrea, Santin, Piera, Sarti, Gabriele, Sartor, Giovanni, Savoldi, Beatrice, Serino, Antonio, Seveso, Andrea, Siciliani, Lucia, Torroni, Paolo, Varvara, Rossella, Zaninello, Andrea, Zanollo, Asya, Zanzotto, Fabio Massimo, Zeinalipour, Kamyar, Zugarini, Andrea
The rapid progress of Large Language Models (LLMs) has transformed natural language processing and broadened its impact across research and society. Yet, systematic evaluation of these models, especially for languages beyond English, remains limited. "Challenging the Abilities of LAnguage Models in ITAlian" (CALAMITA) is a large-scale collaborative benchmarking initiative for Italian, coordinated under the Italian Association for Computational Linguistics. Unlike existing efforts that focus on leaderboards, CALAMITA foregrounds methodology: it federates more than 80 contributors from academia, industry, and the public sector to design, document, and evaluate a diverse collection of tasks, covering linguistic competence, commonsense reasoning, factual consistency, fairness, summarization, translation, and code generation. Through this process, we not only assembled a benchmark of over 20 tasks and almost 100 subtasks, but also established a centralized evaluation pipeline that supports heterogeneous datasets and metrics. We report results for four open-weight LLMs, highlighting systematic strengths and weaknesses across abilities, as well as challenges in task-specific evaluation. Beyond quantitative results, CALAMITA exposes methodological lessons: the necessity of fine-grained, task-representative metrics, the importance of harmonized pipelines, and the benefits and limitations of broad community engagement. CALAMITA is conceived as a rolling benchmark, enabling continuous integration of new tasks and models. This makes it both a resource -- the most comprehensive and diverse benchmark for Italian to date -- and a framework for sustainable, community-driven evaluation. We argue that this combination offers a blueprint for other languages and communities seeking inclusive and rigorous LLM evaluation practices.
Bayes-DIC Net: Estimating Digital Image Correlation Uncertainty with Bayesian Neural Networks
Chen, Biao, Lei, Zhenhua, Zhang, Yahui, Niu, Tongzhi
This paper introduces a novel method for generating high-quality Digital Image Correlation (DIC) dataset based on non-uniform B-spline surfaces. By randomly generating control point coordinates, we construct displacement fields that encompass a variety of realistic displacement scenarios, which are subsequently used to generate speckle pattern datasets. This approach enables the generation of a large-scale dataset that capture real-world displacement field situations, thereby enhancing the training and generalization capabilities of deep learning-based DIC algorithms. Additionally, we propose a novel network architecture, termed Bayes-DIC Net, which extracts information at multiple levels during the down-sampling phase and facilitates the aggregation of information across various levels through a single skip connection during the up-sampling phase. Bayes-DIC Net incorporates a series of lightweight convolutional blocks designed to expand the receptive field and capture rich contextual information while minimizing computational costs. Furthermore, by integrating appropriate dropout modules into Bayes-DIC Net and activating them during the network inference stage, Bayes-DIC Net is transformed into a Bayesian neural network. This transformation allows the network to provide not only predictive results but also confidence levels in these predictions when processing real unlabeled datasets. This feature significantly enhances the practicality and reliability of our network in real-world displacement field prediction tasks. Through these innovations, this paper offers new perspectives and methods for dataset generation and algorithm performance enhancement in the field of DIC.
Evaluating Long-Context Reasoning in LLM-Based WebAgents
Chung, Andy, Zhang, Yichi, Lin, Kaixiang, Rawal, Aditya, Gao, Qiaozi, Chai, Joyce
As large language model (LLM)-based agents become increasingly integrated into daily digital interactions, their ability to reason across long interaction histories becomes crucial for providing personalized and contextually aware assistance. However, the performance of these agents in long context scenarios, particularly for action-taking WebAgents operating in realistic web environments, remains largely unexplored. This paper introduces a benchmark for evaluating long context reasoning capabilities of WebAgents through sequentially dependent subtasks that require retrieval and application of information from extended interaction histories. We develop a novel evaluation framework that simulates multi-session user interactions by injecting irrelevant task trajectories between dependent subtasks, creating contexts ranging from 25,000 to 150,000 tokens. Through extensive evaluation of four popular models, Claude-3.7, GPT-4.1, Llama 4, and o4-mini, we observe a dramatic performance degradation as context length increases, with success rates dropping from 40-50\% in baseline conditions to less than 10\% in long context scenarios. Our detailed error analysis reveals that agents primarily fail due to getting stuck in loops and losing track of original task objectives. We further propose an implicit RAG approach that provides modest improvements by generating task-relevant summaries, though fundamental limitations in long context reasoning persist. These findings highlight critical challenges for deploying WebAgents in realistic, long-term user interaction scenarios and provide insights for developing more robust agent architectures capable of maintaining coherent task execution across extended contexts.
Artificial Intelligence / Human Intelligence: Who Controls Whom?
Using the example of the film 2001: A Space Odyssey, this chapter illustrates the challenges posed by an AI capable of making decisions that go against human interests. But are human decisions always rational and ethical? In reality, the cognitive decision-making process is influenced by cognitive biases that affect our behavior and choices. AI not only reproduces these biases, but can also exploit them, with the potential to shape our decisions and judgments. Behind IA algorithms, there are sometimes individuals who show little concern for fundamental rights and impose their own rules. To address the ethical and societal challenges raised by AI and its governance, the regulation of digital platforms and education are keys levers. Regulation must reflect ethical, legal, and political choices, while education must strengthen digital literacy and teach people to make informed and critical choices when facing digital technologies.
ReVeal-MT: A Physics-Informed Neural Network for Multi-Transmitter Radio Environment Mapping
Shahid, Mukaram, Das, Kunal, Ushaq, Hadia, Zhang, Hongwei, Song, Jiming, Qiao, Daji, Babu, Sarath, Guan, Yong, Zhu, Zhengyuan, Ahmad, Arsalan
This manuscript has been submitted for peer review and possible publication in an IEEE journal. The content herein represents the version prepared by the authors and may be subject to further revision during the review. Abstract--Accurately mapping the radio environment (e.g., identifying wireless signal strength at specific frequency bands and geographic locations) is crucial for efficient spectrum sharing, enabling Secondary Users (SUs) to access underutilized spectrum bands while protecting Primary Users (PUs). While existing models have made progress, they often degrade in performance when multiple transmitters coexist, due to the compounded effects of shadowing, interference from adjacent transmitters. T o address this challenge, we extend our prior work on Physics-Informed Neural Networks (PINNs) for single-transmitter mapping to derive a new multi-transmitter Partial Differential Equation (PDE) formulation of the Received Signal Strength Indicator (RSSI). We then propose ReV eal-MT (Re-constructor and Visualizer of Spectrum Landscape for Multiple Transmitters), a novel PINN which integrates the multi-source PDE residual into a neural network loss function, enabling accurate spectrum landscape reconstruction from sparse RF sensor measurements. ReV eal-MT is validated using real-world measurements from the ARA wireless living lab across rural and suburban environments, and benchmarked against 3GPP and ITU-R channel models and a baseline PINN model for a single transmitter use-case. Results show that ReV eal-MT achieves substantial accuracy gains in multi-transmitter scenarios, e.g., achieving an RMSE of only 2.66 dB with as few as 45 samples over a 370-square-kilometer region, while maintaining low computational complexity. These findings demonstrate that ReV eal-MT significantly advances radio environment mapping under realistic multi-transmitter conditions, with strong potential for enabling fine-grained spectrum management and precise coexistence between PUs and SUs. I. INTRODUCTION Existing spectrum sharing frameworks, such as those implemented in the TV White Space (TVWS) database and Citizens Broadband Radio Service (CBRS) Spectrum Access System (SAS), rely heavily on traditional statistical models. However, such models struggle to accurately capture the real-world spectrum occupancy and do not generalize well enough to capture shadowing and fading caused by different kinds of terrain and environmental conditions, leading to conservative approaches that over-protect the primary users (PUs) and cause discrepancies in channel availability for spectrum re-use [1]- [3].