Media
Release date for Apple's first FOLDABLE iPhone leaks online - and it suggests fans don't have long to wait at all
It is one of the world's leading tech companies but, unlike its rivals, Apple is yet to reveal its own folding phone design. Now, a possible release date for the long-rumoured foldable iPhone has leaked online - and it suggests tech fans don't have long to wait. According to reports from Apple analysts, the foldable iPhone could be launched before the end of 2026. The rumours also suggest that Apple's latest innovation won't come cheap, with an expected price tag of 2,299. That would make the'iPhone Fold' almost twice the price of Apple's current most expensive smartphone, the iPhone 16 Pro Max, which starts at 1,199 (UK price 1,199).
Performing arts leaders issue copyright warning over UK government's AI plans
More than 30 performing arts leaders in the UK, including the bosses of the National Theatre, Opera North and the Royal Albert Hall, have joined the chorus of creative industry concern about the government's plans to let artificial intelligence companies use artists' work without permission. They also urged the government to support the "moral and economic rights" of the creative community in music, dance, drama and opera. The 35 signatories of the statement include the chief executives of the Sadler's Wells dance theatre, the Royal Shakespeare Company, the City of Birmingham Symphony Orchestra and the Leeds Playhouse. The performing arts bosses added that they embraced advances in technology and were "participants" in innovation, but stated the government's plans risked undermining their ability to participate in the development and deployment of AI. Critics of the opt out plan have described it as unfair and impractical.
Iffy-Or-Not: Extending the Web to Support the Critical Evaluation of Fallacious Texts
Lim, Gionnieve, Kim, Juho, Perrault, Simon T.
Social platforms have expanded opportunities for deliberation with the comments being used to inform one's opinion. However, using such information to form opinions is challenged by unsubstantiated or false content. To enhance the quality of opinion formation and potentially confer resistance to misinformation, we developed Iffy-Or-Not (ION), a browser extension that seeks to invoke critical thinking when reading texts. With three features guided by argumentation theory, ION highlights fallacious content, suggests diverse queries to probe them with, and offers deeper questions to consider and chat with others about. From a user study (N=18), we found that ION encourages users to be more attentive to the content, suggests queries that align with or are preferable to their own, and poses thought-provoking questions that expands their perspectives. However, some participants expressed aversion to ION due to misalignments with their information goals and thinking predispositions. Potential backfiring effects with ION are discussed.
Learning on LLM Output Signatures for gray-box LLM Behavior Analysis
Bar-Shalom, Guy, Frasca, Fabrizio, Lim, Derek, Gelberg, Yoav, Ziser, Yftah, El-Yaniv, Ran, Chechik, Gal, Maron, Haggai
Large Language Models (LLMs) have achieved widespread adoption, yet our understanding of their behavior remains limited, particularly in detecting data contamination and hallucinations. While recently proposed probing techniques provide insights through activation analysis, they require "white-box" access to model internals, often unavailable. Current "gray-box" approaches typically analyze only the probability of the actual tokens in the sequence with simple task-specific heuristics. Importantly, these methods overlook the rich information contained in the full token distribution at each processing step. To address these limitations, we propose that gray-box analysis should leverage the complete observable output of LLMs, consisting of both the previously used token probabilities as well as the complete token distribution sequences - a unified data type we term LOS (LLM Output Signature). To this end, we develop a transformer-based approach to process LOS that theoretically guarantees approximation of existing techniques while enabling more nuanced analysis. Our approach achieves superior performance on hallucination and data contamination detection in gray-box settings, significantly outperforming existing baselines. Furthermore, it demonstrates strong transfer capabilities across datasets and LLMs, suggesting that LOS captures fundamental patterns in LLM behavior.
Sparse Autoencoder as a Zero-Shot Classifier for Concept Erasing in Text-to-Image Diffusion Models
Tian, Zhihua, Nan, Sirun, Xu, Ming, Zhai, Shengfang, Qu, Wenjie, Liu, Jian, Ren, Kui, Jia, Ruoxi, Zhang, Jiaheng
Text-to-image (T2I) diffusion models have achieved remarkable progress in generating high-quality images but also raise people's concerns about generating harmful or misleading content. While extensive approaches have been proposed to erase unwanted concepts without requiring retraining from scratch, they inadvertently degrade performance on normal generation tasks. In this work, we propose Interpret then Deactivate (ItD), a novel framework to enable precise concept removal in T2I diffusion models while preserving overall performance. ItD first employs a sparse autoencoder (SAE) to interpret each concept as a combination of multiple features. By permanently deactivating the specific features associated with target concepts, we repurpose SAE as a zero-shot classifier that identifies whether the input prompt includes target concepts, allowing selective concept erasure in diffusion models. Moreover, we demonstrate that ItD can be easily extended to erase multiple concepts without requiring further training. Comprehensive experiments across celebrity identities, artistic styles, and explicit content demonstrate ItD's effectiveness in eliminating targeted concepts without interfering with normal concept generation. Additionally, ItD is also robust against adversarial prompts designed to circumvent content filters. Code is available at: https://github.com/NANSirun/Interpret-then-deactivate.
Graph-CNNs for RF Imaging: Learning the Electric Field Integral Equations
Stylianopoulos, Kyriakos, Gavriilidis, Panagiotis, Gradoni, Gabriele, Alexandropoulos, George C.
Radio-Frequency (RF) imaging concerns the digital recreation of the surfaces of scene objects based on the scattered field at distributed receivers. To solve this difficult inverse scattering problems, data-driven methods are often employed that extract patterns from similar training examples, while offering minimal latency. In this paper, we first provide an approximate yet fast electromagnetic model, which is based on the electric field integral equations, for data generation, and subsequently propose a Deep Neural Network (DNN) architecture to learn the corresponding inverse model. A graph-attention backbone allows for the system geometry to be passed to the DNN, where residual convolutional layers extract features about the objects, while a UNet head performs the final image reconstruction. Our quantitative and qualitative evaluations on two synthetic data sets of different characteristics showcase the performance gains of thee proposed advanced architecture and its relative resilience to signal noise levels and various reception configurations.
Wiki-Quantities and Wiki-Measurements: Datasets of Quantities and their Measurement Context from Wikipedia
Göpfert, Jan, Kuckertz, Patrick, Weinand, Jann M., Stolten, Detlef
To cope with the large number of publications, more and more researchers are automatically extracting data of interest using natural language processing methods based on supervised learning. Much data, especially in the natural and engineering sciences, is quantitative, but there is a lack of datasets for identifying quantities and their context in text. To address this issue, we present two large datasets based on Wikipedia and Wikidata: Wiki-Quantities is a dataset consisting of over 1.2 million annotated quantities in the English-language Wikipedia. Wiki-Measurements is a dataset of 38 738 annotated quantities in the English-language Wikipedia along with their respective measured entity, property, and optional qualifiers. Manual validation of 100 samples each of Wiki-Quantities and Wiki-Measurements found 100% and 84-94% correct, respectively. The datasets can be used in pipeline approaches to measurement extraction, where quantities are first identified and then their measurement context. To allow reproduction of this work using newer or different versions of Wikipedia and Wikidata, we publish the code used to create the datasets along with the data.
Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models
Arriola, Marianne, Gokaslan, Aaron, Chiu, Justin T, Yang, Zhihan, Qi, Zhixuan, Han, Jiaqi, Sahoo, Subham Sekhar, Kuleshov, Volodymyr
Diffusion language models offer unique benefits over autoregressive models due to their potential for parallelized generation and controllability, yet they lag in likelihood modeling and are limited to fixed-length generation. In this work, we introduce a class of block diffusion language models that interpolate between discrete denoising diffusion and autoregressive models. Block diffusion overcomes key limitations of both approaches by supporting flexible-length generation and improving inference efficiency with KV caching and parallel token sampling. We propose a recipe for building effective block diffusion models that includes an efficient training algorithm, estimators of gradient variance, and data-driven noise schedules to minimize the variance. Block diffusion sets a new state-of-the-art performance among diffusion models on language modeling benchmarks and enables generation of arbitrary-length sequences. We provide the code, along with the model weights and blog post on the project page: https://m-arriola.com/bd3lms/
Growing a Twig to Accelerate Large Vision-Language Models
Shao, Zhenwei, Wang, Mingyang, Yu, Zhou, Pan, Wenwen, Yang, Yan, Wei, Tao, Zhang, Hongyuan, Mao, Ning, Chen, Wei, Yu, Jun
Large vision-language models (VLMs) have demonstrated remarkable capabilities in open-world multimodal understanding, yet their high computational overheads pose great challenges for practical deployment. Some recent works have proposed methods to accelerate VLMs by pruning redundant visual tokens guided by the attention maps of VLM's early layers. Despite the success of these token pruning methods, they still suffer from two major shortcomings: (i) considerable accuracy drop due to insensitive attention signals in early layers, and (ii) limited speedup when generating long responses (e.g., 30 tokens). To address the limitations above, we present TwigVLM -- a simple and general architecture by growing a lightweight twig upon an early layer of the base VLM. Compared with most existing VLM acceleration methods purely based on visual token pruning, our TwigVLM not only achieves better accuracy retention by employing a twig-guided token pruning (TTP) strategy, but also yields higher generation speed by utilizing a self-speculative decoding (SSD) strategy. Taking LLaVA-1.5-7B as the base VLM, experimental results show that TwigVLM preserves 96% of the original performance after pruning 88.9% of visual tokens and achieves 154% speedup in generating long responses, delivering significantly better performance in terms of both accuracy and speed over the state-of-the-art VLM acceleration methods. Code will be made publicly available.
CRCE: Coreference-Retention Concept Erasure in Text-to-Image Diffusion Models
Xue, Yuyang, Moroshko, Edward, Chen, Feng, McDonagh, Steven, Tsaftaris, Sotirios A.
Text-to-Image diffusion models can produce undesirable content that necessitates concept erasure techniques. However, existing methods struggle with under-erasure, leaving residual traces of targeted concepts, or over-erasure, mistakenly eliminating unrelated but visually similar concepts. To address these limitations, we introduce CRCE, a novel concept erasure framework that leverages Large Language Models to identify both semantically related concepts that should be erased alongside the target and distinct concepts that should be preserved. By explicitly modeling coreferential and retained concepts semantically, CRCE enables more precise concept removal, without unintended erasure. Experiments demonstrate that CRCE outperforms existing methods on diverse erasure tasks.