Generative AI
A Appendix
KAN oversaw the project and contributed valuable feedback. MindEye was developed using a training and validation set of Subject 1's data, with the test set (and other subjects' data) untouched until final PyTorch code for the MLP backbone and projector is depicted in Algorithm 1. Specifics on how we DALL-E 2. This makes our prior much faster at inference time. For simplicity we use bidirectional attention in our final model. To map to Stable Diffusion's VAE latent space we use a low-level pipeline with the same architecture as the high-level pipeline. Recent works in low-level vision (super-resolution, denoising, deblurring, etc.) have observed that This performs worse than only applying the loss in latent space and also requires significantly more GPU memory.
How California's New AI Law Protects Whistleblowers
Booth is a reporter at TIME. Governor Gavin Newsom speaks at Google about preparing students and workers for the next generation of technology, in San Francisco, California, on August 7, 2025. CEOs of the companies racing to build smarter AI--Google DeepMind, OpenAI, xAI, and Anthropic--have been clear about the stakes.
Benchmark It Yourself (BIY): Preparing a Dataset and Benchmarking AI Models for Scatterplot-Related Tasks
Palmeiro, João, Duarte, Diogo, Costa, Rita, Bizarro, Pedro
AI models are increasingly used for data analysis and visualization, yet benchmarks rarely address scatterplot-specific tasks, limiting insight into performance. To address this gap for one of the most common chart types, we introduce a synthetic, annotated dataset of over 18,000 scatterplots from six data generators and 17 chart designs, and a benchmark based on it. We evaluate proprietary models from OpenAI and Google using N-shot prompting on five distinct tasks derived from annotations of cluster bounding boxes, their center coordinates, and outlier coordinates. OpenAI models and Gemini 2.5 Flash, especially when prompted with examples, are viable options for counting clusters and, in Flash's case, outliers (90%+ Accuracy). However, the results for localization-related tasks are unsatisfactory: Precision and Recall are near or below 50%, except for Flash in outlier identification (65.01%). Furthermore, the impact of chart design on performance appears to be a secondary factor, but it is advisable to avoid scatterplots with wide aspect ratios (16:9 and 21:9) or those colored randomly. Large Language Models (LLMs), particularly multimodal models, are among today's key digital technologies.
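To make the localization metrics concrete, here is a minimal Python sketch of how Precision and Recall could be computed for predicted outlier coordinates. The abstract does not state the matching criterion, so the greedy nearest-point matching, the `tol` distance threshold, and the function names below are all assumptions for illustration, not the benchmark's actual procedure.

```python
import math


def match_points(predicted, annotated, tol):
    """Count predictions that land within `tol` of a distinct annotated point.

    Assumption: greedy one-to-one matching in prediction order; the
    benchmark may use a different rule.
    """
    unused = list(annotated)
    true_positives = 0
    for px, py in predicted:
        best_index, best_dist = None, tol
        for i, (ax, ay) in enumerate(unused):
            dist = math.hypot(px - ax, py - ay)
            if dist <= best_dist:
                best_index, best_dist = i, dist
        if best_index is not None:
            unused.pop(best_index)  # each annotation can be matched once
            true_positives += 1
    return true_positives


def precision_recall(predicted, annotated, tol=5.0):
    """Return (precision, recall) for point predictions against annotations."""
    tp = match_points(predicted, annotated, tol)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(annotated) if annotated else 0.0
    return precision, recall


# Hypothetical pixel coordinates, for illustration only.
annotated = [(10.0, 10.0), (50.0, 80.0), (90.0, 20.0)]
predicted = [(11.0, 9.0), (52.0, 79.0), (30.0, 30.0), (70.0, 70.0)]
precision, recall = precision_recall(predicted, annotated)
print(precision, recall)
```

In this toy case two of four predictions match, and two of three annotated points are found, so a model can score near 50% Precision while still missing a third of the outliers.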
D$^3$QE: Learning Discrete Distribution Discrepancy-aware Quantization Error for Autoregressive-Generated Image Detection
Zhang, Yanran, Yu, Bingyao, Zheng, Yu, Zheng, Wenzhao, Duan, Yueqi, Chen, Lei, Zhou, Jie, Lu, Jiwen
The emergence of visual autoregressive (AR) models has revolutionized image generation while presenting new challenges for synthetic image detection. Unlike previous GAN or diffusion-based methods, AR models generate images through discrete token prediction, exhibiting both marked improvements in image synthesis quality and unique characteristics in their vector-quantized representations. In this paper, we propose to leverage Discrete Distribution Discrepancy-aware Quantization Error (D$^3$QE) for autoregressive-generated image detection that exploits the distinctive patterns and the frequency distribution bias of the codebook existing in real and fake images. We introduce a discrete distribution discrepancy-aware transformer that integrates dynamic codebook frequency statistics into its attention mechanism, fusing semantic features and quantization error latent. To evaluate our method, we construct a comprehensive dataset termed ARForensics covering 7 mainstream visual AR models. Experiments demonstrate superior detection accuracy and strong generalization of D$^3$QE across different AR models, with robustness to real-world perturbations. Code is available at https://github.com/Zhangyr2022/D3QE.
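The two signals the abstract names, quantization error and codebook frequency statistics, can be illustrated with a small Python sketch. This is not the D$^3$QE implementation (see the linked repository for that); the toy codebook, the latent vectors, and the function names are hypothetical, and the sketch only shows generic nearest-neighbour vector quantization.

```python
from collections import Counter


def nearest_code(vector, codebook):
    """Return the index of the codebook entry closest to `vector` (squared L2)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda k: sq_dist(vector, codebook[k]))


def quantize(latents, codebook):
    """Map each latent to a code index and keep the residual (quantization error)."""
    indices = [nearest_code(z, codebook) for z in latents]
    errors = [
        [zi - ci for zi, ci in zip(z, codebook[k])]
        for z, k in zip(latents, indices)
    ]
    return indices, errors


def code_frequencies(indices, codebook_size):
    """Relative usage of each codebook entry over the given tokens."""
    counts = Counter(indices)
    total = len(indices)
    return [counts.get(k, 0) / total for k in range(codebook_size)]


# Hypothetical 2-D codebook and latents, for illustration only.
codebook = [(0.0, 0.0), (1.0, 1.0), (-1.0, 1.0)]
latents = [(0.1, -0.1), (0.9, 1.2), (1.1, 0.8), (-0.7, 0.9)]
indices, errors = quantize(latents, codebook)
freqs = code_frequencies(indices, len(codebook))
print(indices, freqs)
```

A detector in this spirit would compare such residuals and usage histograms between real and generated images; how the paper feeds them into attention is not specified in the abstract.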
Beyond Spectral Peaks: Interpreting the Cues Behind Synthetic Image Detection
Mandelli, Sara, Vila-Portela, Diego, Vázquez-Padín, David, Bestagini, Paolo, Pérez-González, Fernando
Over the years, the forensics community has proposed several deep learning-based detectors to mitigate the risks of generative AI. Recently, frequency-domain artifacts (particularly periodic peaks in the magnitude spectrum) have received significant attention, as they have often been considered a strong indicator of synthetic image generation. However, state-of-the-art detectors are typically used as black boxes, and it remains unclear whether they truly rely on these peaks. This limits their interpretability and trustworthiness. In this work, we conduct a systematic study to address this question. We propose a strategy to remove spectral peaks from images and analyze the impact of this operation on several detectors. In addition, we introduce a simple linear detector that relies exclusively on frequency peaks, providing a fully interpretable baseline free from the confounding influence of deep learning. Our findings reveal that most detectors are not fundamentally dependent on spectral peaks, challenging a widespread assumption in the field and paving the way for more transparent and reliable forensic tools.
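The idea of removing spectral peaks can be sketched on a one-dimensional toy signal in Python. The abstract does not describe the authors' removal strategy, so everything here is an assumption: the local-median peak test, the `ratio` and `half_window` parameters, and the choice to rescale a peak's magnitude while keeping its phase. Real images would need a 2-D transform.

```python
import cmath
import math
import statistics


def dft(x):
    """Naive discrete Fourier transform (fine for short toy signals)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse DFT, returning the real part of each sample."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
        for t in range(n)
    ]


def suppress_peaks(signal, ratio=4.0, half_window=3):
    """Shrink bins whose magnitude exceeds `ratio` times the local median.

    Assumption: a peak is any non-DC bin that stands out from its
    neighbours; its magnitude is reset to the local median, phase kept.
    """
    spectrum = dft(signal)
    mags = [abs(c) for c in spectrum]
    n = len(spectrum)
    cleaned = list(spectrum)
    for k in range(1, n):  # leave the DC bin untouched
        neighbours = [
            mags[(k + d) % n]
            for d in range(-half_window, half_window + 1)
            if d != 0
        ]
        local = statistics.median(neighbours)
        if mags[k] > ratio * local:
            cleaned[k] = spectrum[k] * (local / mags[k])
    return idft(cleaned)


# Toy signal: an impulse (flat spectrum) plus a periodic artifact at bin 8.
n = 64
signal = [(1.0 if t == 0 else 0.0) + 0.5 * math.cos(2 * math.pi * 8 * t / n)
          for t in range(n)]
restored = suppress_peaks(signal)
print(round(restored[0], 6), round(max(abs(v) for v in restored[1:]), 6))
```

Here the cosine creates two sharp peaks on an otherwise flat spectrum, and suppressing them recovers the impulse almost exactly. Feeding a detector the original and the peak-suppressed input and comparing its scores is the kind of test the abstract describes.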
Generative AI-Driven Hierarchical Multi-Agent Framework for Zero-Touch Optical Networks
Zhang, Yao, Song, Yuchen, Li, Shengnan, Shi, Yan, Shen, Shikui, Tang, Xiongyan, Zhang, Min, Wang, Danshi
The rapid development of Generative Artificial Intelligence (GenAI) has catalyzed a transformative technological revolution across all walks of life. As the backbone of wideband communication, optical networks are expected to support high-level autonomous operation and zero-touch management to accommodate their expanding scale and escalating transmission bandwidth. The integration of GenAI is regarded as a pivotal solution for realizing zero-touch optical networks. However, the lifecycle management of optical networks involves a multitude of tasks and requires seamless collaboration across multiple layers, which poses significant challenges to existing single-agent GenAI systems. In this paper, we propose a GenAI-driven hierarchical multi-agent framework designed to streamline multi-task autonomous execution for zero-touch optical networks. We present the architecture, implementation, and applications of this framework. A field-deployed mesh network is used to demonstrate three typical scenarios throughout the lifecycle of an optical network: quality of transmission estimation in the planning stage, dynamic channel adding/dropping in the operation stage, and system capacity increase in the upgrade stage. The case studies illustrate the capabilities of the multi-agent framework in multi-task allocation, coordination, execution, evaluation, and summarization. This work provides a promising approach for the future development of intelligent, efficient, and collaborative network management solutions, paving the way for more specialized and adaptive zero-touch optical networks.