Extropic Aims to Disrupt the Data Center Bonanza

WIRED

A startup hopes to challenge Nvidia, AMD, and Intel with a chip that wrangles probabilities rather than ones and zeros. Extropic, a startup developing an exotic new kind of computer chip that handles probabilistic bits, has produced its first working hardware, called XTR-0, along with evidence that more advanced systems will tackle useful tasks in artificial intelligence and scientific research. The startup's chips work in a fundamentally different way from chips made by Nvidia, AMD, and others, and Extropic claims they could be thousands of times more energy efficient than existing chips when scaled up. With AI companies pouring billions of dollars into building data centers, a completely new approach could offer a far less costly alternative to vast arrays of conventional chips.


Donald Trump Is the First AI Slop President

WIRED

How do AI videos end up on Donald Trump's social media accounts? President Donald Trump, a septuagenarian known for his general avoidance of keyboards and computers, has somehow become America's first generative AI president. The most infamous example of his experimentation with AI-generated videos came ahead of the No Kings protests earlier this month. In the clip, the president is decked out in full gear, piloting a fighter jet bearing "KING TRUMP" on its side. Instead of a traditional pilot's helmet, however, the president is wearing a literal crown, just in case the rest of the visuals were too subtle.


OnlyFans Goes to Business School

WIRED

In its first foray into business content, the platform has asked lingerie entrepreneur and ex-SuicideGirl Rachael McCrary to teach creators how to monetize their ideas. OnlyFans has tapped the founder of a lingerie company and former nude model to launch business classes on the platform. Rachael McCrary, a longtime lingerie designer and founder of the company Spice Rack, is launching four videos on OnlyFans Wednesday. The videos are quite different from the usual OnlyFans fare. They'll focus on pitching investors, building a brand, and navigating being an entrepreneur as a woman, McCrary tells WIRED.


Gen-Review: A Large-scale Dataset of AI-Generated (and Human-written) Peer Reviews

Demetrio, Luca, Apruzzese, Giovanni, Grosse, Kathrin, Laskov, Pavel, Lupu, Emil, Rimmer, Vera, Widmer, Philine

arXiv.org Artificial Intelligence

How does the progressive embrace of Large Language Models (LLMs) affect scientific peer reviewing? This multifaceted question is fundamental to the effectiveness -- as well as to the integrity -- of the scientific process. Recent evidence suggests that LLMs may have already been tacitly used in peer reviewing, e.g., at the 2024 International Conference on Learning Representations (ICLR). Furthermore, some efforts have been undertaken to explicitly integrate LLMs into peer reviewing by various editorial boards (including that of ICLR'25). To fully understand the utility and the implications of deploying LLMs for scientific reviewing, a comprehensive relevant dataset is strongly desirable. Despite some previous research on this topic, such a dataset has been lacking so far. We fill this gap by presenting GenReview, the largest dataset of LLM-written reviews to date. Our dataset includes 81K reviews generated for all submissions to the 2018--2025 editions of ICLR by providing the LLM with three independent prompts: a negative, a positive, and a neutral one. GenReview is also linked to the respective papers and their original reviews, thereby enabling a broad range of investigations. To illustrate the value of GenReview, we explore a sample of intriguing research questions, namely: whether LLMs exhibit bias in reviewing (they do); whether LLM-written reviews can be automatically detected (so far, they can); whether LLMs can rigorously follow reviewing instructions (not always); and whether LLM-provided ratings align with decisions on paper acceptance or rejection (they do, but only for accepted papers). GenReview can be accessed at the following link: https://anonymous.4open.science/r/gen_review.
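The three-prompt protocol described above can be sketched as follows. This is a hypothetical illustration of the idea only: the prompt wording and the `generate` stub are placeholders, not the actual prompts or API used to build GenReview.

```python
# Hypothetical sketch of a three-stance review-generation protocol.
# The prompt texts and `generate` stub are illustrative placeholders.
PROMPTS = {
    "negative": "Review this paper critically, emphasizing its weaknesses:\n{paper}",
    "positive": "Review this paper favorably, emphasizing its strengths:\n{paper}",
    "neutral":  "Review this paper objectively and impartially:\n{paper}",
}

def generate(prompt):
    # Stand-in for a real LLM API call.
    return f"[LLM review for prompt starting: {prompt[:40]}...]"

def review_submission(paper_text):
    """Produce one LLM-written review per stance for a single submission."""
    return {stance: generate(template.format(paper=paper_text))
            for stance, template in PROMPTS.items()}

reviews = review_submission("An ICLR submission abstract ...")
print(sorted(reviews))  # ['negative', 'neutral', 'positive']
```

Each submission thus yields three independently prompted reviews, which is what lets the dataset probe stance-dependent bias.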


Machine Text Detectors are Membership Inference Attacks

Koike, Ryuto, Dugan, Liam, Kaneko, Masahiro, Callison-Burch, Chris, Okazaki, Naoaki

arXiv.org Artificial Intelligence

Although membership inference attacks (MIAs) and machine-generated text detection target different goals (identifying training samples and detecting synthetic texts, respectively), their methods often exploit similar signals based on a language model's probability distribution. Despite this shared methodological foundation, the two tasks have been studied independently, which may lead to conclusions that overlook stronger methods and valuable insights developed for the other task. In this work, we theoretically and empirically investigate transferability between MIAs and machine text detection, i.e., how well a method originally developed for one task performs on the other. For our theoretical contribution, we prove that the metric that achieves the asymptotically highest performance on both tasks is the same. We unify a large proportion of the existing literature in the context of this optimal metric and hypothesize that the accuracy with which a given method approximates this metric is directly correlated with its transferability. Our large-scale empirical experiments, including 7 state-of-the-art MIA methods and 5 state-of-the-art machine text detectors across 13 domains and 10 generators, demonstrate very strong rank correlation (rho > 0.6) in cross-task performance. We notably find that Binoculars, originally designed for machine text detection, achieves state-of-the-art performance on MIA benchmarks as well, demonstrating the practical impact of this transferability. Our findings highlight the need for greater cross-task awareness and collaboration between the two research communities. To facilitate cross-task developments and fair evaluations, we introduce MINT, a unified evaluation suite for MIAs and machine-generated text detection, with implementations of 15 recent methods from both tasks.
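The Binoculars metric mentioned above scores a passage as the ratio of its log-perplexity under an "observer" model to the cross-perplexity between the observer and a "performer" model. A minimal sketch of that ratio, assuming per-token log-probabilities and cross-entropies have already been computed from two LLMs (obtaining them is the expensive part and is omitted here):

```python
def binoculars_score(observer_logprobs, cross_entropies):
    """Binoculars-style ratio: observer log-perplexity divided by the
    observer/performer cross-perplexity. Lower scores suggest machine
    text (or, per the transferability result, training-set members)."""
    log_ppl = -sum(observer_logprobs) / len(observer_logprobs)
    x_ppl = sum(cross_entropies) / len(cross_entropies)
    return log_ppl / x_ppl

# Toy per-token statistics standing in for real model outputs.
human_like = binoculars_score([-3.2, -4.1, -2.8, -3.9], [2.1, 2.4, 2.0, 2.3])
model_like = binoculars_score([-1.1, -0.9, -1.3, -1.0], [2.1, 2.4, 2.0, 2.3])
print(human_like > model_like)  # True: human text scores higher
```

The same scalar can be thresholded either as a text detector or, against a candidate training sample, as a membership signal, which is the shared foundation the paper formalizes.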


VietBinoculars: A Zero-Shot Approach for Detecting Vietnamese LLM-Generated Text

Nguyen, Trieu Hai, Akilesh, Sivaswamy

arXiv.org Artificial Intelligence

The rapid development of Large Language Models (LLMs) based on transformer architectures raises key challenges, one of which is distinguishing between human-written text and LLM-generated text. As LLM-generated content becomes increasingly sophisticated and comes to resemble human writing, traditional detection methods are proving less effective, especially as the number and diversity of LLMs continue to grow, with new models and versions released at a rapid pace. This study proposes VietBinoculars, an adaptation of the Binoculars method with optimized global thresholds, to enhance the detection of Vietnamese LLM-generated text. We have constructed new Vietnamese AI-generated datasets to determine the optimal thresholds for VietBinoculars and to enable benchmarking. Our experiments show that VietBinoculars achieves over 99% accuracy, F1-score, and AUC in both domains on multiple out-of-domain datasets. It outperforms the original Binoculars model, traditional detection methods, and other state-of-the-art approaches, including commercial tools such as ZeroGPT and DetectGPT, especially under specially modified prompting strategies.
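Optimizing a global threshold of the kind VietBinoculars uses amounts to a sweep over candidate cutoffs on a labeled development set. A minimal sketch, assuming lower detector scores indicate machine text and using F1 as the selection criterion (the paper's exact tuning procedure may differ):

```python
def best_threshold(scores, labels):
    """Sweep candidate thresholds over observed scores and keep the one
    maximizing F1, where label 1 = LLM-generated and a score at or below
    the threshold is classified as LLM-generated."""
    best_t, best_f1 = None, -1.0
    for t in sorted(set(scores)):
        preds = [1 if s <= t else 0 for s in scores]
        tp = sum(1 for p, y in zip(preds, labels) if p and y)
        fp = sum(1 for p, y in zip(preds, labels) if p and not y)
        fn = sum(1 for p, y in zip(preds, labels) if not p and y)
        f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1

scores = [0.42, 0.47, 0.51, 0.88, 0.95, 1.02]  # toy detector scores
labels = [1, 1, 1, 0, 0, 0]                    # 1 = LLM-generated
print(best_threshold(scores, labels))          # (0.51, 1.0)
```

The tuned threshold is then frozen and applied globally, which is what makes the method zero-shot at detection time.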


The 15 Best Tried and Tested Gifts for Bird Lovers (2025)

WIRED

Whether your giftee loves bird watching, bird feeding, or just putting a bird on it, these gifts are sure to be a hit. All products featured on WIRED are independently selected by our editors. However, we may receive compensation from retailers and/or from purchases of products through these links. You may be familiar with the various memes detailing the fact that once you reach middle age, you're automatically sorted, Harry Potter hat-style, into one of a handful of hobbies, such as sourdough bread making, gardening, or bird watching. I can't contradict this, since I'm a middle-aged person who got sorted into bird-watching.


T-Detect: Tail-Aware Statistical Normalization for Robust Detection of Adversarial Machine-Generated Text

West, Alva, Zhang, Luodan, Zhang, Liuliu, Zhu, Minjun, Weng, Yixuan, Zhang, Yue

arXiv.org Artificial Intelligence

Large language models (LLMs) have shown the capability to generate fluent and logical content, presenting significant challenges to machine-generated text detection, particularly for text polished by adversarial perturbations such as paraphrasing. Current zero-shot detectors often employ Gaussian distributions as the statistical measure for computing detection thresholds, which falters when confronted with the heavy-tailed statistical artifacts characteristic of adversarial or non-native English texts. In this paper, we introduce T-Detect, a novel detection method that fundamentally redesigns curvature-based detectors. Our primary innovation is the replacement of standard Gaussian normalization with a heavy-tailed discrepancy score derived from the Student's t-distribution. This approach is theoretically grounded in the empirical observation that adversarial texts exhibit significant leptokurtosis, rendering traditional statistical assumptions inadequate. T-Detect computes a detection score by normalizing the log-likelihood of a passage against the expected moments of a t-distribution, providing superior resilience to statistical outliers. We validate our approach on the challenging RAID benchmark for adversarial text and the comprehensive HART dataset. Experiments show that T-Detect provides a consistent performance uplift over strong baselines, improving AUROC by up to 3.9% in targeted domains. When integrated into a two-dimensional detection framework (CT), our method achieves state-of-the-art performance, with an AUROC of 0.926 on the Books domain of RAID. Our contributions are a new, theoretically justified statistical foundation for text detection, an ablation-validated method that demonstrates superior robustness, and a comprehensive analysis of its performance under adversarial conditions. Our code is released at https://github.com/ResearAI/t-detect.
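The core swap T-Detect makes can be illustrated with densities alone: score a standardized discrepancy under a Student's t log-density instead of a Gaussian one, so extreme values in leptokurtic (outlier-heavy) adversarial text are penalized far less. This is a sketch of that statistical idea, not the paper's full curvature-based pipeline; the degrees-of-freedom value is an assumption:

```python
import math

def t_logpdf(x, df):
    """Log density of Student's t with `df` degrees of freedom."""
    return (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
            - 0.5 * math.log(df * math.pi)
            - (df + 1) / 2 * math.log1p(x * x / df))

def gauss_logpdf(x):
    """Standard normal log density, for comparison."""
    return -0.5 * math.log(2 * math.pi) - x * x / 2

# A standardized discrepancy of 6 is an extreme outlier. The Gaussian
# assigns it vanishing density; the heavy t-tail does not.
z = 6.0
print(t_logpdf(z, df=5.0) > gauss_logpdf(z))  # True
```

Under the Gaussian, the outlier's log-density is about -18.9; under a t with 5 degrees of freedom it is about -7.3, so a t-based score is far less dominated by a handful of perturbed tokens.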


In high-tech race to detect fires early, O.C. bets on volunteers with binoculars

Los Angeles Times

As California turns to satellite imagery, remote cameras watched by AI and heat detection sensors placed throughout wildlands to detect fires earlier, one Orange County group is keeping it old-school. Whenever the National Weather Service issues a red flag warning, a sign that dangerous fire weather is imminent, Renalynn Funtanilla swiftly sends alerts to her more than 300 volunteers' phones and inboxes. She wheels TVs into a conference room turned makeshift command center, sets up computers and phones around the table and dispatches volunteers to dozens of trailheads and roadways in Orange County's wildland-urban interface: likely spots for the county's next devastating fire to erupt. The volunteers -- sporting bright yellow vests and navy blue hats with an "Orange County Fire Watch" emblem -- slap large fire watch magnets to the sides of their vehicles, grab some binoculars and start to watch. Amid California's coastal sage scrub and chaparral ecosystems that are plagued with frequent fast-moving fires, preventing ignitions and stamping out fires before they become unmanageable is the name of the game.


DivScore: Zero-Shot Detection of LLM-Generated Text in Specialized Domains

Chen, Zhihui, He, Kai, Huang, Yucheng, Zhu, Yunxiao, Feng, Mengling

arXiv.org Artificial Intelligence

Detecting LLM-generated text in specialized, high-stakes domains like medicine and law is crucial for combating misinformation and ensuring authenticity. However, current zero-shot detectors, while effective on general text, often fail when applied to specialized content due to domain shift. We provide a theoretical analysis showing this failure is fundamentally linked to the KL divergence between the human, detector, and source text distributions. To address this, we propose DivScore, a zero-shot detection framework using normalized entropy-based scoring and domain knowledge distillation to robustly identify LLM-generated text in specialized domains. We also release a domain-specific benchmark for LLM-generated text detection in the medical and legal domains. Experiments on our benchmark show that DivScore consistently outperforms state-of-the-art detectors, with 14.4% higher AUROC and 64.0% higher recall (at a 0.1% false positive rate threshold). In adversarial settings, DivScore demonstrates greater robustness than the other baselines, achieving an average advantage of 22.8% in AUROC and 29.5% in recall. Code and data are publicly available.
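The normalized entropy-based scoring mentioned above can be sketched at the token-distribution level: average the Shannon entropy of the model's next-token distributions over a passage and normalize by the maximum possible entropy, log of the vocabulary size. This is an illustration of the general idea under those assumptions, not DivScore's exact formula:

```python
import math

def token_entropy(probs):
    """Shannon entropy (in nats) of one next-token distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def normalized_entropy_score(distributions):
    """Average per-token entropy divided by the maximum log|V|, so the
    score lies in [0, 1]. LLM-generated text tends to ride peaked,
    low-entropy distributions; human text is more diffuse."""
    vocab_size = len(distributions[0])
    avg_h = sum(token_entropy(d) for d in distributions) / len(distributions)
    return avg_h / math.log(vocab_size)

peaked = [[0.97, 0.01, 0.01, 0.01]] * 3   # model-like: confident
diffuse = [[0.25, 0.25, 0.25, 0.25]] * 3  # human-like: spread out
print(normalized_entropy_score(peaked) < normalized_entropy_score(diffuse))  # True
```

Normalizing by log|V| keeps the score comparable across detector models with different vocabularies, which matters once domain-distilled detectors enter the picture.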