Position: Benchmarking is Broken - Don't Let AI be Its Own Judge
The meteoric rise of Artificial Intelligence (AI), with its rapidly expanding market capitalization, presents both transformative opportunities and critical challenges. Chief among these is the urgent need for a new, unified paradigm for trustworthy evaluation, as current benchmarks increasingly reveal critical vulnerabilities. Issues like data contamination and selective reporting by model developers fuel hype, while inadequate data quality control can lead to biased evaluations that, even if unintentionally, may favor specific approaches. As a flood of participants enters the AI space, this Wild West of assessment makes distinguishing genuine progress from exaggerated claims exceptionally difficult. Such ambiguity blurs scientific signals and erodes public confidence, much as unchecked claims would destabilize financial markets reliant on credible oversight from agencies like Moody's. In high-stakes human examinations (e.g., SAT, GRE), substantial effort is devoted to ensuring fairness and credibility; why settle for less in evaluating AI, especially given its profound societal impact? This position paper argues that a laissez-faire approach is untenable. For true and sustainable AI advancement, we call for a paradigm shift to a unified, live, and quality-controlled benchmarking framework--robust by construction rather than reliant on courtesy or goodwill.
GeForce Now's best tier just got a $70 price cut, but the clock is ticking
Nvidia GeForce Now is offering significant discounts on yearly subscriptions, with the Ultimate tier reduced to $130 annually, saving $70. PCWorld highlights this limited-time promotion runs until July 8th, making cloud gaming more accessible for budget-conscious users. The service enables streaming PC games from existing libraries on various devices without requiring powerful hardware. Nvidia's GeForce Now streaming service is a great way to make use of a big Steam library without needing a beefy gaming PC. That's becoming a much more appealing option, as prices for RAM and storage become untenable (thanks, in no small part, to Nvidia). If you're thinking about signing up, Nvidia is offering up to $70 off a yearly subscription, but only for the next month or so. The "Summer Sale" brings the price of the Ultimate tier down to $130 for a year, and the Performance tier down to $65.
Watermarking Autoregressive Image Generation
Watermarking the outputs of generative models has emerged as a promising approach for tracking their provenance. Despite significant interest in autoregressive image generation models and their potential for misuse, no prior work has attempted to watermark their outputs at the token level. In this work, we present the first such approach by adapting language model watermarking techniques to this setting. We identify a key challenge: the lack of reverse cycle-consistency (RCC), wherein re-tokenizing generated image tokens significantly alters the token sequence, effectively erasing the watermark. To address this and to make our method robust to common image transformations, neural compression, and removal attacks, we introduce (i) a custom tokenizer-detokenizer finetuning procedure that improves RCC, and (ii) a complementary watermark synchronization layer. As our experiments demonstrate, our approach enables reliable and robust watermark detection with theoretically grounded p-values.
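As a rough illustration of how token-level watermark detection can yield a p-value, here is a minimal sketch. It assumes a green/red-list scheme of the kind used in language model watermarking; the abstract does not say which scheme the authors adapt, so the hashing rule, the `GAMMA` fraction, and the function names `is_green`, `binomial_tail`, and `detect` are all hypothetical. In the image setting, `tokens` would be the sequence obtained by re-tokenizing the image under test, which is why a lack of reverse cycle-consistency would erase the signal.

```python
import hashlib
from math import comb
from typing import List

GAMMA = 0.25  # assumed fraction of the vocabulary marked "green" at each step


def is_green(prev_token: int, token: int, key: bytes, gamma: float = GAMMA) -> bool:
    """Keyed pseudo-random test: is `token` in the green list seeded by `prev_token`?"""
    payload = key + prev_token.to_bytes(4, "big") + token.to_bytes(4, "big")
    digest = hashlib.sha256(payload).digest()
    # Map the first 8 bytes of the hash to [0, 1) and compare with gamma.
    return int.from_bytes(digest[:8], "big") / 2**64 < gamma


def binomial_tail(k: int, n: int, p: float) -> float:
    """P[X >= k] for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def detect(tokens: List[int], key: bytes, gamma: float = GAMMA) -> float:
    """One-sided p-value for the number of green tokens in `tokens`.

    Under the null hypothesis (no watermark), each distinct (previous, current)
    pair is green with probability gamma, so the green count is binomial.
    Repeated pairs are scored once to keep that assumption reasonable.
    """
    pairs = list(dict.fromkeys(zip(tokens, tokens[1:])))
    greens = sum(is_green(prev, tok, key, gamma) for prev, tok in pairs)
    return binomial_tail(greens, len(pairs), gamma)
```

A generator that biases sampling toward green tokens drives the p-value returned by `detect` toward zero, while unwatermarked sequences give p-values that are roughly uniform. If re-tokenization changes many tokens, the green count falls back toward the chance level `gamma`.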
Beatbot Sora 10 review: The affordable pool robot most people need
A budget pool robot that handles basic cleaning well enough, but it stands out most for how affordable it is. Beatbot's Sora line, introduced earlier this year, marked the robot producer's aggressive foray into lower-cost pool cleaning systems, with three models on sale at stair-stepped price points. The Sora 10 stands at the bottom of that price band, typically available for under $500, which is pretty much the bare minimum you can get away with paying for a pool robot that has any real value. So, what does $500 get you?
Crypto Guys Bought the Answer to the CIA's Mysterious Kryptos Sculpture
They swear they haven't peeked at the closely guarded secret and that they'll keep the cryptographic competition going. On a blustery March day, the artist Jim Sanborn received visitors at his studio on an isolated island in the Chesapeake Bay. The visitors sat him down in front of a laptop, and he typed in a secret message. They compressed the message using a unique hash function, sent that to the cloud, and wiped the laptop clean. Sanborn hoped that this action would set him free.
Attack by Yourself: Effective and Unnoticeable Multi-Category Graph Backdoor Attacks with Subgraph Triggers Pool
Graph Neural Networks (GNNs) have achieved significant success in various real-world applications, including social networks, finance systems, and traffic management. Recent research highlights their vulnerability to backdoor attacks in node classification, where GNNs trained on a poisoned graph misclassify a test node only when specific triggers are attached. These studies typically focus on single attack categories and use adaptive trigger generators to create node-specific triggers. However, adaptive trigger generators typically have a simple structure and limited parameters, and they lack category-aware graph knowledge, so they struggle to handle backdoor attacks across multiple categories as the number of target categories increases. We address this gap by proposing a novel approach for Effective and Unnoticeable Multi-Category (EUMC) graph backdoor attacks, leveraging subgraphs from the attacked graph as category-aware triggers to precisely control the target category. To ensure the effectiveness of our method, we construct a Multi-Category Subgraph Triggers Pool (MC-STP) using the subgraphs of the attacked graph as triggers. We then exploit the attachment probability shifts of each subgraph trigger as category-aware priors for target category determination. Moreover, we develop a "select then attach" strategy that connects suitable category-aware triggers to attacked nodes for unnoticeability. Extensive experiments across different real-world datasets confirm the efficacy of our method in conducting multi-category graph backdoor attacks on various GNN models and defense strategies.
Far from the Shallow: Brain-Predictive Reasoning Embedding through Residual Disentanglement
Understanding how the human brain progresses from processing simple linguistic inputs to performing high-level reasoning is a fundamental challenge in neuroscience. While modern large language models (LLMs) are increasingly used to model neural responses to language, their internal representations are highly entangled, mixing information about lexicon, syntax, meaning, and reasoning. This entanglement biases conventional brain encoding analyses toward linguistically shallow features (e.g., lexicon and syntax), making it difficult to isolate the neural substrates of cognitively deeper processes. Here, we introduce a residual disentanglement method that computationally isolates these components. By first probing an LM to identify feature-specific layers, our method iteratively regresses out lower-level representations to produce four nearly orthogonal embeddings for lexicon, syntax, meaning, and, critically, reasoning. We used these disentangled embeddings to model intracranial (ECoG) brain recordings from neurosurgical patients listening to natural speech. We show that: 1) This isolated reasoning embedding exhibits unique predictive power, accounting for variance in neural activity not explained by other linguistic features and even extending to the recruitment of visual regions beyond classical language areas.
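The idea of iteratively regressing out lower-level representations can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes each feature level is already available as a samples-by-dimensions matrix, uses ordinary least squares with an intercept, and the names `regress_out` and `disentangle` are hypothetical. The abstract's layer-probing step and any regularization are omitted.

```python
from typing import List

Matrix = List[List[float]]  # rows = samples (e.g. words), columns = embedding dims


def columns(m: Matrix) -> List[List[float]]:
    return [list(col) for col in zip(*m)]


def dot(a: List[float], b: List[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def orthonormal_basis(cols: List[List[float]], tol: float = 1e-10) -> List[List[float]]:
    """Gram-Schmidt over column vectors; (near-)dependent columns are dropped."""
    basis: List[List[float]] = []
    for c in cols:
        v = list(c)
        for b in basis:
            coef = dot(v, b)
            v = [vi - coef * bi for vi, bi in zip(v, b)]
        norm = dot(v, v) ** 0.5
        if norm > tol:
            basis.append([vi / norm for vi in v])
    return basis


def regress_out(target: Matrix, lower: Matrix) -> Matrix:
    """Least-squares residual of `target` after regressing on `lower` plus an intercept."""
    n = len(target)
    basis = orthonormal_basis([[1.0] * n] + columns(lower))
    resid_cols = []
    for c in columns(target):
        v = list(c)
        for b in basis:  # subtract the projection onto each basis vector
            coef = dot(v, b)
            v = [vi - coef * bi for vi, bi in zip(v, b)]
        resid_cols.append(v)
    return [list(row) for row in zip(*resid_cols)]


def disentangle(levels: List[Matrix]) -> List[Matrix]:
    """`levels` is ordered low to high; each level keeps only what lower levels cannot explain."""
    out = [levels[0]]
    for i in range(1, len(levels)):
        n = len(levels[i])
        # Concatenate all lower-level features row by row.
        lower = [sum((lvl[r] for lvl in levels[:i]), []) for r in range(n)]
        out.append(regress_out(levels[i], lower))
    return out
```

Calling `disentangle([lexicon, syntax, meaning, reasoning])` on four such matrices returns four matrices whose columns are mutually orthogonal across levels, up to numerical precision. In this sketch the top-level residual plays the role of the isolated reasoning embedding that would then be used to predict neural activity.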
Congratulations to the #AAMAS2026 best paper award winners
The AAMAS 2026 best paper awards were presented at the 25th International Conference on Autonomous Agents and Multiagent Systems, which took place from 25-29 May 2026 in Paphos, Cyprus. Lucy Smith is Senior Managing Editor for Robohub and AIhub. In this special live recording at the Great Exhibition Road Festival in London, Claire chatted to George Mylonas (Imperial College London), Antonia Tzemanaki (University of Bristol) and Tom Vercauteren (King's College London) about robotics and AI in medicine and healthcare. Researchers are developing AI models that could one day enable vision prosthetics able to restore meaningful, object-level sight for the blind.
Still paying for cable? These simple tips can lower your bill
PCWorld highlights strategies to reduce cable bills without canceling service, including using provider streaming apps and negotiating better rates. Cable companies like Comcast, Spectrum, and DirecTV offer free streaming apps that can save $7-15 monthly per TV by eliminating set-top box rentals. Threatening to cancel service often unlocks significant discounts, while bundled streaming services through providers offer additional savings opportunities.