Industry
Snap's slimmed down AR Specs go on sale later this year for 2,195
Snap's slimmed down AR Specs go on sale later this year for $2,195 Snap's slimmed down AR Specs go on sale later this year for $2,195 It's the first time the company will be selling its AR glasses to the public. It's been almost five years since Snap began experimenting with standalone augmented reality glasses. Now, the company is finally ready to start selling them to the public, though they won't be cheap. During a keynote today at Augmented World Expo in Long Beach, California, Snap CEO Evan Spiegel unveiled the company's latest AR Specs that he said represent the beginning of a new era in computing. The $2,195 glasses come with a wider field of view, upgraded battery life and, perhaps most importantly, a slimmed down form factor compared with the oversized, developer-only glasses it showed off in 2024 .
Diffusion Guided Adversarial State Perturbations in Reinforcement Learning
Reinforcement learning (RL) systems, while achieving remarkable success across various domains, are vulnerable to adversarial attacks. This is especially a concern in vision-based environments where minor manipulations of high-dimensional image inputs can easily mislead the agent's behavior. To this end, various defenses have been proposed recently, with state-of-the-art approaches achieving robust performance even under large state perturbations. However, after closer investigation, we found that the effectiveness of the current defenses is due to a fundamental weakness of the existing lp norm-constrained attacks, which can barely alter the semantics of image input even under a relatively large perturbation budget. In this work, we propose SHIFT, a novel policy-agnostic diffusion-based state perturbation attack to go beyond this limitation. Our attack is able to generate perturbed states that are semantically different from the true states while remaining realistic and history-aligned to avoid detection. Evaluations show that our attack effectively breaks existing defenses, including the most sophisticated ones, significantly outperforming existing attacks while being more perceptually stealthy.
Russian warship fires warning shots near UK-registered yacht in Channel
'It was surreal': British couple describe having warning shots fired near them by Russian warship A retired British couple who were on a yacht which had warning shots fired near it by a Russian warship in the English Channel have told the BBC the experience was surreal. Jane and Alan Kelvey were sailing 23 miles (37km) off the Isle of Wight in international waters when they came into close contact with the Russian frigate, the Admiral Grigorovich on Tuesday. Sir Keir Starmer said firing shots into the path of a UK-registered yacht was reckless - an incident the Ministry of Defence has described as an isolated one. Russia's Defence Ministry said the yacht had been on a dangerous approach towards the warship but the couple said they were not on a collision course. The incident comes days after Royal Marine Commandos intercepted a Russian shadow fleet tanker carrying sanctioned oil in the Channel on Sunday, in the first operation of its kind carried out by the British military.
Dimensionality Mismatch Between Brains and Artificial Neural Networks
Biological and artificial vision systems both rely on hierarchical architectures, yet it remains unclear how their representational geometry evolves across processing stages, and what functional consequences may arise from potential differences. In this work, we systematically quantify and compare the linear and nonlinear dimensionality of human brain activity (fMRI) and artificial neural networks (ANNs) during natural image viewing. In the human ventral visual stream, both dimensionality measures increase along the visual hierarchy, supporting the emergence of semantic and abstract representations. For linear dimensionality, most ANNs show a similar increase, but only for pooled features, emphasizing the importance of appropriate feature readouts in brain-model comparisons. In contrast, nonlinear dimensionality shows a collapse in the later layers of ANNs, pointing at a mismatch in representational geometry between the human and artificial visual systems. This mismatch may have functional consequences: while high-dimensional brain representations support flexible generalization to abstract features, ANNs appear to lose this capacity in later layers, where their representations become overly compressed. Overall, our findings propose dimensionality alignment as a benchmark for building more flexible and biologically grounded vision models.
On the SAC-BL Algorithm for Anomaly Detection
Visual anomaly detection is significant in safety-critical and reliability-sensitive scenarios. Prior studies mainly emphasize the design and training of scoring functions, while little effort has been devoted to constructing decision rules based on these score functions. A recent work Ma et al. (2025b) highlights this issue and proposes the SAC-BL algorithm to address it. This method consists of a strong anomaly constraint (SAC) network and a betting-like (BL) algorithm serving as the decision rule. The SAC-BL algorithm can control the false discovery rate (FDR).
Neural Collapse is Globally Optimal in Deep Regularized ResNets and Transformers
The empirical emergence of neural collapse--a surprising symmetry in the feature representations of the training data in the penultimate layer of deep neural networks--has spurred a line of theoretical research aimed at its understanding. However, existing work focuses on data-agnostic models or, when data structure is taken into account, it remains limited to multi-layer perceptrons. Our paper fills both these gaps by analyzing modern architectures in a data-aware regime: we prove that global optima of deep regularized transformers and residual networks (ResNets) with LayerNorm trained with cross entropy or mean squared error loss are approximately collapsed, and the approximation gets tighter as the depth grows. More generally, we formally reduce any end-to-end large-depth ResNet or transformer training into an equivalent unconstrained features model, thus justifying its wide use in the literature even beyond data-agnostic settings. Our theoretical results are supported by experiments on computer vision and language datasets showing that, as the depth grows, neural collapse indeed becomes more prominent.
HelpSteer3-Preference: Open Human-Annotated Preference Data across Diverse Tasks and Languages
Preference datasets are essential for training general-domain, instruction-following language models with Reinforcement Learning from Human Feedback (RLHF). Each subsequent data release raises expectations for future data collection, meaning there is a constant need to advance the quality and diversity of openly available preference data. To address this need, we introduce HelpSteer3-Preference, a permissively licensed (CC-BY-4.0),
Accelerating Chain of Thought Reasoning through Semantically Aligned Implicit Tokens
Chain-of-Thought (CoT) enhances the performance of Large Language Models (LLMs) on reasoning tasks by encouraging step-by-step solutions. However, the verbosity of CoT reasoning hinders its mass deployment in efficiency-critical applications. Recently, implicit CoT approaches have emerged, which encode reasoning steps within LLM's hidden embeddings (termed "implicit reasoning") rather than explicit tokens. This approach accelerates CoT reasoning by reducing the reasoning length and bypassing some LLM components. However, existing implicit CoT methods face two significant challenges: (1) they fail to preserve the semantic alignment between the implicit reasoning (when transformed to natural language) and the ground-truth reasoning, resulting in a significant CoT performance degradation, and (2) they focus on reducing the length of the implicit reasoning; however, they neglect the considerable time cost for an LLM to generate one individual implicit reasoning token.
AgentAuditor: Human-Level Safety and Security Evaluation for LLMAgents
Despite the rapid advancement of LLM-based agents, the reliable evaluation of their safety and security remains a significant challenge. Existing rule-based or LLM-based evaluators often miss dangers in agents' step-by-step actions, overlook subtle meanings, fail to see how small issues compound, and get confused by unclear safety or security rules. To overcome this evaluation crisis, we introduce AgentAuditor, a universal, training-free, memory-augmented reasoning framework that empowers LLM evaluators to emulate human expert evaluators. AgentAuditor constructs an experiential memory by having an LLM adaptively extract structured semantic features (e.g., scenario, risk, behavior) and generate associated chain-of-thought reasoning traces for past interactions. A multi-stage, contextaware retrieval-augmented generation process then dynamically retrieves the most relevant reasoning experiences to guide the LLM evaluator's assessment of new cases. Moreover, we develop ASSEBench, the first benchmark designed to check how well LLM-based evaluators can spot both safety risks and security threats. ASSEBench comprises 2293 meticulously annotated interaction records, covering 15 risk types across 29 application scenarios. A key feature of ASSEBench is its nuanced approach to ambiguous risk situations, employing "Strict" and "Lenient" judgment standards. Experiments demonstrate that AgentAuditor not only consistently improves the evaluation performance of LLMs across all benchmarks but also sets a new state-of-the-art in LLM-as-a-judge for agent safety and security, achieving human-level accuracy.