Bipolar Self-attention for Spiking Transformers
Harnessing the event-driven characteristic, Spiking Neural Networks (SNNs) present a promising avenue toward energy-efficient Transformer architectures. However, existing Spiking Transformers still suffer significant performance gaps compared to their Artificial Neural Network counterparts. Through comprehensive analysis, we attribute this gap to two factors. First, the binary nature of spike trains limits the capacity of Spiking Self-attention (SSA) to capture negative-negative and positive-negative membrane potential interactions between Queries and Keys. Second, SSA typically omits Softmax functions to avoid energy-intensive multiply-accumulate operations, thereby failing to maintain row-stochasticity constraints on attention scores.
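The two factors named above can be illustrated with a toy example. The sketch below is not the paper's method: the membrane potentials, the zero firing threshold, and the helper names (`spike`, `matmul_qkt`, `softmax`) are all assumptions chosen for illustration. It shows that binarizing Queries and Keys discards a strong negative-negative agreement, and that without Softmax the rows of the score matrix no longer sum to 1.

```python
import math

def softmax(row):
    """Numerically stable softmax over one row of attention scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def matmul_qkt(q, k):
    """Compute Q K^T for small list-of-lists matrices."""
    return [[sum(a * b for a, b in zip(qi, kj)) for kj in k] for qi in q]

def spike(x, threshold=0.0):
    """Binarize potentials: 1 where the value exceeds the threshold, else 0.
    The threshold value is an assumption for this sketch."""
    return [[1 if v > threshold else 0 for v in row] for row in x]

# Hypothetical signed membrane potentials for two queries and two keys.
q_potential = [[0.8, -0.5], [-0.7, -0.9]]
k_potential = [[0.6, -0.4], [-0.3, -0.8]]

# Real-valued scores see the agreement between the all-negative query and key.
real_scores = matmul_qkt(q_potential, k_potential)

# Binary spike scores only register positive-positive coincidences.
spike_scores = matmul_qkt(spike(q_potential), spike(k_potential))

# With softmax every row sums to 1; the raw spike scores do not.
softmax_rows = [softmax(r) for r in real_scores]
spike_row_sums = [sum(r) for r in spike_scores]
```

Here the second query and second key are both entirely negative, so their real-valued product is large and positive, yet the spike-based score for that pair is zero, and the spike score rows sum to 1 and 0 rather than to 1 each.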
RAM-W600: A Multi-Task Wrist Dataset and Benchmark for Rheumatoid Arthritis
Rheumatoid arthritis (RA) is a common autoimmune disease that has been the focus of research in computer-aided diagnosis (CAD) and disease monitoring. In clinical settings, conventional radiography (CR) is widely used for the screening and evaluation of RA due to its low cost and accessibility. The wrist is a critical region for the diagnosis of RA. However, CAD research in this area remains limited, primarily due to the challenges in acquiring high-quality instance-level annotations.
Neural-Driven Image Editing
Traditional image editing typically relies on manual prompting, making it labor-intensive and inaccessible to individuals with limited motor control or language abilities. Leveraging recent advances in brain-computer interfaces (BCIs) and generative models, we propose LoongX, a hands-free image editing approach driven by multimodal neurophysiological signals. LoongX utilizes state-of-the-art diffusion models trained on a comprehensive dataset of 23,928 image editing pairs, each paired with synchronized electroencephalography (EEG), functional near-infrared spectroscopy (fNIRS), photoplethysmography (PPG), and head motion signals that capture user intent. To effectively address the heterogeneity of these signals, LoongX integrates two key modules.
OpenAI is facing investigation from a group of state attorneys general
The company says it will 'engage constructively' with them. OpenAI is under investigation by a coalition of state attorneys general, according to the Wall Street Journal. On Friday, June 12, the company received a subpoena seeking information and documents related to its activities and impact on users. The Journal said it viewed the subpoena sent by New York's attorney general. Based on what the publication saw, the AGs are asking for documentation about the company's advertising, user engagement and retention, as well as its handling of its users' data and health information. They also want to know about the company's activities related to minor and senior users, its deep learning models, its policies and its models' sycophancy.
Efficient Training of Minimal and Maximal Low-Rank Recurrent Neural Networks
Low-rank recurrent neural networks (RNNs) provide a powerful framework for characterizing how neural systems solve complex cognitive tasks. However, fitting and interpreting these networks remains an important open problem. In this paper, we develop new methods for efficiently fitting low-rank RNNs in "teacher-training" settings. In particular, we build upon the neural engineering framework (NEF), in which RNNs are viewed as approximating an ordinary differential equation (ODE) of interest using a set of random nonlinear basis functions. This view provides geometric insight into how the choice of neural nonlinearity (e.g.
Here's How AI Agents Can Protect EV Chargers
An AI agent system proposed by researchers in Spain promises to prevent energy theft and damage to EV chargers, as well as the critical energy infrastructure that powers them. The number of electric vehicles on roads around the world continues to grow. The boom in EV adoption has driven the development of accessible, fast, and efficient charging infrastructure. However, this expansion also brings with it new cybersecurity risks that have not been widely studied, and for which there are still few viable solutions. Cristina Alcaraz, an infrastructure-security researcher at Spain's University of Malaga, explains that the liability of electric-vehicle charging stations is due to the fact that they integrate multiple physical and digital components.
Mamba Goes HoME: Hierarchical Soft Mixture-of-Experts for 3D Medical Image Segmentation
In recent years, artificial intelligence has significantly advanced medical image segmentation. Nonetheless, challenges remain, including efficient 3D medical image processing across diverse modalities and handling data variability. In this work, we introduce Hierarchical Soft Mixture-of-Experts (HoME), a two-level token-routing layer for efficient long-context modeling, specifically designed for 3D medical image segmentation. Built on the Mamba Selective State Space Model (SSM) backbone, HoME enhances sequential modeling through adaptive expert routing.
Real-Time Scene-Adaptive Tone Mapping for High-Dynamic Range Object Detection
High dynamic range (HDR) images, with their rich tone and detail reproduction, hold significant potential to enhance computer vision systems, particularly in autonomous driving. However, most neural networks for embedded vision are trained on low dynamic range (LDR) inputs and suffer substantial performance degradation when handling high-bit-depth HDR images due to the challenges posed by extreme dynamic ranges. In this paper, we propose a novel tone mapping method that not only bridges the gap between HDR RAW inputs and the LDR sRGB requirements of detection networks but also achieves end-to-end optimization with the downstream tasks. Instead of relying on a traditional image signal processing (ISP) pipeline, we introduce neural photometric calibration to regularize dynamic ranges and a scaling-invariant local tone mapping module to preserve image details. In addition, our architecture supports performance transfer finetuning, enabling efficient adaptation from the LDR model to the HDR RAW model with minimal cost. The proposed method outperforms traditional tone mapping algorithms and advanced AI-ISP methods in challenging automotive HDR scenes. Moreover, our pipeline achieves real-time processing of 4K high-bit-depth HDR inputs on the Nvidia Jetson platform.
MPCache: MPC-Friendly KV Cache Eviction for Efficient Private LLM Inference
Private large language model (LLM) inference based on secure multi-party computation (MPC) achieves formal data privacy protection but suffers from significant latency overhead, especially for long input sequences. While key-value (KV) cache eviction and sparse attention algorithms have been proposed for efficient LLM inference in plaintext, they are not designed for MPC and cannot benefit private LLM inference directly. In this paper, we propose an accurate and MPC-friendly KV cache eviction framework, dubbed MPCache, building on the observation that historical tokens in a long sequence may have different effects on the downstream decoding. Hence, MPCache combines a look-once static eviction algorithm to discard unimportant KV cache and a query-aware dynamic selection algorithm to activate only a small subset of KV cache for attention computation. MPCache further incorporates a series of optimizations for efficient dynamic KV cache selection, including MPC-friendly similarity approximation, hierarchical KV cache clustering, and cross-layer index-sharing strategy. Extensive experiments demonstrate that MPCache consistently outperforms prior-art KV cache eviction baselines across different generation tasks and achieves 1.8 ~ 2.01x and 3.39 ~ 8.37x decoding latency and communication reduction on different sequence lengths, respectively.
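The two-stage idea described above (a one-time static eviction followed by query-aware dynamic selection) can be sketched in plaintext as follows. This is only an illustration under stated assumptions, not MPCache itself: it runs no MPC protocol, the per-token `importance` scores are hypothetical inputs, dot-product similarity stands in for the paper's MPC-friendly similarity approximation, and the function names are invented for this sketch.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def static_evict(keys, values, importance, keep):
    """One-time eviction: keep the `keep` most important tokens, in original order.
    `importance` is a hypothetical per-token score supplied by the caller."""
    ranked = sorted(range(len(keys)), key=lambda i: importance[i], reverse=True)
    kept = sorted(ranked[:keep])
    return [keys[i] for i in kept], [values[i] for i in kept], kept

def dynamic_select(query, keys, top_k):
    """Per-query selection: indices of the `top_k` keys most similar to the query."""
    ranked = sorted(range(len(keys)), key=lambda i: dot(query, keys[i]), reverse=True)
    return sorted(ranked[:top_k])

def attend(query, keys, values, idx):
    """Softmax attention restricted to the selected cache entries."""
    scores = [dot(query, keys[i]) for i in idx]
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w / total * values[i][d] for w, i in zip(weights, idx)) for d in range(dim)]

# Toy cache of five tokens with 2-d keys and 1-d values (all values hypothetical).
keys = [[1, 0], [0, 1], [1, 1], [0, 0], [-1, 0]]
values = [[1.0], [2.0], [3.0], [4.0], [5.0]]
importance = [0.9, 0.1, 0.5, 0.05, 0.7]

kept_keys, kept_values, kept = static_evict(keys, values, importance, keep=3)
query = [1, 0]
chosen = dynamic_select(query, kept_keys, top_k=2)
output = attend(query, kept_keys, kept_values, chosen)
```

In this toy run the static stage keeps original tokens 0, 2, and 4, and the dynamic stage then attends over only the two kept entries most similar to the query, so attention is computed over two cache entries instead of five.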
Uncover Governing Law of Pathology Propagation Mechanism Through A Mean-Field Game
Alzheimer's disease (AD) is marked by cognitive decline along with the wide spread of tau aggregates across the brain cortex. Due to the challenges of imaging pathology spreading flows in vivo, however, quantitative analysis of the cortical pathways of tau propagation and its interaction with the cascade of amyloid-beta (Aβ) plaques lags behind the experimental insights into the underlying pathophysiological mechanisms. To address this challenge, we present a physics-informed neural network, empowered by mean-field theory, to uncover the biologically meaningful spreading pathways of tau aggregates between two longitudinal snapshots. Following the notion of a 'prion-like' mechanism in AD, we first formulate the dynamics of tau propagation as a mean-field game (MFG), where the spread of tau aggregate at each location (aka.