ChunkKV: Semantic-Preserving Compression for Efficient Long-Context LLM Inference

Neural Information Processing Systems

Large Language Models (LLMs) require significant GPU memory when processing long texts, with the key-value (KV) cache consuming up to 70% of total memory during inference. Although existing compression methods reduce memory by evaluating the importance of individual tokens, they overlook critical semantic relationships between tokens, resulting in fragmented context and degraded performance. We introduce ChunkKV, which fundamentally reimagines KV cache compression by treating semantic chunks, rather than isolated tokens, as the basic compression units. This approach preserves complete linguistic structures and contextual integrity, ensuring that essential meaning is retained even under aggressive compression. Our innovation includes a novel layer-wise index reuse technique that exploits the higher cross-layer similarity of preserved indices in ChunkKV, reducing computational overhead and improving throughput by 26.5%. Comprehensive evaluations on challenging benchmarks (LongBench, Needle-In-A-HayStack, GSM8K, and JailbreakV) demonstrate that ChunkKV outperforms state-of-the-art methods by up to 8.7% in precision while maintaining the same compression ratio. These results confirm that semantic-aware compression significantly enhances both efficiency and performance for long-context LLM inference, providing a simple yet effective solution to the memory bottleneck problem. The code is available at link.
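To make the contrast between token-level and chunk-level selection concrete, here is a minimal sketch. It is not the ChunkKV implementation: the abstract does not say how chunks are formed or scored, so the fixed-size chunks, the summed importance score, and the function names are all assumptions made for illustration.

```python
def select_token_indices(scores, keep_tokens):
    """Baseline: keep the individually highest-scoring token positions."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:keep_tokens])


def select_chunk_indices(scores, chunk_size, keep_tokens):
    """Keep whole chunks of consecutive tokens instead of isolated tokens.

    Assumptions (not taken from the paper): chunks are fixed-size windows,
    and a chunk's score is the sum of its token importance scores.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    # Split positions 0..n-1 into consecutive windows of chunk_size tokens.
    chunks = [
        list(range(start, min(start + chunk_size, len(scores))))
        for start in range(0, len(scores), chunk_size)
    ]
    # Rank chunks by their summed importance, highest first.
    ranked = sorted(chunks, key=lambda c: sum(scores[i] for i in c), reverse=True)
    kept = []
    for chunk in ranked:
        # Stop once the next chunk would exceed the token budget.
        if len(kept) + len(chunk) > keep_tokens:
            break
        kept.extend(chunk)
    return sorted(kept)


scores = [0.1, 0.9, 0.8, 0.1, 0.1, 0.1, 0.7, 0.6]
token_level = select_token_indices(scores, keep_tokens=4)
chunk_level = select_chunk_indices(scores, chunk_size=2, keep_tokens=4)
```

With these toy scores, the token-level baseline keeps positions 1, 2, 6 and 7, which splits two neighbouring pairs, while the chunk-level variant keeps positions 0, 1, 6 and 7 as two intact chunks under the same budget.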


Spike4DGS: Towards High-Speed Dynamic Scene Reconstruction with 4D Gaussian Splatting via a Spike Camera Array

Neural Information Processing Systems

Spike cameras, with their high temporal resolution, offer a new perspective on high-speed dynamic scene rendering. Most existing rendering methods rely on Neural Radiance Fields (NeRF) or 3D Gaussian Splatting (3DGS) for static scenes using a monocular spike camera. However, these methods struggle with dynamic motion, while a single camera suffers from limited spatial coverage, making it challenging to reconstruct fine details in high-speed scenes. To address these problems, we propose Spike4DGS, the first high-speed dynamic scene rendering framework with 4D Gaussian Splatting using spike camera arrays. Technically, we first build a multi-view spike camera array to validate our solution, then establish both synthetic and real-world multi-view spike-based reconstruction datasets. Next, we design a multi-view spike-based dense initialization module that obtains dense point clouds and camera poses from continuous spike streams. Finally, we propose a spike-pixel synergy constraint supervision to optimize Spike4DGS, incorporating both a rendered image quality loss and a dynamic spatiotemporal spike loss. The results show that Spike4DGS outperforms state-of-the-art methods in terms of novel view rendering quality on both synthetic and real-world datasets. More details are available at the project page.


BlurGuard Approach for Image Protection Against AI-Powered Editing

Neural Information Processing Systems

Recent advances in text-to-image models have increased the exposure of powerful image editing techniques as a tool, raising concerns about their potential for malicious use. An emerging line of research to address such threats focuses on implanting ("protective") adversarial noise into images before their public release, so future attempts to edit them using text-to-image models can be impeded. However, subsequent works have shown that these adversarial noises are often easily "reversed," e.g., with techniques as simple as JPEG compression, casting doubt on the practicality of the approach. In this paper, we argue that adversarial noise for image protection should not only be imperceptible, as has been a primary focus of prior work, but also irreversible, viz., it should be difficult to detect as noise provided that the original image is hidden. We propose a surprisingly simple method to enhance the robustness of image protection methods against noise reversal techniques. Specifically, it applies an adaptive per-region Gaussian blur on the noise to adjust the overall frequency spectrum. Through extensive experiments, we show that our method consistently improves the per-sample worst-case protection performance of existing methods against a wide range of reversal techniques on diverse image editing scenarios, while also reducing quality degradation due to noise in terms of perceptual metrics.


Put your name aboard NASA's Nancy Grace Roman Space Telescope

Popular Science

The next generation space observatory is scheduled to launch in August. The NASA observatory was designed to settle essential questions in the areas of dark energy, exoplanets, and infrared astrophysics. Roman's barrel-like shape will help block out unwanted light from the sun, Earth, and moon, and the spacecraft's distant location will help keep the instruments cool.


TCL A65K Soundbar Review: Small Size, Big Sound

WIRED

Don't be fooled by the compact size of this soundbar. It's a solid option for smaller TVs or spaces without having to sacrifice sound quality. Acoustic music sounds loud and distinct. Some music sounds washed out and muddy. Living in a small space has some challenges, but poor cinematic sound doesn't need to be one of them.


RSafe: Incentivizing proactive reasoning to build robust and adaptive LLM safeguards

Neural Information Processing Systems

Large Language Models (LLMs) continue to exhibit vulnerabilities despite deliberate safety alignment efforts, posing significant risks to users and society. To safeguard against the risk of policy-violating content, system-level moderation via external guard models, which are designed to monitor LLM inputs and outputs and block potentially harmful content, has emerged as a prevalent mitigation strategy. Existing approaches to training guard models rely heavily on extensive human-curated datasets and struggle with out-of-distribution threats, such as emerging harmful categories or jailbreak attacks. To address these limitations, we propose RSafe, an adaptive reasoning-based safeguard that conducts guided safety reasoning to provide robust protection within the scope of specified safety policies. RSafe operates in two stages: (1) guided reasoning, where it analyzes safety risks of input content through policy-guided step-by-step reasoning, and (2) reinforced alignment, where rule-based RL optimizes its reasoning paths to align with accurate safety prediction.


AI Doesn't Feel. So Why Does It Have Something Like Emotions?

TIME - Tech



AR-RAG: Autoregressive Retrieval Augmentation for Image Generation

Neural Information Processing Systems

We introduce Autoregressive Retrieval Augmentation (AR-RAG), a novel paradigm that enhances image generation by autoregressively incorporating k-nearest neighbor retrievals at the patch level. Unlike prior methods that perform a single, static retrieval before generation and condition the entire generation on fixed reference images, AR-RAG performs context-aware retrievals at each generation step, using prior-generated patches as queries to retrieve and incorporate the most relevant patch-level visual references, enabling the model to respond to evolving generation needs while avoiding limitations (e.g., over-copying, stylistic bias, etc.) prevalent in existing methods. To realize AR-RAG, we propose two parallel frameworks: (1) Distribution-Augmentation in Decoding (DAiD), a training-free plug-and-use decoding strategy that directly merges the distribution of model-predicted patches with the distribution of retrieved patches, and (2) Feature-Augmentation in Decoding (FAiD), a parameter-efficient fine-tuning method that progressively smooths the features of retrieved patches via multi-scale convolution operations and leverages them to augment the image generation process.
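The idea of merging a model-predicted patch distribution with a retrieval-based one at a decoding step can be illustrated with a small sketch. The abstract does not give the merge rule, so everything here is an assumption: patches are treated as discrete codebook indices, retrieved neighbours are weighted by a softmax over negative distance, and the two distributions are combined by a simple convex mixture. The function name and parameters are hypothetical.

```python
import math


def merge_distributions(model_probs, retrieved, weight=0.5, temperature=1.0):
    """Mix a model's next-patch distribution with a retrieval distribution.

    model_probs: probabilities over codebook indices from the generator.
    retrieved:   list of (codebook_index, distance) pairs for the k nearest
                 retrieved patches (assumed representation).
    weight:      share of probability mass given to the retrieval side
                 (an assumed convex-mixture rule, not the paper's formula).
    """
    retrieval_scores = [0.0] * len(model_probs)
    for index, distance in retrieved:
        # Closer neighbours contribute more mass to their codebook index.
        retrieval_scores[index] += math.exp(-distance / temperature)
    total = sum(retrieval_scores)
    if total == 0.0:
        # Nothing retrieved: fall back to the model distribution unchanged.
        return list(model_probs)
    retrieval_probs = [score / total for score in retrieval_scores]
    return [
        (1.0 - weight) * m + weight * r
        for m, r in zip(model_probs, retrieval_probs)
    ]


model_probs = [0.7, 0.2, 0.1]
retrieved = [(1, 0.0), (1, 0.0)]  # both neighbours map to codebook index 1
merged = merge_distributions(model_probs, retrieved, weight=0.5)
```

In this toy case the retrieval side puts all its mass on index 1, so the merged distribution shifts from favouring index 0 to favouring index 1 while still summing to one.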


Top-H Decoding: Adapting the Creativity and Coherence with Bounded Entropy in Text Generation

Neural Information Processing Systems

Large language models (LLMs), despite their impressive performance across a wide range of tasks, often struggle to balance two competing objectives in open-ended text generation: fostering diversity and creativity while preserving logical coherence. Existing truncated sampling techniques, including temperature scaling, top-p (nucleus) sampling, and min-p sampling, aim to manage this trade-off.
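For readers unfamiliar with the baseline techniques named above, the following sketch shows temperature scaling, top-p (nucleus) filtering, and min-p filtering on a plain list of probabilities. It covers only these existing baselines, not Top-H itself, whose details are not given in the excerpt. The function names are illustrative, and the min-p rule follows the common definition of a threshold relative to the largest probability.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Temperature scaling: divide logits by T, then apply softmax."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def _renormalize(probs, kept):
    """Zero out tokens outside `kept` and rescale the rest to sum to one."""
    keep = set(kept)
    total = sum(probs[i] for i in keep)
    return [probs[i] / total if i in keep else 0.0 for i in range(len(probs))]


def top_p_filter(probs, p):
    """Keep the smallest set of most likely tokens whose mass reaches p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= p:
            break
    return _renormalize(probs, kept)


def min_p_filter(probs, min_p):
    """Keep tokens whose probability is at least min_p times the maximum."""
    threshold = min_p * max(probs)
    kept = [i for i, q in enumerate(probs) if q >= threshold]
    return _renormalize(probs, kept)


probs = [0.5, 0.3, 0.15, 0.05]
nucleus = top_p_filter(probs, p=0.75)
relative = min_p_filter(probs, min_p=0.2)
```

Here `top_p_filter` keeps the two most likely tokens, because their combined mass first reaches 0.75, while `min_p_filter` keeps the first three, because only the last token falls below 20% of the top probability.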