TokenSqueeze: Performance-Preserving Compression for Reasoning LLMs

Neural Information Processing Systems

Emerging reasoning LLMs such as OpenAI-o1 and DeepSeek-R1 have achieved strong performance on complex reasoning tasks by generating long chain-of-thought (CoT) traces. However, these long CoTs result in increased token usage, leading to higher inference latency and memory consumption. As a result, balancing accuracy and reasoning efficiency has become essential for deploying reasoning LLMs in practical applications. Existing long-to-short (Long2Short) methods aim to reduce inference length but often sacrifice accuracy, revealing a need for an approach that maintains performance while lowering token costs. To address this efficiency-accuracy tradeoff, we propose TokenSqueeze, a novel Long2Short method that condenses reasoning paths while preserving performance and relying exclusively on self-generated data. First, to prevent performance degradation caused by excessive compression of reasoning depth, we propose to select self-generated samples whose reasoning depth is adaptively matched to the complexity of the problem. To further optimize the linguistic expression without altering the underlying reasoning paths, we introduce a distribution-aligned linguistic refinement method that enhances the clarity and conciseness of the reasoning path while preserving its logical integrity. Comprehensive experimental results demonstrate the effectiveness of TokenSqueeze in reducing token usage while maintaining accuracy. Notably, DeepSeek-R1-Distill-Qwen-7B fine-tuned with our proposed method achieved a 50% average token reduction while preserving accuracy on the MATH500 benchmark.


AdaSTaR: Adaptive Data Sampling for Training Self-Taught Reasoners

Neural Information Processing Systems

Self-Taught Reasoners (STaR), synonymously known as Rejection sampling Fine-Tuning (RFT), is an integral part of the training pipeline of self-improving reasoning Language Models (LMs). The self-improving mechanism often employs random observation (data) sampling. However, this results in an imbalance across trained observations: the model is inefficiently over-trained on solved examples and under-trained on challenging ones.


Inductive Domain Transfer In Misspecified Simulation-Based Inference

Neural Information Processing Systems

Simulation-based inference (SBI) of latent parameters in physical systems is often hindered by model misspecification, the mismatch between simulated and real-world observations caused by inherent modeling simplifications. RoPE, a recent SBI approach, addresses this challenge through a two-stage domain transfer process that combines semi-supervised calibration with optimal transport (OT)-based distribution alignment. However, RoPE operates in a fully transductive setting, requiring access to a batch of test samples at inference time, which limits scalability and generalization. We propose a fully inductive and amortized SBI framework that integrates calibration and distributional alignment into a single, end-to-end trainable model called FRISBI. Our method leverages mini-batch OT with a closed-form coupling to align real and simulated observations that correspond to the same latent parameters, using both paired calibration data and unpaired samples. A conditional normalizing flow is then trained to approximate the OT-induced posterior, enabling efficient inference without simulation access at test time. Across a range of synthetic and real-world benchmarks, including complex medical biomarker estimation, our approach matches or exceeds the performance of RoPE, while offering improved scalability and applicability in challenging, misspecified environments.


Cloud4D: Estimating Cloud Properties at a High Spatial and Temporal Resolution

Neural Information Processing Systems

There has been great progress in improving numerical weather prediction and climate models using machine learning. However, most global models operate at kilometer scale, making it challenging to model individual clouds and factors such as extreme precipitation, wind gusts, turbulence, and surface irradiance. Therefore, there is a need to move towards higher-resolution models, which in turn require high-resolution real-world observations that current instruments struggle to obtain. We present Cloud4D, the first learning-based framework that reconstructs a physically consistent, four-dimensional cloud state using only synchronized ground-based cameras.


High Dynamic Range Imaging with Time-Encoding Spike Camera

Neural Information Processing Systems

As a bio-inspired vision sensor, the spike camera records light intensity by accumulating photons and firing a spike once a preset threshold is reached. In high-light regions, the accumulated photons may reach the threshold multiple times within a readout interval, but only one spike can be stored and read out, resulting in an incorrect intensity representation and a limited dynamic range. The multi-level (ML) spike camera enhances the dynamic range by introducing a spike-firing counter (SFC) that counts spikes within each readout interval for each pixel, and uses different spike symbols to represent the arrival of different amounts of photons. However, when the light intensity becomes even higher, each pixel requires an SFC with a higher bit depth, which greatly increases manufacturing cost. To address these issues, we propose the time-encoding (TE) spike camera, which replaces the counting of spikes with recording the time at which a specific number of spikes (i.e., an overflow) is reached.
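The saturation problem and the time-encoding idea can be illustrated with a toy single-pixel model. This is only a sketch under simplifying assumptions that are not taken from the paper: light intensity is constant within a readout interval, photon accumulation is noise-free, and leftover charge at the end of an interval is ignored. All function and parameter names are hypothetical.

```python
def spikes_in_interval(intensity, threshold, interval):
    """Threshold crossings in one readout interval (constant intensity assumed)."""
    return int((intensity * interval) // threshold)


def binary_readout(intensity, threshold, interval):
    """Plain spike camera: at most one spike is stored per readout interval."""
    return min(1, spikes_in_interval(intensity, threshold, interval))


def multilevel_readout(intensity, threshold, interval, bits):
    """Multi-level camera: a counter of `bits` bits saturates at 2**bits - 1."""
    return min(spikes_in_interval(intensity, threshold, interval), 2**bits - 1)


def time_encoded_readout(intensity, threshold, interval, overflow_count):
    """Time at which `overflow_count` spikes are reached, or None if the
    overflow does not occur within the readout interval."""
    if intensity <= 0:
        return None
    t = overflow_count * threshold / intensity
    return t if t <= interval else None


def intensity_from_time(t, threshold, overflow_count):
    """Invert the overflow time back into an intensity estimate."""
    return overflow_count * threshold / t


if __name__ == "__main__":
    for intensity in (2.0, 8.0, 128.0):
        t = time_encoded_readout(intensity, 1.0, 1.0, overflow_count=4)
        print(
            intensity,
            binary_readout(intensity, 1.0, 1.0),
            multilevel_readout(intensity, 1.0, 1.0, bits=2),
            t,
        )
```

In this toy model the binary and 2-bit readouts return the same value for intensities 8 and 128, while the recorded overflow time still distinguishes them. When the overflow is not reached within the interval (intensity 2 here), the time-encoded readout returns None and a count-based readout would be needed instead.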


Whoop Promo Codes: 20% Off This June 2026

WIRED

Whether you're looking for a Whoop free trial, student discount, or military savings, our guide to Whoop promo codes will help you maximize your membership benefits. Whoop's bracelet-style trackers deliver exhaustive activity tracking and biometric data compared to standard fitness trackers. The Whoop band is also an excellent tool for monitoring sleep and overall health. Our WIRED testers have been reviewing Whoop trackers since the Whoop 3.0, and have watched as the product has evolved into an AI-enabled personalized service. But for most people, the Whoop may feel like an overinvestment, which is why Whoop is particularly popular among elite athletes.


Situat3DChange: Situated 3D Change Understanding Dataset for Multimodal Large Language Model

Neural Information Processing Systems

Physical environments and circumstances are fundamentally dynamic, yet current 3D datasets and evaluation benchmarks tend to concentrate on either dynamic scenarios or dynamic situations in isolation, resulting in incomplete comprehension. To overcome these constraints, we introduce Situat3DChange, an extensive dataset supporting three situation-aware change understanding tasks following the perception-action model: 121K question-answer pairs, 36K change descriptions for perception tasks, and 17K rearrangement instructions for the action task. To construct this large-scale dataset, Situat3DChange leverages 11K human observations of environmental changes to establish shared mental models and shared situational awareness for human-AI collaboration. These observations, enriched with egocentric and allocentric perspectives as well as categorical and coordinate spatial relations, are integrated using an LLM to support understanding of situated changes. To address the challenge of comparing pairs of point clouds from the same scene with minor changes, we propose SCReasoner, an efficient 3D MLLM approach that enables effective point cloud comparison with minimal parameter overhead and no additional tokens required for the language decoder. Comprehensive evaluation on Situat3DChange tasks highlights both the progress and limitations of MLLMs in dynamic scene and situation understanding. Additional experiments on data scaling and cross-domain transfer demonstrate the task-agnostic effectiveness of using Situat3DChange as a training dataset for MLLMs.


Projection-based Lyapunov method for fully heterogeneous weakly-coupled MDPs

Neural Information Processing Systems

Heterogeneity poses a fundamental challenge for many real-world large-scale decision-making problems but remains largely understudied. In this paper, we study the fully heterogeneous setting of a prominent class of such problems, known as weakly-coupled Markov decision processes (WCMDPs). Each WCMDP consists of N arms (or subproblems), which have distinct model parameters in the fully heterogeneous setting, leading to the curse of dimensionality when N is large. We show that, under mild assumptions, an efficiently computable policy achieves an O(1/√N) optimality gap in the long-run average reward per arm for fully heterogeneous WCMDPs as N becomes large. This is the first asymptotic optimality result for fully heterogeneous average-reward WCMDPs. Our main technical innovation is the construction of projection-based Lyapunov functions that certify the convergence of rewards and costs to an optimal region, even under full heterogeneity.


Ukrainian drone-makers target Asia as Taiwan tensions spur demand

The Japan Times

Ukraine has developed a reputation as a master of drone warfare, which has helped an otherwise-outgunned Kyiv hold out for more than four years against Russia.


ExPO: Unlocking Hard Reasoning with Self-Explanation-Guided Reinforcement Learning

Neural Information Processing Systems

Self-improvement via RL often fails on complex reasoning tasks because GRPO-style post-training methods rely on the model's initial ability to generate positive samples. Without guided exploration, these approaches merely reinforce what the model already knows (distribution-sharpening) rather than enabling the model to solve problems where it initially generates no correct solutions. To unlock reasoning ability in such settings, the model must explore new reasoning trajectories beyond its current output distribution. Such exploration requires access to sufficiently good positive samples to guide the learning. While expert demonstrations seem like a natural solution, we find that they are often ineffective in RL post-training.