Genre
Vision Foundation Models as Effective Visual for Generation
In this work, we present a novel direction to build an image tokenizer directly on top of a frozen vision foundation model, which is a largely underexplored area. Specifically, we employ a frozen vision foundation model as the encoder of our tokenizer. To enhance its effectiveness, we introduce two key components: (1) a region-adaptive quantization framework that reduces redundancy in the pre-trained features on regular 2D grids, and (2) a semantic reconstruction objective that aligns the tokenizer's outputs with the foundation model's representations to preserve semantic fidelity. Based on these designs, our proposed image tokenizer, VFMTok, achieves substantial improvements in image reconstruction and generation quality, while also enhancing token efficiency. It further boosts autoregressive (AR) generation--achieving a gFID of 1.36 on ImageNet benchmarks, while accelerating model convergence by three times, and enabling high-fidelity class-conditional synthesis without the need for classifier-free guidance (CFG). The code is available at https://github.com/CVMI-Lab/VFMTok.
FADRM: Fast and Accurate Data Residual Matching for Dataset Distillation
Residual connection has been extensively studied and widely applied at the model architecture level. However, its potential in the more challenging data-centric approaches remains unexplored. In this work, we introduce the concept of Data Residual Matching for the first time, leveraging data-level skip connections to facilitate data generation and mitigate data information vanishing. This approach maintains a balance between newly acquired knowledge through pixel space optimization and existing core local information identification within raw data modalities, specifically for the dataset distillation task. Furthermore, by incorporating training-time refinements, our method significantly improves computational efficiency, achieving superior performance while reducing training time and peak GPU memory usage by 50%. Consequently, the proposed method Fast and Accurate Data Residual Matching for Dataset Distillation (FADRM) establishes a new stateof-the-art, demonstrating substantial improvements over existing methods across multiple dataset benchmarks in both efficiency and effectiveness. For instance, with ResNet-18 as the student model and a 0.8% compression ratio on ImageNet-1K, the method achieves 48.4% test accuracy in single-model dataset distillation and 50.9% in multi-model dataset distillation, surpassing RDED by +6.4% and outperforming
EquiTabPFN: ATarget-Permutation Equivariant Prior Fitted Network
Recent foundational models for tabular data, such as TabPFN, excel at adapting to new tasks via in-context learning, but remain constrained to a fixed, pre-defined number of target dimensions--often necessitating costly ensembling strategies. We trace this constraint to a deeper architectural shortcoming: these models lack target equivariance, so that permuting target dimension orderings alters their predictions. This deficiency gives rise to an irreducible "equivariance gap," an error term that introduces instability in predictions. We eliminate this gap by designing a fully target-equivariant architecture--ensuring permutation invariance via equivariant encoders, decoders, and a bi-attention mechanism. Empirical evaluation on standard classification benchmarks shows that, on datasets with more classes than those seen during pre-training, our model matches or surpasses existing methods while incurring lower computational overhead.
Millions told to prepare NOW as Tropical Storm Warning is issued along US coast: 'Arthur is coming'
'Ringleader' of alleged UFC drone attack to kill Trump is unmasked as illegal migrant who was granted DACA stay under Obama Spy world panic as Tulsi Gabbard prepares to unleash bombshell file dumps on secret CIA'mind control' project and Dr. Fauci Watch horrifying drone video that follows woman's plunge to death after bungee team threw her from bridge without rope Horrific new videos blow Texas woman's mystery death wide open: Her agonizing'final gasp'... unthinkably vile corpse claims... and sick past of man who saw her last Taylor Swift's bottomless thirst for attention, her greed and sheer tackiness are now truly unbearable... this latest stunt has shown her true colors: MAUREEN CALLAHAN Kanye West's wife Bianca Censori raises eyebrows in plunging white lace lingerie as she photographs a nude model at Art Basel in Switzerland Trump says'fools who think I haven't been tough enough on Iran' are'jealous or stupid' after signing widely-criticised deal that includes giving Tehran $300billion Father keeps his cool as shouting man calls cops on him for taking his two young daughters into women's restroom All my friends are suddenly getting divorced. Mid-life wives share taboo sex confessions about why they really leave... including common position that made one hate her husband: JANA HOCKING Sensational REAL reason Jelly Roll is divorcing Bunnie XO: Insiders reveal'preacher's wife' bombshell that's the talk of Nashville... truth about legendary rocker cuckolding rumor... and G-string mishap Brooklyn Beckham is savaged by fans for yet another'classless' swipe at his estranged family as new DoorDash ad is branded a'giant PR mess' LIZ JONES: The cracks in Harry and Meghan's perfect facade have started to show. It's so obvious he's tiring of her tone-deaf approach... and I predict there's serious trouble in store Every emotional moment from the Gilgo Beach killer's sentencing: Rex Heuermann's shocking first words... and the chilling exchange that silenced the room Tropical Storm Arthur has formed in the Gulf, becoming the first named storm of the 2026 Atlantic hurricane season. The National Hurricane Center (NHC) announced Wednesday morning that Arthur had strengthened into a tropical storm with maximum sustained winds of 40mph. The storm was located about 40 miles northeast of Port O'Connor, Texas, and about 190 miles west-southwest of Lake Charles, Louisiana .
Normal-Abnormal Guided Generalist Anomaly Detection
Generalist Anomaly Detection (GAD) aims to train a unified model on an original domain that can detect anomalies in new target domains. Previous GAD methods primarily use only normal samples as references, overlooking the valuable information contained in anomalous samples that are often available in real-world scenarios. To address this limitation, we propose a more practical approach: normal-abnormalguided generalist anomaly detection, which leverages both normal and anomalous samples as references to guide anomaly detection across diverse domains. We introduce the Normal-Abnormal Generalist Learning (NAGL) framework, consisting of two key components: Residual Mining (RM) and Anomaly Feature Learning (AFL). RM extracts abnormal patterns from normal-abnormal reference residuals to establish transferable anomaly representations, while AFL adaptively learns anomaly features in query images through residual mapping to identify instance-aware anomalies.
Non-Stationary Structural Causal Bandits
We study the problem of sequential decision-making in environments governed by evolving causal mechanisms. Prior work on structural causal bandits--formulations that integrate causal graphs into multi-armed bandit problems to guide intervention selection--has shown that leveraging the causal structure can reduce unnecessary interventions by identifying possibly-optimal minimal intervention sets (POMISs). However, such formulations fall short in dynamic settings where reward distributions may vary over time, due to their static--and thus myopic--nature focuses on immediate rewards and overlooks the long-term effects of interventions. In this work, we propose a non-stationary structural causal bandit framework that leverages temporal structural causal models to capture evolving dynamics over time. We characterize how interventions propagate over time by developing graphical tools and assumptions, which form the basis for identifying non-myopic intervention strategies. Within this framework, we devise POMIS+, which captures the existence of variables that contribute to maximizing both immediate and long-term rewards. Our framework provides a principled way to reason about temporally-aware interventions by explicitly modeling information propagation across time. Empirical results validate the effectiveness of our approach, demonstrating improved performance over myopic baselines.
Logic.py: Bridging the Gap between LLMs and Constraint Solvers
We present a novel approach to formalise and solve search-based problems using large language models, which significantly improves upon previous state-of-theart results. We demonstrate the efficacy of this approach on benchmarks like the logic puzzles tasks in ZebraLogicBench. Instead of letting the LLM attempt to directly solve the puzzles, our method prompts the model to formalise the problem in a logic-focused, human-readable, domain-specific language (DSL) called Logic.py. This formalised representation is then solved using a constraint solver, leveraging the strengths of both the language model and the solver. Our approach achieves a remarkable 65% absolute improvement over the baseline performance of Llama 3.1 70B on ZebraLogicBench, increasing its accuracy to over 90%. This significant advancement demonstrates the potential of combining language models with domain-specific languages and auxiliary tools on traditionally challenging tasks for LLMs.
Thousand Voices of Trauma: ALarge-Scale Synthetic Dataset for Modeling Prolonged Exposure Therapy Conversations
The advancement of AI systems for mental health support is hindered by limited access to therapeutic conversation data, particularly for trauma treatment. We present Thousand Voices of Trauma, a synthetic benchmark dataset of 3,000 therapy conversations based on Prolonged Exposure therapy protocols for Post-traumatic Stress Disorder (PTSD). The dataset comprises 500 unique cases, each explored through six conversational perspectives that mirror the progression of therapy from initial anxiety to peak distress to emotional processing. We incorporated diverse demographic profiles (ages 18-80, M=49.3, 49.4% male, 44.4% female, 6.2% nonbinary), 20 trauma types, and 10 trauma-related behaviors using deterministic and probabilistic generation methods. Analysis reveals realistic distributions of trauma types (witnessing violence 10.6%, bullying 10.2%) and symptoms (nightmares 23.4%, substance abuse 20.8%).
KungfuBot: Physics-Based Humanoid Whole-Body Control for Learning Highly-Dynamic Skills
Humanoid robots are promising to acquire various skills by imitating human behaviors. However, existing algorithms are only capable of tracking smooth, low-speed human motions, even with delicate reward and curriculum design. This paper presents a physics-based humanoid control framework, aiming to master highly-dynamic human behaviors such as Kungfu and dancing through multisteps motion processing and adaptive motion tracking. For motion processing, we design a pipeline to extract, filter out, correct, and retarget motions, while ensuring compliance with physical constraints to the maximum extent. For motion imitation, we formulate a bi-level optimization problem to dynamically adjust the tracking accuracy tolerance based on the current tracking error, creating an adaptive curriculum mechanism. We further construct an asymmetric actor-critic framework for policy training. In experiments, we train whole-body control policies to imitate a set of highly-dynamic motions. Our method achieves significantly lower tracking errors than existing approaches and is successfully deployed on the Unitree G1 robot, demonstrating stable and expressive behaviors. The project page is https://kungfu-bot.github.io.