Technology
DGCBench: A Deep Graph Clustering Benchmark
Deep graph clustering (DGC) aims to partition graph nodes into distinct clusters in an unsupervised manner. Despite rapid advancements in this field, DGC remains inherently challenging due to the absence of ground-truth, which complicates the design of effective algorithms and impedes the establishment of standardized benchmarks. The lack of unified datasets, evaluation protocols, and metrics further exacerbates these challenges, making it difficult to systematically assess and compare DGC methods. To address these limitations, we introduce DGCBench, the first comprehensive and unified benchmark for DGC methods. It evaluates 12 state-of-the-art DGC methods across 12 datasets from diverse domains and scales, spanning 6 critical dimensions: discriminability, effectiveness, scalability, efficiency, stability, and robustness. Additionally, we develop PyDGC, an open-source Python library that standardizes the DGC training and evaluation paradigm. Through systematic experiments, we reveal persistent limitations in existing methods, specifically regarding the homophily bottleneck, training instability, vulnerability to perturbations, efficiency plateau, scalability challenges, and poor discriminability, thereby offering actionable insights for future research. We hope that DGCBench, PyDGC, and our analyses will collectively accelerate the progress in the DGC community.
Signal and Noise: A Framework for Reducing Uncertainty in Language Model Evaluation
Developing large language models is expensive and involves making decisions with small experiments, typically by evaluating on large, multi-task evaluation suites. In this work, we analyze specific properties which make a benchmark more reliable for such decisions, and interventions to design higher-quality evaluation benchmarks. We introduce two key metrics that show differences in current benchmarks: signal, a benchmark's ability to separate better models from worse models, and noise, a benchmark's sensitivity to random variability between training steps. We demonstrate that benchmarks with a better signal-to-noise ratio are more reliable when making decisions at small scale, and those with less noise have lower scaling law prediction error. These results suggest that improving signal or noise will lead to more useful benchmarks, so we introduce three interventions designed to directly affect signal or noise.
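As a rough illustration of the two metrics, the sketch below computes a signal-to-noise ratio from benchmark scores. The exact formulas are assumptions made for this example, not definitions taken from the abstract: signal is taken as the spread of final scores across models relative to their mean, and noise as the relative standard deviation of one model's scores over its last few checkpoints. The function names are hypothetical.

```python
from statistics import mean, pstdev


def signal(final_scores):
    """Assumed definition: spread of final scores across models, relative to their mean."""
    return (max(final_scores) - min(final_scores)) / mean(final_scores)


def noise(checkpoint_scores):
    """Assumed definition: relative std. dev. of one model's scores over its last checkpoints."""
    return pstdev(checkpoint_scores) / mean(checkpoint_scores)


def signal_to_noise(final_scores, checkpoint_scores):
    """Ratio of the two quantities above; infinite if the checkpoints show no variation."""
    n = noise(checkpoint_scores)
    return float("inf") if n == 0 else signal(final_scores) / n


# Three models' final accuracies, and one model's accuracy at its last three checkpoints.
snr = signal_to_noise([0.4, 0.5, 0.6], [0.49, 0.50, 0.51])
```

Under these assumed definitions, a benchmark whose scores separate models widely but fluctuate little between neighbouring checkpoints gets a high ratio.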
Grasp2Grasp: Vision-Based Dexterous Grasp Translation via Schrödinger Bridges
We propose a new approach to vision-based dexterous grasp translation, which aims to transfer grasp intent across robotic hands with differing morphologies. Given a visual observation of a source hand grasping an object, our goal is to synthesize a functionally equivalent grasp for a target hand without requiring paired demonstrations or hand-specific simulations.
UMU-Bench: Closing the Modality Gap in Multimodal Unlearning Evaluation
Although Multimodal Large Language Models (MLLMs) have advanced numerous fields, their training on extensive multimodal datasets introduces significant privacy concerns, prompting the necessity for effective unlearning methods. However, current multimodal unlearning approaches often directly adapt techniques from unimodal contexts, largely overlooking the critical issue of modality alignment, i.e., consistently removing knowledge across both unimodal and multimodal settings. To close this gap, we introduce UMU-Bench, a unified benchmark specifically targeting modality misalignment in multimodal unlearning. UMU-Bench consists of a meticulously curated dataset featuring 653 individual profiles, each described with both unimodal and multimodal knowledge. Additionally, novel tasks and evaluation metrics focusing on modality alignment are introduced, facilitating a comprehensive analysis of unimodal and multimodal unlearning effectiveness. Through extensive experimentation with state-of-the-art unlearning algorithms on UMU-Bench, we demonstrate prevalent modality misalignment issues in existing methods. These findings underscore the critical need for novel multimodal unlearning approaches explicitly considering modality alignment.
System-Embedded Diffusion Bridge Models
Solving inverse problems--recovering signals from incomplete or noisy measurements--is fundamental in science and engineering. Score-based generative models (SGMs) have recently emerged as a powerful framework for this task. Two main paradigms have formed: unsupervised approaches that adapt pretrained generative models to inverse problems, and supervised bridge methods that train stochastic processes conditioned on paired clean and corrupted data. While the former typically assume knowledge of the measurement model, the latter have largely overlooked this structural information. We introduce System-embedded Diffusion Bridge Models (SDBs), a new class of supervised bridge methods that explicitly embed the known linear measurement system into the coefficients of a matrix-valued SDE. This principled integration yields consistent improvements across diverse linear inverse problems and demonstrates robust generalization under system misspecification between training and deployment, offering a promising solution to real-world applications.
Variational Regularized Unbalanced Optimal Transport: Single Network, Least Action
Recovering the dynamics from a few snapshots of a high-dimensional system is a challenging task in statistical physics and machine learning, with important applications in computational biology. Many algorithms have been developed to tackle this problem, based on frameworks such as optimal transport and the Schrödinger bridge. A notable recent framework is Regularized Unbalanced Optimal Transport (RUOT), which integrates both stochastic dynamics and unnormalized distributions. However, since many existing methods do not explicitly enforce optimality conditions, their solutions often struggle to satisfy the principle of least action and have difficulty converging in a stable and reliable way. To address these issues, we propose Variational RUOT (Var-RUOT), a new framework to solve the RUOT problem. By incorporating the necessary optimality conditions for the RUOT problem into both the parameterization of the search space and the loss function design, Var-RUOT only needs to learn a scalar field to solve the RUOT problem and can search for solutions with lower action. We also examine the challenge of selecting a growth penalty function in the widely used Wasserstein-Fisher-Rao metric and propose a solution that better aligns with biological priors in Var-RUOT.
'Looked so real': How AI is being weaponised against India's Muslim women
The freelance model from India-administered Kashmir was scrolling on her phone last year when a friend sent her a clip circulating on Instagram. But it was entirely fabricated. "It was proper stalking," Ayoub, 24, said. "They had followed my life from my first semester to the last at the university." The video stitched together photographs from Ayoub's time as a student at New Delhi's Jamia Millia Islamia University - images drawn from everyday moments of campus life, including group projects, farewell gatherings and selfies with classmates.
SpecMER: Fast Protein Generation with K-mer Guided Speculative Decoding
Autoregressive models have transformed protein engineering by enabling the generation of novel protein sequences beyond those found in nature. However, their sequential inference introduces significant latency, limiting their utility in high-throughput protein screening. Speculative decoding accelerates generation by employing a lightweight draft model to sample tokens, which a larger target model then verifies and refines. Yet, in protein sequence generation, draft models are typically agnostic to the structural and functional constraints of the target protein, leading to biologically implausible outputs and a shift in the likelihood distribution of generated sequences. We introduce SpecMER (Speculative Decoding via k-mer Guidance), a novel framework that incorporates biological, structural, and functional priors using k-mer motifs extracted from multiple sequence alignments. By scoring candidate sequences in parallel and selecting those most consistent with known biological patterns, SpecMER significantly improves sequence plausibility while retaining the efficiency of speculative decoding. SpecMER achieves 24-32% speedup over standard autoregressive decoding, along with higher acceptance rates and improved sequence likelihoods.
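The k-mer guidance step can be pictured with a small sketch. It is a simplified illustration under stated assumptions, not SpecMER's actual scoring rule: motifs are collected as the set of k-mers in the ungapped rows of a multiple sequence alignment, each candidate draft is scored by the fraction of its k-mers found in that set, and the highest-scoring candidate is kept. All function names are hypothetical, and the draft and target models are omitted.

```python
def kmers(seq, k):
    """All overlapping substrings of length k."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def build_kmer_set(aligned_seqs, k):
    """Collect k-mers from alignment rows after dropping gap characters ('-')."""
    motifs = set()
    for row in aligned_seqs:
        motifs.update(kmers(row.replace("-", ""), k))
    return motifs


def kmer_score(candidate, motifs, k):
    """Assumed score: fraction of the candidate's k-mers that occur in the motif set."""
    cand = kmers(candidate, k)
    if not cand:
        return 0.0
    return sum(km in motifs for km in cand) / len(cand)


def select_best(candidates, motifs, k):
    """Keep the candidate most consistent with the alignment-derived motifs."""
    return max(candidates, key=lambda c: kmer_score(c, motifs, k))


# Toy alignment and three candidate drafts.
motifs = build_kmer_set(["MKT-AY", "MKTAAY"], k=3)
best = select_best(["MKGGY", "MKTGY", "MKTAY"], motifs, k=3)
```

In a full speculative decoding loop, the selected draft would still be passed to the target model for verification; the sketch covers only the candidate ranking.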
BUNDLEFLOW: Deep Menus for Combinatorial Auctions by Diffusion-Based Optimization
Differentiable economics--the use of deep learning for auction design--has driven progress in multi-item auction design with additive and unit-demand valuations. However, there has been little progress for combinatorial auctions (CAs), even in the simplest and yet important single-bidder case, due to exponential growth of the bundle space with the number of items. We address this challenge by introducing a deep network architecture for a menu-based CA, which supports the first dominant-strategy incentive compatible (DSIC), revenue-optimizing single-bidder CA. Our idea is to generate a bundle distribution through an ordinary differential equation (ODE) applied to a tractable initial distribution. Our method, BUNDLEFLOW, learns suitable ODE-based transforms, one for each menu element, to optimize expected revenue. BUNDLEFLOW achieves up to 2.23× higher revenue than baselines on standard CA testbeds and scales up to 500 items.
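To make the menu-based setting concrete, the sketch below shows how a single bidder interacts with a fixed menu. It is a simplification under stated assumptions: each menu element here is a deterministic bundle with a price, whereas the abstract describes bundle distributions generated by learned ODE transforms, which are not modeled. The names `choose`, `revenue`, and `value` are hypothetical.

```python
def choose(menu, value):
    """Return the (bundle, price) pair that maximizes value(bundle) - price.

    The empty bundle at price 0 is always available, so the bidder
    never ends up with negative utility.
    """
    options = [(frozenset(), 0.0)] + list(menu)
    return max(options, key=lambda element: value(element[0]) - element[1])


def revenue(menu, value):
    """The seller's revenue is the price of the element the bidder picks."""
    return choose(menu, value)[1]


def value(bundle):
    """Toy valuation with a complementarity: items are worth more together."""
    base = {"a": 1.0, "b": 2.0}
    bonus = 1.5 if {"a", "b"} <= bundle else 0.0
    return sum(base[item] for item in bundle) + bonus


menu = [(frozenset({"a"}), 0.8), (frozenset({"a", "b"}), 3.5)]
picked_bundle, picked_price = choose(menu, value)
```

Because the menu does not depend on anything the bidder reports, picking the utility-maximizing element is the bidder's best response, which is the sense in which a menu-based mechanism is DSIC.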
Figure, panel (a), 117 frames (default length): Dense Attention latency 1649s; Radial Attention (Ours) latency 876s (1.9× faster), PSNR 27.3.
Recent advances in diffusion models have enabled high-quality video generation, but the additional temporal dimension significantly increases computational costs, making training and inference on long videos prohibitively expensive. In this paper, we identify a phenomenon we term Spatiotemporal Energy Decay in video diffusion models: post-softmax attention scores diminish as spatial and temporal distance between tokens [...] over space and time in nature. [...] scalable sparse attention [...]