AVROBUSTBENCH: Benchmarking the Robustness of Audio-Visual Recognition Models at Test-Time

Sarthak Kumar Maharana, Saksham Singh Kushwaha, Baoming Zhang, Adrian Rodriguez, Songtao Wei, Yapeng Tian

Neural Information Processing Systems

AVROBUSTBENCH comprises four audio-visual benchmark datasets, AUDIOSET-2C, VGGSOUND-2C, KINETICS-2C, and EPICKITCHENS-2C, each incorporating 75 bimodal audio-visual corruptions that are co-occurring and correlated. Through extensive evaluations, we observe that state-of-the-art supervised and self-supervised audio-visual models exhibit declining robustness as corruption severity increases.


Differential Privacy for Euclidean Jordan Algebra with Applications to Private Symmetric Cone Programming

Neural Information Processing Systems

In this paper, we study differentially private mechanisms for functions whose outputs lie in a Euclidean Jordan algebra. Euclidean Jordan algebras capture many important mathematical structures and form the foundation of linear programming, second-order cone programming, and semidefinite programming. Our main contribution is a generic Gaussian mechanism for such functions, with sensitivity measured in ℓ2, ℓ1, and ℓ∞ norms. Notably, this framework includes the important case where the function outputs are symmetric matrices, and sensitivity is measured in the Frobenius, nuclear, or spectral norm. We further derive private algorithms for solving symmetric cone programs under various settings, using a combination of the multiplicative weights update method and our generic Gaussian mechanism. As an application, we present differentially private algorithms for semidefinite programming, resolving a major open question posed by [Hsu, Roth, Roughgarden, and Ullman, ICALP 2014].
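For orientation, the symmetric-matrix case with Frobenius-norm sensitivity can be illustrated with the classical Gaussian mechanism. The sketch below is not the paper's generic Jordan-algebra mechanism. It assumes the textbook noise calibration sigma = Δ · sqrt(2 ln(1.25/δ)) / ε, which is valid for 0 < ε < 1, and the function names `gaussian_sigma` and `private_symmetric_matrix` are hypothetical. Noise is drawn independently for the upper triangle and mirrored, so the released matrix stays symmetric.

```python
import math
import numpy as np


def gaussian_sigma(l2_sensitivity, epsilon, delta):
    """Noise scale of the classical (epsilon, delta) Gaussian mechanism.

    Assumption: textbook calibration, valid only for 0 < epsilon < 1.
    """
    if not (0.0 < epsilon < 1.0) or not (0.0 < delta < 1.0):
        raise ValueError("requires 0 < epsilon < 1 and 0 < delta < 1")
    return l2_sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon


def private_symmetric_matrix(matrix, frobenius_sensitivity, epsilon, delta, rng=None):
    """Release a symmetric matrix with symmetric Gaussian noise (illustrative)."""
    matrix = np.asarray(matrix, dtype=float)
    if matrix.ndim != 2 or matrix.shape[0] != matrix.shape[1]:
        raise ValueError("matrix must be square")
    if not np.allclose(matrix, matrix.T):
        raise ValueError("matrix must be symmetric")
    rng = np.random.default_rng() if rng is None else rng

    sigma = gaussian_sigma(frobenius_sensitivity, epsilon, delta)
    n = matrix.shape[0]
    noise = rng.normal(0.0, sigma, size=(n, n))
    # Keep the upper triangle (with diagonal) and mirror the strict upper part.
    # The upper-triangle entries determine the matrix, and their l2 distance
    # between neighbouring outputs is at most the Frobenius distance.
    symmetric_noise = np.triu(noise) + np.triu(noise, k=1).T
    return matrix + symmetric_noise


if __name__ == "__main__":
    covariance = np.array([[2.0, 0.5], [0.5, 1.0]])
    released = private_symmetric_matrix(
        covariance, frobenius_sensitivity=1.0, epsilon=0.5, delta=1e-5,
        rng=np.random.default_rng(0),
    )
    print(released)
```

Mirroring the upper triangle rather than symmetrizing a full noise matrix keeps each released entry at variance sigma squared, which makes the privacy argument a direct application of the vector Gaussian mechanism followed by post-processing.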


ComfyMind: Toward General-Purpose Generation via Tree-Based Planning and Reactive Feedback

Neural Information Processing Systems

With the rapid advancement of generative models, general-purpose generation has gained increasing attention as a promising approach to unify diverse tasks across modalities within a single system. Despite this progress, existing open-source frameworks often remain fragile and struggle to support complex real-world applications due to the lack of structured workflow planning and execution-level feedback. To address these limitations, we present ComfyMind, a collaborative AI system designed to enable robust and scalable general-purpose generation, built on the ComfyUI platform.


OS-HARM: A Benchmark for Measuring Safety of Computer Use Agents

Neural Information Processing Systems

Computer use agents are LLM-based agents that can directly interact with a graphical user interface, by processing screenshots or accessibility trees. While these systems are gaining popularity, their safety has been largely overlooked, despite the fact that evaluating and understanding their potential for harmful behavior is essential for widespread adoption. To address this gap, we introduce OS-HARM, a new benchmark for measuring safety of computer use agents. OS-HARM is built on top of the OSWorld environment (Xie et al., 2024) and aims to test models across three categories of harm: deliberate user misuse, prompt injection attacks, and model misbehavior.


PointMAC: Meta-Learned Adaptation for Robust Test-Time Point Cloud Completion

Neural Information Processing Systems

Point cloud completion is essential for robust 3D perception in safety-critical applications such as robotics and augmented reality. However, existing models perform static inference and rely heavily on inductive biases learned during training, limiting their ability to adapt to novel structural patterns and sensor-induced distortions at test time. To address this limitation, we propose PointMAC, a meta-learned framework for robust test-time adaptation in point cloud completion. It enables sample-specific refinement without requiring additional supervision. Our method optimizes the completion model under two self-supervised auxiliary objectives that simulate structural and sensor-level incompleteness.


Understanding Generalization in Physics Informed Models through Affine Variety Dimensions

Neural Information Processing Systems

Physics-informed machine learning is gaining significant traction for enhancing statistical performance and sample efficiency through the integration of physical knowledge. However, current theoretical analyses often presume complete prior knowledge in non-hybrid settings, overlooking the crucial integration of observational data, and are frequently limited to linear systems, unlike the prevalent nonlinear nature of many real-world applications. To address these limitations, we introduce a unified residual form that unifies collocation and variational methods, enabling the incorporation of incomplete and complex physical constraints in hybrid learning settings. Within this formulation, we establish that the generalization performance of physics-informed regression in such hybrid settings is governed by the dimension of the affine variety associated with the physical constraint, rather than by the number of parameters. This enables a unified analysis that is applicable to both linear and nonlinear equations. We also present a method to approximate this dimension and provide experimental validation of our theoretical findings.


Fixed-Point RNNs: Interpolating from Diagonal to Dense

Neural Information Processing Systems

Linear recurrent neural networks (RNNs) and state-space models (SSMs) such as Mamba have become promising alternatives to softmax-attention as sequence mixing layers in Transformer architectures. Current models, however, do not exhibit the full state-tracking expressivity of RNNs because they rely on channel-wise (i.e.


RepGuard: Adaptive Feature Decoupling for Robust Backdoor Defense in Large Language Models

Neural Information Processing Systems

Backdoor attacks pose a significant threat to large language models (LLMs) by embedding malicious triggers that manipulate model behavior. However, existing defenses primarily rely on prior knowledge of backdoor triggers or targets and offer only superficial mitigation strategies, thus struggling to fundamentally address the inherent reliance on unreliable features. To address these limitations, we propose a novel defense strategy, RepGuard, that strengthens LLM resilience by adaptively separating abnormal features from useful semantic representations, rendering the defense agnostic to specific trigger patterns. Specifically, we first introduce a dual-perspective feature localization strategy that integrates local consistency and sample-wise deviation metrics to identify suspicious backdoor patterns. Based on this identification, an adaptive mask generation mechanism is applied to isolate backdoor-targeted shortcut features by decomposing hidden representations into independent spaces, while preserving task-relevant semantics.


Resolution of Simpson's paradox via the common cause principle

Neural Information Processing Systems

Simpson's paradox poses a challenge in probabilistic inference and decision-making. Our study revisits the paradox by re-estimating its frequency with an unbiased data generation process and reaffirms that it is not an artifact of deficient data collection. Thus, it can lead to incorrect recommendations in fields as diverse as statistics, psychology, and artificial intelligence. We show that the paradox can be resolved by assuming a minimal -- though not necessarily observed -- common cause (or screening) variable for the involved random variables. In our approach, conditioning on this minimal common cause establishes the correct association between events, which coincides with the conditioning (i.e., fine-grained) option of the original Simpson paradox. This resolution applies to both discrete cases of binary variables and continuous settings modeled by Gaussian variables. For a non-minimal common cause, the resolution of the paradox is possible, but detailed knowledge of the common cause is required. Our findings extend traditional understandings of the paradox and offer practical guidance for resolving apparent contradictions in probabilistic inference, ultimately enhancing decision-making processes. This point is illustrated by several examples.
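To make the reversal concrete, the following sketch checks the paradox on the widely cited kidney-stone treatment counts. These are a standard textbook illustration, not data from the paper. It only contrasts the aggregated comparison with the fine-grained (conditioned) one and does not implement the paper's common-cause construction. The names `counts`, `rate`, and `simpson_reversal` are hypothetical.

```python
from fractions import Fraction

# (treatment, stone size) -> (successes, patients); classic textbook counts,
# used here purely as an illustration.
counts = {
    ("A", "small"): (81, 87),
    ("A", "large"): (192, 263),
    ("B", "small"): (234, 270),
    ("B", "large"): (55, 80),
}


def rate(treatment, stratum=None):
    """Success rate of a treatment, overall or within one stratum."""
    successes = 0
    patients = 0
    for (t, s), (k, n) in counts.items():
        if t == treatment and (stratum is None or s == stratum):
            successes += k
            patients += n
    return Fraction(successes, patients)


def simpson_reversal():
    """True if A wins in every stratum but loses in the aggregate."""
    a_wins_each_stratum = all(
        rate("A", s) > rate("B", s) for s in ("small", "large")
    )
    b_wins_overall = rate("A") < rate("B")
    return a_wins_each_stratum and b_wins_overall


if __name__ == "__main__":
    for s in ("small", "large"):
        print(s, float(rate("A", s)), float(rate("B", s)))
    print("overall", float(rate("A")), float(rate("B")))
    print("reversal:", simpson_reversal())
```

Here stone size plays the role of the conditioning variable: the within-stratum comparison favours treatment A, while pooling the strata favours treatment B.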