Goto

Collaborating Authors

 right singular vector


What Makes and Breaks Safety Fine tuning A Mechanistic Study

Neural Information Processing Systems

Safety fine-tuning helps align Large Language Models (LLMs) with human preferences for their safe deployment. To better understand the underlying factors that make models safe via safety fine-tuning, we design a synthetic data generation framework that captures salient aspects of an unsafe input by modeling the interaction between the task the model is asked to perform (e.g., "design") versus the specific concepts the task is asked to be performed upon (e.g., a "cycle" vs. a "bomb").




Don't take it lightly: Phasing optical random projections with unknown operators

Sidharth Gupta, Remi Gribonval, Laurent Daudet, Ivan Dokmanić

Neural Information Processing Systems

In this paper we tackle the problem of recovering the phase of complex linear measurements whenonlymagnitude information isavailableandwecontrol the input. We are motivated by the recent development of dedicated optics-based hardware for rapid random projections which leverages the propagation of light inrandom media.




High-Dimensional Partial Least Squares: Spectral Analysis and Fundamental Limitations

Léger, Victor, Chatelain, Florent

arXiv.org Machine Learning

Partial Least Squares (PLS) is a widely used method for data integration, designed to extract latent components shared across paired high-dimensional datasets. Despite decades of practical success, a precise theoretical understanding of its behavior in high-dimensional regimes remains limited. In this paper, we study a data integration model in which two high-dimensional data matrices share a low-rank common latent structure while also containing individual-specific components. We analyze the singular vectors of the associated cross-covariance matrix using tools from random matrix theory and derive asymptotic characterizations of the alignment between estimated and true latent directions. These results provide a quantitative explanation of the reconstruction performance of the PLS variant based on Singular Value Decomposition (PLS-SVD) and identify regimes where the method exhibits counter-intuitive or limiting behavior. Building on this analysis, we compare PLS-SVD with principal component analysis applied separately to each dataset and show its asymptotic superiority in detecting the common latent subspace. Overall, our results offer a comprehensive theoretical understanding of high-dimensional PLS-SVD, clarifying both its advantages and fundamental limitations.



Catch-Only-One: Non-Transferable Examples for Model-Specific Authorization

Wang, Zihan, Ma, Zhiyong, Ma, Zhongkui, Liu, Shuofeng, Liu, Akide, Wang, Derui, Xue, Minhui, Bai, Guangdong

arXiv.org Artificial Intelligence

Recent AI regulations call for data that remain useful for innovation while resistant to misuse, balancing utility with protection at the model level. Existing approaches either perturb data to make it unlearnable or retrain models to suppress transfer, but neither governs inference by unknown models, and both typically require control over training. We propose non-transferable examples (NEs), a training-free and data-agnostic input-side usage-control mechanism. We recode inputs within a model-specific low-sensitivity subspace, preserving outputs for the authorized model while reducing performance on unauthorized models through subspace misalignment. We establish formal bounds that guarantee utility for the authorized model and quantify deviation for unauthorized ones, with the Hoffman-Wielandt inequality linking degradation to spectral differences. Empirically, NEs retain performance on diverse vision backbones and state-of-the-art vision-language models under common preprocessing, whereas non-target models collapse even with reconstruction attempts. These results establish NEs as a practical means to preserve intended data utility while preventing unauthorized exploitation. Our project is available at https://trusted-system-lab.github.io/model-specificity