Stop the Nonconsensual Use of Nude Images in Research

Neural Information Processing Systems

To train, test, and evaluate nudity detection models, machine learning researchers typically rely on nude images scraped from the Internet. Our research finds that this content is collected and, in some cases, subsequently distributed by researchers without consent, leading to potential misuse and exacerbating harm against the subjects depicted. This position paper argues that the distribution of nonconsensually collected nude images by researchers perpetuates image-based sexual abuse and that the machine learning community should stop the nonconsensual use of nude images in research. To characterize the scope and nature of this problem, we conducted a systematic review of papers published in computing venues that collect and use nude images. Our results paint a grim reality: norms around the usage of nude images are sparse, leading to a litany of problematic practices such as distributing and publishing nude images with uncensored faces, and intentionally collecting and sharing abusive content. We conclude with a call to action for publishing venues and a vision for research in nudity detection that balances user agency with concrete research objectives.


Learnable Sampler Distillation for Discrete Diffusion Models

Neural Information Processing Systems

Discrete diffusion models (DDMs) have shown powerful generation ability for discrete data modalities like text and molecules. However, their practical application is hindered by inefficient sampling, which requires a large number of sampling steps. Accelerating DDMs with larger step sizes typically degrades generation quality, because it amplifies both the compounding decoding error caused by factorized predictions and the discretization error from numerical approximations. To address these challenges, we propose learnable sampler distillation (LSD), a novel approach to train fast and high-fidelity samplers for DDMs. LSD employs a distillation approach in which a student sampler with a few steps learns to align its intermediate score trajectory with that of a high-quality teacher sampler with numerous steps. This alignment is achieved by optimizing learnable sampler coefficients that adaptively adjust sampling dynamics. We further propose LSD+, which also learns time schedules that allocate steps non-uniformly. Experiments across text generation, image generation, and synthetic tasks demonstrate that our proposed approaches outperform existing samplers for DDMs, achieving substantially higher sampling quality with significantly fewer sampling steps. Our code is available at https://github.com/feiyangfu/LSD.


Markov Persuasion Processes: Learning to Persuade From Scratch

Neural Information Processing Systems

In Bayesian persuasion, an informed sender strategically discloses information to a receiver so as to persuade them to undertake desirable actions. Recently, Markov persuasion processes (MPPs) have been introduced to capture sequential scenarios where a sender faces a stream of myopic receivers in a Markovian environment. The MPPs studied so far in the literature suffer from issues that prevent them from being fully operational in practice, e.g., they assume that the sender knows receivers' rewards. We fix such issues by addressing MPPs where the sender has no knowledge about the environment.


Temporal Logic-Based Multi-Vehicle Backdoor Attacks against Offline RL Agents in End-to-end Autonomous Driving

Neural Information Processing Systems

Assessing the safety of autonomous driving (AD) systems against security threats, particularly backdoor attacks, is a stepping stone for real-world deployment. However, existing works mainly focus on pixel-level triggers that are impractical to deploy in the real world. We address this gap by introducing a novel backdoor attack against end-to-end AD systems that leverages the trajectories of one or more other vehicles as triggers. To generate precise trigger trajectories, we first use temporal logic (TL) specifications to define the behaviors of attacker vehicles. Configurable behavior models are then used to generate these trajectories, which are quantitatively evaluated and iteratively refined based on the TL specifications. We further develop a negative training strategy by incorporating patch trajectories that are similar to triggers but are designated not to activate the backdoor. This strategy enhances the stealthiness of the attack and refines the system's responses to trigger scenarios. Through extensive experiments on 5 offline reinforcement learning (RL) driving agents with 6 combinations of trigger patterns and target actions, we demonstrate the flexibility and effectiveness of our proposed attack, showing that the vulnerability of existing end-to-end AD systems to such trajectory-based backdoor attacks is under-explored. Videos of our attack are available at: tlbackdoor.
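The quantitative evaluation of a trajectory against a temporal logic specification can be illustrated with a robustness score. The following is a minimal sketch assuming signal-temporal-logic-style semantics, which the abstract does not confirm; the function names, the distance predicate, and the example trajectories are all hypothetical. A positive score means the specification is satisfied, and its magnitude indicates the margin.

```python
from math import hypot


def distances(ego, other):
    """Per-step Euclidean distance between two equally long 2-D trajectories."""
    return [hypot(ex - ox, ey - oy) for (ex, ey), (ox, oy) in zip(ego, other)]


def always_within(dists, radius):
    """Robustness of 'always (distance <= radius)': the worst-case margin."""
    return min(radius - d for d in dists)


def eventually_within(dists, radius):
    """Robustness of 'eventually (distance <= radius)': the best-case margin."""
    return max(radius - d for d in dists)


# Hypothetical trajectories: one (x, y) position per time step.
ego = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
attacker = [(0.0, 4.0), (1.0, 3.0), (2.0, 1.0), (3.0, 2.0)]

gaps = distances(ego, attacker)
always_score = always_within(gaps, 2.5)          # negative: violated at some step
eventually_score = eventually_within(gaps, 2.5)  # positive: satisfied at some step
```

A refinement loop could adjust the behavior model's parameters until the score for the desired specification becomes positive.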


Conformal Risk Training: End-to-End Optimization of Conformal Risk Control

Neural Information Processing Systems

While deep learning models often achieve high predictive accuracy, their predictions typically do not come with any provable guarantees on risk or reliability, which are critical for deployment in high-stakes applications. The framework of conformal risk control (CRC) provides a distribution-free, finite-sample method for controlling the expected value of any bounded monotone loss function and can be conveniently applied post-hoc to any pre-trained deep learning model. However, many real-world applications are sensitive to tail risks, as opposed to just expected loss. In this work, we develop a method for controlling the general class of Optimized Certainty-Equivalent (OCE) risks, a broad class of risk measures which includes as special cases the expected loss (generalizing the original CRC method) and common tail risks like the conditional value-at-risk (CVaR).
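To make the relationship between expected loss and CVaR concrete, the sketch below computes the empirical CVaR through the Rockafellar-Uryasev variational form, min over t of t + E[(L - t)_+] / (1 - alpha), which is the OCE form for CVaR. It is an illustration only, not the paper's method; the function name `oce_cvar` and the sample losses are assumptions. Setting alpha = 0 recovers the expected loss.

```python
def oce_cvar(losses, alpha):
    """Empirical CVaR at level alpha via min_t { t + E[(L - t)_+] / (1 - alpha) }."""
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")
    n = len(losses)

    def objective(t):
        excess = sum(max(loss - t, 0.0) for loss in losses)
        return t + excess / (n * (1.0 - alpha))

    # The objective is convex and piecewise linear in t with breakpoints at the
    # sample values, so its minimum is attained at one of the samples.
    return min(objective(t) for t in losses)


# Hypothetical calibration losses.
losses = [float(v) for v in range(10)]

mean_loss = oce_cvar(losses, 0.0)  # alpha = 0 reduces to the expected loss
tail_loss = oce_cvar(losses, 0.8)  # average of the worst 20% of losses
```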


Collapse and simplex ETF

Neural Information Processing Systems

Neural collapse [26] is an intuitive observation about the terminal phase of training a well-trained model on a balanced dataset: last-layer features converge to their within-class means, and all within-class means and their corresponding classifier vectors converge to a simplex equiangular tight frame (ETF), as shown in Figure 6. The main results can be summarized as follows: (NC1) The variability of the last-layer features, $\Sigma := \mathrm{Avg}_{i,c}\{(h_{i,c} - h_c)(h_{i,c} - h_c)^\top\}$, collapses within each class: $\Sigma \to 0$, where $h_{i,c}$ is the last-layer feature of the $i$-th sample in the $c$-th class, and $h_c$ is the within-class mean of the $c$-th class's features. To analyze this phenomenon, some studies simplify deep neural networks to their last-layer features and classifier (the layer-peeled model) [9, 12, 40, 53] with proper constraints or regularizations. From the viewpoint of the layer-peeled model (LPM), training $W$ with constraints on the weights can be seen as training the $C$-class classification head $W_L = \{W_1, \ldots, W_C\}$ and the features $H = \{h_1, \ldots, h_N\}$ of all $N$ samples output by the last layer of the backbone, under constraints $E_W$ and $E_H$, respectively (Equation 6). On a balanced dataset, as described in Lemma 1, any solution to this model exhibits neural collapse and forms a simplex ETF, which means that the ETF is the optimal classifier in the balanced case of the LPM.
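The geometry of a simplex ETF can be checked numerically. The sketch below uses the standard construction $M = \sqrt{C/(C-1)}\,(I - \tfrac{1}{C}\mathbf{1}\mathbf{1}^\top)$, whose rows are unit vectors with pairwise inner product $-1/(C-1)$; the function names are illustrative and not taken from the source.

```python
from math import sqrt


def simplex_etf(num_classes):
    """Rows of sqrt(C / (C - 1)) * (I - (1 / C) * ones), one row per class."""
    c = num_classes
    scale = sqrt(c / (c - 1))
    return [
        [scale * ((1.0 if i == j else 0.0) - 1.0 / c) for j in range(c)]
        for i in range(c)
    ]


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


etf = simplex_etf(4)

# Every class vector has unit norm.
norms = [dot(v, v) for v in etf]

# Every pair of distinct class vectors has the same inner product, -1 / (C - 1).
cross = [dot(etf[i], etf[j]) for i in range(4) for j in range(4) if i != j]
```

Equal norms and equal, maximally separated angles are what "equiangular tight frame" refers to in the balanced case.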


Cross City Traffic Flow Generation via Retrieval Augmented Diffusion Model

Neural Information Processing Systems

Traffic flow data are of great value in smart city applications. However, because of data collection costs and privacy sensitivity, large-scale traffic flow data are difficult to obtain. Various data generation methods have therefore been proposed in the literature. Nevertheless, these methods often require data from a specific city for training and are difficult to apply directly to new cities that lack data. To address this problem, this paper proposes a retrieval-augmented diffusion generation model with geographic representation alignment. We use data from multiple source cities for training, extract consistent representations across multiple cities, and leverage retrieval-augmented generation (RAG) technology to incorporate dynamic traffic flow patterns into the condition, aiming to improve the accuracy of data generation in the target city. Experiments on four real-world datasets demonstrate that, compared to existing generation methods, our method achieves the best cross-city zero-shot performance.


Bit-swapping Oriented Twin-memory Multi-view Clustering in Lifelong Incomplete Scenarios

Neural Information Processing Systems

Despite notable improvements, current multi-view clustering (MVC) techniques generally rely on feature library mechanisms to propagate accumulated knowledge from historical views to newly arrived data, which overlooks the information pertaining to the basis embedding within each view. Moreover, the mapping paradigm inevitably alters the values of learned landmarks and built affinities due to its uninterrupted nature, thereby disarraying the hierarchical cluster structures. To mitigate these two issues, we propose an algorithm named BSTM. Concretely, we first synchronize the distinct dimensions by introducing a group of specialized projectors, and then establish unified anchors for all views collected so far to capture intrinsic patterns. Afterwards, departing from per-view architectures, we devise a shared bipartite graph construction via indicators to quantify similarity, which not only avoids redundant data recalculations but also alleviates the representation distortion caused by fusion.


Fast Projection-Free Approach (without Optimization Oracle) for Optimization over Compact Convex Set

Neural Information Processing Systems

Projection-free first-order methods, e.g., the celebrated Frank-Wolfe (FW) algorithms, have emerged as powerful tools for optimization over simple convex sets such as polyhedra, because of their scalability, fast convergence, and iteration-wise feasibility without costly projections. However, extending these methods effectively to general compact convex sets remains challenging and largely open, as FW methods rely on expensive linear optimization oracles (LOO), while penalty-based methods often struggle with poor feasibility. We tackle this open challenge by presenting Hom-PGD, a novel projection-free method without expensive (optimization) oracles. Our method constructs a homeomorphism between the convex constraint set and a unit ball, transforming the original problem into an equivalent ball-constrained formulation, thus enabling efficient gradient-based optimization while preserving the original problem structure. We prove that Hom-PGD attains optimal convergence rates matching gradient descent with constant step-size to find an $\epsilon$-approximate (stationary) solution: $O(\log(1/\epsilon))$ for strongly convex objectives, $O(\epsilon^{-1})$ for convex objectives, and $O(\epsilon^{-2})$ for non-convex objectives. Meanwhile, Hom-PGD enjoys a low per-iteration complexity of $O(n^2)$, without expensive oracles like LOO or projection, where $n$ is the input size. Our framework further extends to certain non-convex sets, broadening its applicability in practical optimization scenarios with complex constraints. Extensive numerical experiments demonstrate that Hom-PGD achieves comparable convergence rates to state-of-the-art projection-free methods, while significantly reducing per-iteration runtime (up to 5 orders of magnitude faster) and thus the total problem-solving time.
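One standard way to build a homeomorphism between a compact convex set containing the origin in its interior and the unit ball is through the Minkowski gauge of the set. The sketch below illustrates that idea for an axis-aligned box; it is an assumption that this resembles the paper's construction, and the function names and the box example are hypothetical.

```python
from math import sqrt


def gauge(x, half_widths):
    """Minkowski gauge of the box {x : |x_i| <= w_i}: smallest t with x / t in the box."""
    return max(abs(xi) / wi for xi, wi in zip(x, half_widths))


def norm(x):
    """Euclidean norm."""
    return sqrt(sum(xi * xi for xi in x))


def ball_to_set(y, half_widths):
    """Map a point of the unit ball into the box by rescaling along its ray."""
    g = gauge(y, half_widths)
    if g == 0.0:  # the origin maps to itself
        return list(y)
    scale = norm(y) / g
    return [scale * yi for yi in y]


def set_to_ball(x, half_widths):
    """Inverse map: from the box back to the unit ball."""
    n = norm(x)
    if n == 0.0:
        return list(x)
    scale = gauge(x, half_widths) / n
    return [scale * xi for xi in x]


box = [1.0, 2.0]            # the set [-1, 1] x [-2, 2]
y = [0.6, 0.8]              # a point on the unit sphere
x = ball_to_set(y, box)     # lands on the boundary of the box
y_back = set_to_ball(x, box)
```

Because both maps only rescale along rays from the origin, each evaluation needs one gauge computation rather than a projection or a linear optimization call.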


Dual-Flow: Transferable Multi-Target, Instance-Agnostic Attacks via In-the-wild Cascading Flow Optimization

Neural Information Processing Systems

Adversarial attacks are widely used to evaluate model robustness, and in black-box scenarios, the transferability of these attacks becomes crucial. Existing generator-based attacks have excellent generalization and transferability due to their instance-agnostic nature. However, when training generators for multi-target tasks, the success rate of transfer attacks is relatively low due to the limitations of the model's capacity. To address these challenges, we propose a novel Dual-Flow framework for multi-target instance-agnostic adversarial attacks, utilizing Cascading Distribution Shift Training to develop an adversarial velocity function. Extensive experiments demonstrate that Dual-Flow significantly improves transferability over previous multi-target generative attacks. For example, it increases the success rate from Inception-v3 to ResNet-152 by 34.58%. Furthermore, our attack method shows substantially stronger robustness against defense mechanisms, such as adversarially trained models. The code of Dual-Flow is available at: https://github.com/Chyxx/Dual-Flow.