Rethinking the Backward Propagation for Adversarial Transferability

Neural Information Processing Systems

Transfer-based attacks generate adversarial examples on the surrogate model, which can mislead other black-box models without access, making it promising to attack real-world applications. Recently, several works have been proposed to boost adversarial transferability, in which the surrogate model is usually overlooked. In this work, we identify that non-linear layers (e.g.


Tight Sample Complexity Bounds for Best-Arm Identification Under Bounded Systematic Bias

arXiv.org Machine Learning

As search depth increases in autonomous reasoning and embodied planning, the candidate action space expands exponentially, heavily taxing computational budgets. While heuristic pruning is a common countermeasure, it operates without formal safety guarantees when surrogate models (like LLMs) exhibit systematic evaluation biases. This paper frames the node expansion process as a localized Best-Arm Identification (BAI) problem over dynamic frontiers, subject to a bounded systematic bias $L$. By inverting the Lambert W function, we establish an additive sample complexity of $\mathcal{O}((\Delta-4L)^{-2})$, which indicates that safe node elimination is only feasible when the empirical reward gap exceeds $4L$. We complement this with an information-theoretic lower bound of $\Omega((\Delta-2L)^{-2})$ to confirm the structural limits of biased search. Subsequent evaluations on both synthetic trees and complex reasoning tasks demonstrate that adhering to this local safety boundary successfully preserves optimal trajectories while maximizing sample allocation efficiency.
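
To make the two bounds in this abstract concrete, the following is a minimal sketch of how the upper-bound and lower-bound scalings behave as the gap approaches the bias threshold. It is not the paper's algorithm: the function names are hypothetical, and constants and logarithmic factors, which the abstract does not state, are deliberately omitted.

```python
import math


def elimination_is_feasible(gap: float, bias: float) -> bool:
    """Feasibility condition from the abstract: the reward gap must exceed 4L."""
    return gap > 4.0 * bias


def upper_bound_scaling(gap: float, bias: float) -> float:
    """Return (gap - 4L)^-2, the scaling of the stated upper bound.

    Assumption: constants and log factors are dropped, so this is a
    relative scaling only, not an actual sample count.
    """
    if not elimination_is_feasible(gap, bias):
        return math.inf  # no finite budget makes elimination safe
    return (gap - 4.0 * bias) ** -2


def lower_bound_scaling(gap: float, bias: float) -> float:
    """Return (gap - 2L)^-2, the scaling of the stated lower bound."""
    if gap <= 2.0 * bias:
        return math.inf
    return (gap - 2.0 * bias) ** -2


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the paper.
    for gap, bias in [(1.0, 0.0), (1.0, 0.125), (0.5, 0.125)]:
        print(gap, bias,
              elimination_is_feasible(gap, bias),
              upper_bound_scaling(gap, bias),
              lower_bound_scaling(gap, bias))
```

With zero bias, both scalings reduce to the familiar inverse-squared-gap dependence. As the bias grows, the upper-bound scaling diverges once the gap reaches $4L$, while the lower-bound scaling stays finite until the gap reaches $2L$.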


SelfCodeAlign: Self-Alignment for Code Generation

Neural Information Processing Systems

Instruction tuning is a supervised fine-tuning approach that significantly improves the ability of large language models (LLMs) to follow human instructions. For programming tasks, most models are finetuned with costly human-annotated instruction-response pairs or those generated by large, proprietary LLMs, which may not be permitted. We propose SelfCodeAlign, the first fully transparent and permissive pipeline for self-aligning code LLMs without extensive human annotations or distillation. SelfCodeAlign employs the same base model for inference throughout the data generation process. It first extracts diverse coding concepts from high-quality seed snippets to generate new tasks.







Posterior Refinement Improves Sample Efficiency in Bayesian Neural Networks

Neural Information Processing Systems

Its derivation, based on Lu et al. [54], is as follows. For the HMC baseline, we use the default implementation of NUTS in Pyro. In Table 7, we present the detailed, non-averaged results to complement Table 4. In both cases, we observe that the performance of the refined posterior approaches HMC's. C.2 Text classification We further validate the proposed method on text classification problems.