Goto

Collaborating Authors

 Optimization


DREAM: Scalable Red Teaming for Text-to-Image Generative Systems via Distribution Modeling

arXiv.org Artificial Intelligence

Despite the integration of safety alignment and external filters, text-to-image (T2I) generative systems are still susceptible to producing harmful content, such as sexual or violent imagery. This raises serious concerns about unintended exposure and potential misuse. Red teaming, which aims to proactively identify diverse prompts that can elicit unsafe outputs from the T2I system, is increasingly recognized as an essential method for assessing and improving safety before real-world deployment. However, existing automated red teaming approaches often treat prompt discovery as an isolated, prompt-level optimization task, which limits their scalability, diversity, and overall effectiveness. To bridge this gap, in this paper, we propose DREAM, a scalable red teaming framework to automatically uncover diverse problematic prompts from a given T2I system. Unlike prior work that optimizes prompts individually, DREAM directly models the probabilistic distribution of the target system's problematic prompts, which enables explicit optimization over both effectiveness and diversity, and allows efficient large-scale sampling after training. To achieve this without direct access to representative training samples, we draw inspiration from energy-based models and reformulate the objective into a simple and tractable form. We further introduce GC-SPSA, an efficient optimization algorithm that provides stable gradient estimates through the long and potentially non-differentiable T2I pipeline. During inference, we also propose a diversity-aware sampling strategy to enhance prompt variety. The effectiveness of DREAM is validated through extensive experiments, demonstrating state-of-the-art performance across a wide range of T2I models and safety filters in terms of both prompt success rate and diversity. Our code is available at https://github.com/AntigoneRandy/DREAM


Generalized Probabilistic Approximate Optimization Algorithm

arXiv.org Artificial Intelligence

We introduce a generalized \textit{Probabilistic Approximate Optimization Algorithm (PAOA)}, a classical variational Monte Carlo framework that extends and formalizes prior work by Weitz \textit{et al.}~\cite{Combes_2023}, enabling parameterized and fast sampling on present-day Ising machines and probabilistic computers. PAOA operates by iteratively modifying the couplings of a network of binary stochastic units, guided by cost evaluations from independent samples. We establish a direct correspondence between derivative-free updates and the gradient of the full Markov flow over the exponentially large state space, showing that PAOA admits a principled variational formulation. Simulated annealing emerges as a limiting case under constrained parameterizations, and we implement this regime on an FPGA-based probabilistic computer with on-chip annealing to solve large 3D spin-glass problems. Benchmarking PAOA against QAOA on the canonical 26-spin Sherrington-Kirkpatrick model with matched parameters reveals superior performance for PAOA. We show that PAOA naturally extends simulated annealing by optimizing multiple temperature profiles, leading to improved performance over SA on heavy-tailed problems such as SK-Lévy.


Stepsize anything: A unified learning rate schedule for budgeted-iteration training

arXiv.org Artificial Intelligence

The expanding computational costs and limited resources underscore the critical need for budgeted-iteration training, which aims to achieve optimal learning within predetermined iteration budgets. While learning rate schedules fundamentally govern the performance of different networks and tasks, particularly in budgeted-iteration scenarios, their design remains largely heuristic, lacking theoretical foundations. In addition, the optimal learning rate schedule requires extensive trial-and-error selection, making the training process inefficient. In this work, we propose the Unified Budget-Aware (UBA) schedule, a theoretically grounded learning rate schedule that consistently outperforms commonly-used schedules among diverse architectures and tasks under different constrained training budgets. First, we bridge the gap by constructing a novel training budget-aware optimization framework, which explicitly accounts for the robustness to landscape curvature variations. From this framework, we derive the UBA schedule, controlled by a single hyper-parameter φthat provides a trade-off between flexibility and simplicity, eliminating the need for per-network numerical optimization. Moreover, we establish a theoretical connection between φand the condition number, adding interpretation and justification to our approach. Besides, we prove the convergence for different values of φ. We offer practical guidelines for its selection via theoretical analysis and empirical results. Extensive experimental results show that UBA consistently surpasses the commonly-used schedules across diverse vision and language tasks, spanning network architectures (e.g., ResNet, OLMo) and scales, under different training-iteration budgets.


Evaluating the robustness of adversarial defenses in malware detection systems

arXiv.org Artificial Intelligence

Machine learning is a key tool for Android malware detection, effectively identifying malicious patterns in apps. However, ML-based detectors are vulnerable to evasion attacks, where small, crafted changes bypass detection. Despite progress in adversarial defenses, the lack of comprehensive evaluation frameworks in binary-constrained domains limits understanding of their robustness. We introduce two key contributions. First, Prioritized Binary Rounding, a technique to convert continuous perturbations into binary feature spaces while preserving high attack success and low perturbation size. Second, the sigma-binary attack, a novel adversarial method for binary domains, designed to achieve attack goals with minimal feature changes. Experiments on the Malscan dataset show that sigma-binary outperforms existing attacks and exposes key vulnerabilities in state-of-the-art defenses. Defenses equipped with adversary detectors, such as KDE, DLA, DNN+, and ICNN, exhibit significant brittleness, with attack success rates exceeding 90% using fewer than 10 feature modifications and reaching 100% with just 20. Adversarially trained defenses, including AT-rFGSM-k, AT-MaxMA, improves robustness under small budgets but remains vulnerable to unrestricted perturbations, with attack success rates of 99.45% and 96.62%, respectively. Although PAD-SMA demonstrates strong robustness against state-of-the-art gradient-based adversarial attacks by maintaining an attack success rate below 16.55%, the sigma-binary attack significantly outperforms these methods, achieving a 94.56% success rate under unrestricted perturbations. These findings highlight the critical need for precise method like sigma-binary to expose hidden vulnerabilities in existing defenses and support the development of more resilient malware detection systems.


Safe Model Predictive Diffusion with Shielding

arXiv.org Artificial Intelligence

Abstract-- Generating safe, kinodynamically feasible, and optimal trajectories for complex robotic systems is a central challenge in robotics. This paper presents Safe Model Predictive Diffusion (Safe MPD), a training-free diffusion planner that unifies a model-based diffusion framework with a safety shield to generate trajectories that are both kinodynamically feasible and safe by construction. By enforcing feasibility and safety on all samples during the denoising process, our method avoids the common pitfalls of post-processing corrections, such as computational intractability and loss of feasibility. The results show that it substantially outperforms existing safety strategies in success rate and safety, while achieving sub-second computation times. I. INTRODUCTION Trajectory optimization is a cornerstone of robotics, enabling autonomous systems to generate goal-oriented motions consistent with their dynamics.


Probabilistic Weapon Engagement Zones for a Turn Constrained Pursuer

arXiv.org Artificial Intelligence

Curve-straight probabilistic engagement zones (CSPEZ) quantify the spatial regions an evader should avoid to reduce capture risk from a turn-rate-limited pursuer following a curve-straight path with uncertain parameters including position, heading, velocity, range, and maximum turn rate. This paper presents methods for generating evader trajectories that minimize capture risk under such uncertainty. We first derive an analytic solution for the deterministic curve-straight basic engagement zone (CSBEZ), then extend this formulation to a probabilistic framework using four uncertainty-propagation approaches: Monte Carlo sampling, linearization, quadratic approximation, and neural-network regression. We evaluate the accuracy and computational cost of each approximation method and demonstrate how CSPEZ constraints can be integrated into a trajectory-optimization algorithm to produce safe paths that explicitly account for pursuer uncertainty.


Physics Enhanced Deep Surrogates for the Phonon Boltzmann Transport Equation

arXiv.org Artificial Intelligence

Designing materials with controlled heat flow at the nano-scale is central to advances in microelectronics, thermoelectrics, and energy-conversion technologies. At these scales, phonon transport follows the Boltzmann Transport Equation (BTE), which captures non-diffusive (ballistic) effects but is too costly to solve repeatedly in inverse-design loops. Existing surrogate approaches trade speed for accuracy: fast macroscopic solvers can overestimate conductivities by hundreds of percent, while recent data-driven operator learners often require thousands of high-fidelity simulations. This creates a need for a fast, data-efficient surrogate that remains reliable across ballistic and diffusive regimes. We introduce a Physics-Enhanced Deep Surrogate (PEDS) that combines a differentiable Fourier solver with a neural generator and couples it with uncertainty-driven active learning. The Fourier solver acts as a physical inductive bias, while the network learns geometry-dependent corrections and a mixing coefficient that interpolates between macroscopic and nano-scale behavior. PEDS reduces training-data requirements by up to 70% compared with purely data-driven baselines, achieves roughly 5% fractional error with only 300 high-fidelity BTE simulations, and enables efficient design of porous geometries spanning 12-85 W m$^{-1}$ K$^{-1}$ with average design errors of 4%. The learned mixing parameter recovers the ballistic-diffusive transition and improves out of distribution robustness. These results show that embedding simple, differentiable low-fidelity physics can dramatically increase surrogate data-efficiency and interpretability, making repeated PDE-constrained optimization practical for nano-scale thermal-materials design.


A Multi-objective Optimization Approach for Feature Selection in Gentelligent Systems

arXiv.org Artificial Intelligence

Abstract--The integration of advanced technologies, such as Artificial Intelligence (AI), into manufacturing processes is attracting significant attention, paving the way for the development of intelligent systems that enhance efficiency and automation. This paper uses the term "Gentelligent system" to refer to systems that incorporate inherent component information (akin to genes in bioinformatics--where manufacturing operations are likened to chromosomes in this study) and automated mechanisms. By implementing reliable fault detection methods, manufacturers can achieve several benefits, including improved product quality, increased yield, and reduced production costs. T o support these objectives, we propose a hybrid framework with a dominance-based multi-objective evolutionary algorithm. This mechanism enables simultaneous optimization of feature selection and classification performance by exploring Pareto-optimal solutions in a single run. This solution helps monitor various manufacturing operations, addressing a range of conflicting objectives that need to be minimized together . Manufacturers can leverage such predictive methods and better adapt to emerging trends. T o strengthen the validation of our model, we incorporate two real-world datasets from different industrial domains. The results on both datasets demonstrate the generalizability and effectiveness of our approach. ORE recently, manufacturing has embraced the Industrial Internet of Things (IIoT), where digital sensors, network technologies, and gentelligent components are integrated into manufacturing processes. A gentelligent component, as defined in the Collaborative Research Centre 653 project [1], refers to components that intrinsically store information. The focus of that work is on encoding and preserving data within physical parts throughout the product lifecycle. Inspired by this concept, we extend the notion into what we define as a "gentelligent system."


Guided Query Refinement: Multimodal Hybrid Retrieval with Test-Time Optimization

arXiv.org Artificial Intelligence

Multimodal encoders have pushed the boundaries of visual document retrieval, matching textual query tokens directly to image patches and achieving state-of-the-art performance on public benchmarks. Recent models relying on this paradigm have massively scaled the sizes of their query and document representations, presenting obstacles to deployment and scalability in real-world pipelines. Furthermore, purely vision-centric approaches may be constrained by the inherent modality gap still exhibited by modern vision-language models. In this work, we connect these challenges to the paradigm of hybrid retrieval, investigating whether a lightweight dense text retriever can enhance a stronger vision-centric model. Existing hybrid methods, which rely on coarse-grained fusion of ranks or scores, fail to exploit the rich interactions within each model's representation space. To address this, we introduce Guided Query Refinement (GQR), a novel test-time optimization method that refines a primary retriever's query embedding using guidance from a complementary retriever's scores. Through extensive experiments on visual document retrieval benchmarks, we demonstrate that GQR allows vision-centric models to match the performance of models with significantly larger representations, while being up to 14x faster and requiring 54x less memory. Our findings show that GQR effectively pushes the Pareto frontier for performance and efficiency in multimodal retrieval. We release our code at https://github.com/IBM/test-time-hybrid-retrieval


Consequences of Kernel Regularity for Bandit Optimization

arXiv.org Machine Learning

In this work we investigate the relationship between kernel regularity and algorithmic performance in the bandit optimization of RKHS functions. While reproducing kernel Hilbert space (RKHS) methods traditionally rely on global kernel regressors, it is also common to use a smoothness-based approach that exploits local approximations. We show that these perspectives are deeply connected through the spectral properties of isotropic kernels. In particular, we characterize the Fourier spectra of the Matérn, square-exponential, rational-quadratic, $γ$-exponential, piecewise-polynomial, and Dirichlet kernels, and show that the decay rate determines asymptotic regret from both viewpoints. For kernelized bandit algorithms, spectral decay yields upper bounds on the maximum information gain, governing worst-case regret, while for smoothness-based methods, the same decay rates establish Hölder space embeddings and Besov space norm-equivalences, enabling local continuity analysis. These connections show that kernel-based and locally adaptive algorithms can be analyzed within a unified framework. This allows us to derive explicit regret bounds for each kernel family, obtaining novel results in several cases and providing improved analysis for others. Furthermore, we analyze LP-GP-UCB, an algorithm that combines both approaches, augmenting global Gaussian process surrogates with local polynomial estimators. While the hybrid approach does not uniformly dominate specialized methods, it achieves order-optimality across multiple kernel families.