Goto

Collaborating Authors

 Optimization


Posterior Inference in Latent Space for Scalable Constrained Black-box Optimization

arXiv.org Machine Learning

Optimizing high-dimensional black-box functions under black-box constraints is a pervasive task in a wide range of scientific and engineering problems. These problems are typically harder than unconstrained problems due to hard-to-find feasible regions. While Bayesian optimization (BO) methods have been developed to solve such problems, they often struggle with the curse of dimensionality. Recently, generative model-based approaches have emerged as a promising alternative for constrained optimization. However, they suffer from poor scalability and are vulnerable to mode collapse, particularly when the target distribution is highly multi-modal. In this paper, we propose a new framework to overcome these challenges. Our method iterates through two stages. First, we train flow-based models to capture the data distribution and surrogate models that predict both function values and constraint violations with uncertainty quantification. Second, we cast the candidate selection problem as a posterior inference problem to effectively search for promising candidates that have high objective values while not violating the constraints. During posterior inference, we find that the posterior distribution is highly multi-modal and has a large plateau due to constraints, especially when constraint feedback is given as binary indicators of feasibility. To mitigate this issue, we amortize the sampling from the posterior distribution in the latent space of flow-based models, which is much smoother than that in the data space. We empirically demonstrate that our method achieves superior performance on various synthetic and real-world constrained black-box optimization tasks. Our code is publicly available \href{https://github.com/umkiyoung/CiBO}{here}.


What Makes Local Updates Effective: The Role of Data Heterogeneity and Smoothness

arXiv.org Machine Learning

This thesis contributes to the theoretical understanding of local update algorithms, especially Local SGD, in distributed and federated optimization under realistic models of data heterogeneity. A central focus is on the bounded second-order heterogeneity assumption, which is shown to be both necessary and sufficient for local updates to outperform centralized or mini-batch methods in convex and non-convex settings. The thesis establishes tight upper and lower bounds in several regimes for various local update algorithms and characterizes the min-max complexity of multiple problem classes. At its core is a fine-grained consensus-error-based analysis framework that yields sharper finite-time convergence bounds under third-order smoothness and relaxed heterogeneity assumptions. The thesis also extends to online federated learning, providing fundamental regret bounds under both first-order and bandit feedback. Together, these results clarify when and why local updates offer provable advantages, and the thesis serves as a self-contained guide for analyzing Local SGD in heterogeneous environments.


Quantum Approximate Optimization Algorithm for Spatiotemporal Forecasting of HIV Clusters

arXiv.org Artificial Intelligence

HIV epidemiological data is increasingly complex, requiring advanced computation for accurate cluster detection and forecasting. We employed quantum-accelerated machine learning to analyze HIV prevalence at the ZIP-code level using AIDSVu and synthetic SDoH data for 2022. Our approach compared classical clustering (DBSCAN, HDBSCAN) with a quantum approximate optimization algorithm (QAOA), developed a hybrid quantum-classical neural network for HIV prevalence forecasting, and used quantum Bayesian networks to explore causal links between SDoH factors and HIV incidence. The QAOA-based method achieved 92% accuracy in cluster detection within 1.6 seconds, outperforming classical algorithms. Meanwhile, the hybrid quantum-classical neural network predicted HIV prevalence with 94% accuracy, surpassing a purely classical counterpart. Quantum Bayesian analysis identified housing instability as a key driver of HIV cluster emergence and expansion, with stigma exerting a geographically variable influence. These quantum-enhanced methods deliver greater precision and efficiency in HIV surveillance while illuminating critical causal pathways. This work can guide targeted interventions, optimize resource allocation for PrEP, and address structural inequities fueling HIV transmission.


Diffusion Classifier Guidance for Non-robust Classifiers

arXiv.org Artificial Intelligence

Classifier guidance is intended to steer a diffusion process such that a given classifier reliably recognizes the generated data point as a certain class. However, most classifier guidance approaches are restricted to robust classifiers, which were specifically trained on the noise of the diffusion forward process. We extend classifier guidance to work with general, non-robust, classifiers that were trained without noise. We analyze the sensitivity of both non-robust and robust classifiers to noise of the diffusion process on the standard CelebA data set, the specialized SportBalls data set and the high-dimensional real-world CelebA-HQ data set. Our findings reveal that non-robust classifiers exhibit significant accuracy degradation under noisy conditions, leading to unstable guidance gradients. To mitigate these issues, we propose a method that utilizes one-step denoised image predictions and implements stabilization techniques inspired by stochastic optimization methods, such as exponential moving averages. Experimental results demonstrate that our approach improves the stability of classifier guidance while maintaining sample diversity and visual quality. This work contributes to advancing conditional sampling techniques in generative models, enabling a broader range of classifiers to be used as guidance classifiers.


Parallel Transmission Aware Co-Design: Enhancing Manipulator Performance Through Actuation-Space Optimization

arXiv.org Artificial Intelligence

In robotics, structural design and behavior optimization have long been considered separate processes, resulting in the development of systems with limited capabilities. Recently, co-design methods have gained popularity, where bi-level formulations are used to simultaneously optimize the robot design and behavior for specific tasks. However, most implementations assume a serial or tree-type model of the robot, overlooking the fact that many robot platforms incorporate parallel mechanisms. In this paper, we present a novel co-design approach that explicitly incorporates parallel coupling constraints into the dynamic model of the robot. In this framework, an outer optimization loop focuses on the design parameters, in our case the transmission ratios of a parallel belt-driven manipulator, which map the desired torques from the joint space to the actuation space. An inner loop performs trajectory optimization in the actuation space, thus exploiting the entire dynamic range of the manipulator. We compare the proposed method with a conventional co-design approach based on a simplified tree-type model. By taking advantage of the actuation space representation, our approach leads to a significant increase in dynamic payload capacity compared to the conventional co-design implementation.


A collaborative digital twin built on FAIR data and compute infrastructure

arXiv.org Artificial Intelligence

The integration of machine learning with automated experimentation in self-driving laboratories (SDL) offers a powerful approach to accelerate discovery and optimization tasks in science and engineering applications. When supported by findable, accessible, interoperable, and reusable (FAIR) data infrastructure, SDLs with overlapping interests can collaborate more effectively. This work presents a distributed SDL implementation built on nanoHUB services for online simulation and FAIR data management. In this framework, geographically dispersed collaborators conducting independent optimization tasks contribute raw experimental data to a shared central database. These researchers can then benefit from analysis tools and machine learning models that automatically update as additional data become available. New data points are submitted through a simple web interface and automatically processed using a nanoHUB Sim2L, which extracts derived quantities and indexes all inputs and outputs in a FAIR data repository called ResultsDB. A separate nanoHUB workflow enables sequential optimization using active learning, where researchers define the optimization objective, and machine learning models are trained on-the-fly with all existing data, guiding the selection of future experiments. Inspired by the concept of ``frugal twin", the optimization task seeks to find the optimal recipe to combine food dyes to achieve the desired target color. With easily accessible and inexpensive materials, researchers and students can set up their own experiments, share data with collaborators, and explore the combination of FAIR data, predictive ML models, and sequential optimization. The tools introduced are generally applicable and can easily be extended to other optimization problems.


Detection of coordinated fleet vehicles in route choice urban games. Part I. Inverse fleet assignment theory

arXiv.org Artificial Intelligence

Detection of collectively routing fleets of vehicles in future urban systems may become important for the management of traffic, as such routing may destabilize urban networks leading to deterioration of driving conditions. Accordingly, in this paper we discuss the question whether it is possible to determine the flow of fleet vehicles on all routes given the fleet size and behaviour as well as the combined total flow of fleet and non-fleet vehicles on every route. We prove that the answer to this Inverse Fleet Assignment Problem is 'yes' for myopic fleet strategies which are more 'selfish' than 'altruistic', and 'no' otherwise, under mild assumptions on route/link performance functions. To reach these conclusions we introduce the forward fleet assignment operator and study its properties, proving that it is invertible for 'bad' objectives of fleet controllers. We also discuss the challenges of implementing myopic fleet routing in the real world and compare it to Stackelberg and Nash routing. Finally, we show that optimal Stackelberg fleet routing could involve highly variable mixed strategies in some scenarios, which would likely cause chaos in the traffic network.


Resilient-Native and Intelligent Next-Generation Wireless Systems: Key Enablers, Foundations, and Applications

arXiv.org Artificial Intelligence

Just like power, water, and transportation systems, wireless networks are a crucial societal infrastructure. As natural and human-induced disruptions continue to grow, wireless networks must be resilient. This requires them to withstand and recover from unexpected adverse conditions, shocks, unmodeled disturbances and cascading failures. Unlike robustness and reliability, resilience is based on the understanding that disruptions will inevitably happen. Resilience, as elasticity, focuses on the ability to bounce back to favorable states, while resilience as plasticity involves agents and networks that can flexibly expand their states and hypotheses through real-time adaptation and reconfiguration. This situational awareness and active preparedness, adapting world models and counterfactually reasoning about potential system failures and the best responses, is a core aspect of resilience. This article will first disambiguate resilience from reliability and robustness, before delving into key mathematical foundations of resilience grounded in abstraction, compositionality and emergence. Subsequently, we focus our attention on a plethora of techniques and methodologies pertaining to the unique characteristics of resilience, as well as their applications through a comprehensive set of use cases. Ultimately, the goal of this paper is to establish a unified foundation for understanding, modeling, and engineering resilience in wireless communication systems, while laying a roadmap for the next-generation of resilient-native and intelligent wireless systems.


Differentiable Radar Ambiguity Functions: Mathematical Formulation and Computational Implementation

arXiv.org Artificial Intelligence

The ambiguity function is fundamental to radar waveform design, characterizing range and Doppler resolution capabilities. However, its traditional formulation involves non-differentiable operations, preventing integration with gradient-based optimization methods and modern machine learning frameworks. This paper presents the first complete mathematical framework and computational implementation for differentiable radar ambiguity functions. Our approach addresses the fundamental technical challenges that have prevented the radar community from leveraging automatic differentiation: proper handling of complex-valued gradients using Wirtinger calculus, efficient computation through parallelized FFT operations, numerical stability throughout cascaded operations, and composability with arbitrary differentiable operations. We term this approach GRAF (Gradient-based Radar Ambiguity Functions), which reformulates the ambiguity function computation to maintain mathematical equivalence while enabling gradient flow through the entire pipeline. The resulting implementation provides a general-purpose differentiable ambiguity function compatible with modern automatic differentiation frameworks, enabling new research directions including neural network-based waveform generation with ambiguity constraints, end-to-end optimization of radar systems, and integration of classical radar theory with modern deep learning. We provide complete implementation details and demonstrate computational efficiency suitable for practical applications. This work establishes the mathematical and computational foundation for applying modern machine learning techniques to radar waveform design, bridging classical radar signal processing with automatic differentiation frameworks.


Scalable Dynamic Origin-Destination Demand Estimation Enhanced by High-Resolution Satellite Imagery Data

arXiv.org Artificial Intelligence

This study presents a novel integrated framework for dynamic origin-destination demand estimation (DODE) in multi-class mesoscopic network models, leveraging high-resolution satellite imagery together with conventional tra ffic data from local sensors. To extract information from imagery data, we design a computer vision pipeline for class-specific vehicle detection and map matching, generating link-level tra ffic density observations by vehicle class. Building upon this information, we formulate a computational graph-based DODE model that calibrates dynamic network states by jointly matching observed tra ffic counts and travel times from local sensors with density measurements derived from satellite imagery. To assess the accuracy and scalability of the proposed framework, we conduct a series of numerical experiments using both synthetic and real-world data. The results of out-of-sample tests demonstrate that supplementing traditional data with satellite-derived density significantly improves estimation performance, especially for links without local sensors. Real-world experiments also confirm the framework's capability to handle large-scale networks, supporting its potential for practical deployment in cities of varying sizes. Sensitivity analysis further evaluates the impact of data quality related to satellite imagery data. Introduction The widespread availability of spatio-temporal data has created new opportunities for advancing computational tools to model network flows, individual traveler behavior, and travel demand in dynamic transportation networks. Recent developments in sensing technologies and artificial intelligence are revolutionizing traditional models, making them more data-driven, scalable, and e ff ective for complex, large-scale networks. Dynamic Origin-destination Demand Estimation (DODE) is a foundational prerequisite for dynamic network models to accurately reproduce the status quo spatio-temporal network conditions, supporting tra ffic assignment (Pi et al. 2019) and control strategies (Y e et al. 2019, Liu, Ma & Qian 2023, Ke et al. 2025). DODE studies can be broadly categorized into model-based methods, which embed physics-informed tra ffic assignment models, and model-free methods, which formulate the problem using data-driven techniques without tra ffic assignment constraints.