 Optimization


Enhancing Electricity-System Resilience with Adaptive Robust Optimization and Conformal Uncertainty Characterization

arXiv.org Artificial Intelligence

Extreme weather is straining electricity systems, exposing the limitations of reactive responses and creating a need for proactive resilience planning. Most existing approaches to enhancing electricity-system resilience employ simplified uncertainty models and decouple proactive and reactive decisions. This paper proposes a novel tri-level optimization model that integrates proactive actions, adversarial disruptions, and reactive responses. Conformal prediction is used to construct distribution-free system-disruption uncertainty sets with coverage guarantees. The tri-level problem is solved by using duality theory to derive a bi-level reformulation and then applying Benders decomposition. Numerical experiments demonstrate that our approach outperforms conventional robust and two-stage methods.
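As a rough illustration of how conformal prediction yields a distribution-free set with a coverage guarantee, the sketch below implements generic split conformal prediction for a scalar quantity. It is not the paper's uncertainty-set construction: the function names, the absolute-residual score, and the calibration values are all assumptions made for this example.

```python
import math

def conformal_quantile(scores, alpha):
    """Return the ceil((n + 1) * (1 - alpha))-th smallest calibration score.

    Under exchangeability, a fresh score falls at or below this value
    with probability at least 1 - alpha.
    """
    n = len(scores)
    # The small epsilon guards against floating-point round-off before ceil.
    k = math.ceil((n + 1) * (1 - alpha) - 1e-9)
    if k > n:
        return math.inf  # too few calibration points for this alpha
    return sorted(scores)[k - 1]

def uncertainty_interval(prediction, scores, alpha):
    """Symmetric interval around a point prediction."""
    q = conformal_quantile(scores, alpha)
    return prediction - q, prediction + q

# Hypothetical calibration residuals |y - y_hat| from held-out data.
calibration_scores = [0.1, 0.4, 0.2, 0.9, 0.3, 0.5, 0.7, 0.6, 0.8]
lo, hi = uncertainty_interval(10.0, calibration_scores, alpha=0.2)
```

With nine calibration scores and alpha = 0.2, the interval half-width is the eighth smallest score, so no distributional assumption beyond exchangeability is needed.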


Statistical Uncertainty Learning for Robust Visual-Inertial State Estimation

arXiv.org Artificial Intelligence

A fundamental challenge in robust visual-inertial odometry (VIO) is to dynamically assess the reliability of sensor measurements. This assessment is crucial for properly weighting the contribution of each measurement to the state estimate. Conventional methods often simplify this by assuming a static, uniform uncertainty for all measurements. This heuristic, however, may be limited in its ability to capture the dynamic error characteristics inherent in real-world data. To address this limitation, we present a statistical framework that learns measurement reliability assessment online, directly from sensor data and optimization results. Our approach leverages multi-view geometric consistency as a form of self-supervision. This enables the system to infer landmark uncertainty and adaptively weight visual measurements during optimization. We evaluated our method on the public EuRoC dataset, demonstrating improvements in tracking accuracy with average reductions of approximately 24% in translation error and 42% in rotation error compared to baseline methods with fixed uncertainty parameters. The resulting framework operates in real time while showing enhanced accuracy and robustness. To facilitate reproducibility and encourage further research, the source code will be made publicly available.
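To make the idea of weighting measurements by their estimated uncertainty concrete, here is a minimal sketch of inverse-variance fusion for scalar measurements. It is a textbook weighted least-squares estimate, not the paper's estimator; the function name and the example values are hypothetical.

```python
def fuse(measurements, variances):
    """Inverse-variance weighted mean of scalar measurements.

    Each measurement is weighted by 1 / variance, so less reliable
    measurements contribute less to the estimate.
    """
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    return sum(w * z for w, z in zip(weights, measurements)) / total

# Uniform uncertainty: both measurements count equally.
uniform = fuse([1.0, 3.0], [1.0, 1.0])

# Adaptive uncertainty: the second measurement is judged less reliable,
# so the estimate moves toward the first one.
adaptive = fuse([1.0, 3.0], [1.0, 3.0])
```

Assuming a static, uniform variance corresponds to the first call; learning per-measurement variances changes the weights, as in the second.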


Template-Guided 3D Molecular Pose Generation via Flow Matching and Differentiable Optimization

arXiv.org Artificial Intelligence

Predicting the 3D conformation of small molecules within protein binding sites is a key challenge in drug design. When a crystallized reference ligand (template) is available, it provides geometric priors that can guide 3D pose prediction. We present a two-stage method for ligand conformation generation guided by such templates. In the first stage, we introduce a molecular alignment approach based on flow-matching to generate 3D coordinates for the ligand, using the template structure as a reference. In the second stage, a differentiable pose optimization procedure refines this conformation based on shape and pharmacophore similarities, internal energy, and, optionally, the protein binding pocket. We introduce a new benchmark of ligand pairs co-crystallized with the same target to evaluate our approach and show that it outperforms standard docking tools and open-access alignment methods, especially in cases involving low similarity to the template or high ligand flexibility.


Learning Low-Dimensional Embeddings for Black-Box Optimization

arXiv.org Artificial Intelligence

Black-box optimization (BBO) aims to find the solution of an optimization problem where the objective function is unknown or lacks an explicit mathematical formulation. Instead, the function can only be evaluated through queries, such as physical experiments or simulations through complex computational models. However, in many real-world scenarios, these evaluations are expensive, noisy, or time-consuming. Therefore, a key goal of black-box optimization is to find near-optimal solutions while limiting the number of costly function evaluations. Due to these challenges, BBO relies on sample-efficient strategies that select query points to balance exploration (searching unexplored regions) and exploitation (refining promising solutions).


Grammar-based Ordinary Differential Equation Discovery

arXiv.org Artificial Intelligence

The understanding and modeling of complex physical phenomena through dynamical systems has historically driven scientific progress, as it provides the tools for predicting the behavior of different systems under diverse conditions through time. The discovery of dynamical systems has been indispensable in engineering, as it allows for the analysis and prediction of complex behaviors for computational modeling, diagnostics, prognostics, and control of engineered systems. Joining recent efforts that harness the power of symbolic regression in this domain, we propose a novel framework for the end-to-end discovery of ordinary differential equations (ODEs), termed Grammar-based ODE Discovery Engine (GODE). The proposed methodology combines formal grammars with dimensionality reduction and stochastic search to navigate high-dimensional combinatorial spaces efficiently. Grammars allow us to seed domain knowledge and structure for both constraining and exploring the space of candidate expressions. GODE proves to be more sample- and parameter-efficient than state-of-the-art transformer-based models, and it discovers more accurate and parsimonious ODE expressions than both genetic-programming and other grammar-based methods on more complex inference tasks, such as the discovery of structural dynamics. Thus, we introduce a tool that could play a catalytic role in dynamics discovery tasks, including modeling, system identification, and monitoring.


DAG DECORation: Continuous Optimization for Structure Learning under Hidden Confounding

arXiv.org Artificial Intelligence

We study structure learning for linear Gaussian SEMs in the presence of latent confounding. Existing continuous methods excel when errors are independent, while deconfounding-first pipelines rely on pervasive factor structure or nonlinearity. We propose DECOR, a single likelihood-based and fully differentiable estimator that jointly learns a DAG and a correlated noise model. Our theory gives simple sufficient conditions for global parameter identifiability: if the mixed graph is bow-free and the noise covariance has a uniform eigenvalue margin, then the map from $(B, \Omega)$ to the observational covariance is injective, so both the directed structure and the noise are uniquely determined. The estimator alternates a smooth-acyclic graph update with a convex noise update and can include a light bow complementarity penalty or a post hoc reconciliation step. On synthetic benchmarks that vary confounding density, graph density, latent rank, and dimension with $n


Latency-aware Multimodal Federated Learning over UAV Networks

arXiv.org Artificial Intelligence

This paper investigates federated multimodal learning (FML) assisted by unmanned aerial vehicles (UAVs) with a focus on minimizing system latency and providing convergence analysis. In this framework, UAVs are distributed throughout the network to collect data, participate in model training, and collaborate with a base station (BS) to build a global model. By utilizing multimodal sensing, the UAVs overcome the limitations of unimodal systems, improving model accuracy and generalization and offering a more comprehensive understanding of the environment. The primary objective is to optimize FML system latency in UAV networks by jointly addressing UAV sensing scheduling, power control, trajectory planning, resource allocation, and BS resource management. To address the computational complexity of our latency minimization problem, we propose an efficient iterative optimization algorithm combining block coordinate descent and successive convex approximation techniques, which provides high-quality approximate solutions. We also present a theoretical convergence analysis for the UAV-assisted FML framework under a non-convex loss function. Numerical experiments demonstrate that our FML framework outperforms existing approaches in terms of system latency and model training performance under different data settings.


Representational Alignment Across Model Layers and Brain Regions with Hierarchical Optimal Transport

arXiv.org Artificial Intelligence

Standard representational similarity methods align each layer of a network to its best match in another independently, producing asymmetric results, lacking a global alignment score, and struggling with networks of different depths. These limitations arise from ignoring global activation structure and restricting mappings to rigid one-to-one layer correspondences. We propose Hierarchical Optimal Transport (HOT), a unified framework that jointly infers soft, globally consistent layer-to-layer couplings and neuron-level transport plans. HOT allows source neurons to distribute mass across multiple target layers while minimizing total transport cost under marginal constraints. This yields both a single alignment score for the entire network comparison and a soft transport plan that naturally handles depth mismatches through mass distribution. We evaluate HOT on vision models, large language models, and human visual cortex recordings. Across all domains, HOT matches or surpasses standard pairwise matching in alignment quality. Moreover, it reveals smooth, fine-grained hierarchical correspondences: early layers map to early layers, deeper layers maintain relative positions, and depth mismatches are resolved by distributing representations across multiple layers. These structured patterns emerge naturally from global optimization without being imposed, yet are absent in greedy layer-wise methods. HOT thus enables richer, more interpretable comparisons between representations, particularly when networks differ in architecture or depth.
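The notion of a soft transport plan that spreads mass across several target layers under marginal constraints can be illustrated with plain entropic optimal transport solved by Sinkhorn iterations. This is a standard, single-level technique swapped in for illustration; it does not reproduce HOT's hierarchical formulation. The cost matrix (distance between normalized layer depths), the regularization strength, and all names are assumptions.

```python
import math

def sinkhorn(cost, a, b, reg=0.1, iters=200):
    """Entropic optimal transport between histograms a and b.

    Returns a plan P whose row sums approximate a and whose column
    sums match b, favoring low-cost entries.
    """
    n, m = len(a), len(b)
    kernel = [[math.exp(-cost[i][j] / reg) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(iters):
        # Alternately rescale rows and columns to match the marginals.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]

# Hypothetical depth mismatch: 2 source layers, 3 target layers.
# Cost is the gap between normalized depths (0, 1) and (0, 0.5, 1).
cost = [[0.0, 0.5, 1.0],
        [1.0, 0.5, 0.0]]
plan = sinkhorn(cost, a=[0.5, 0.5], b=[1 / 3, 1 / 3, 1 / 3])
```

In this toy case the middle target layer receives mass from both source layers, while the first and last target layers are served almost entirely by the source layer at the matching depth.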


PASTA: A Unified Framework for Offline Assortment Learning

arXiv.org Artificial Intelligence

We study a broad class of assortment optimization problems in an offline and data-driven setting. In such problems, a firm lacks prior knowledge of the underlying choice model and aims to determine an optimal assortment based on historical customer choice data. The combinatorial nature of assortment optimization often results in insufficient data coverage, posing a significant challenge in designing provably effective solutions. To address this, we introduce a novel Pessimistic Assortment Optimization (PASTA) framework that leverages the principle of pessimism to achieve optimal expected revenue under general choice models. Notably, PASTA requires only that the offline data distribution covers an optimal assortment, rather than all feasible assortments. Theoretically, we establish the first finite-sample regret bounds for offline assortment optimization across several widely used choice models, including the multinomial logit and nested logit models. Additionally, we derive a minimax regret lower bound, proving that PASTA is minimax optimal in terms of sample and model complexity. Numerical experiments further demonstrate that our method outperforms existing baseline approaches.


The Unseen Frontier: Pushing the Limits of LLM Sparsity with Surrogate-Free ADMM

arXiv.org Artificial Intelligence

Neural network pruning is a promising technique to mitigate the excessive computational and memory requirements of large language models (LLMs). Despite its promise, however, progress in this area has diminished, as conventional methods are seemingly unable to surpass moderate sparsity levels (50-60%) without severely degrading model accuracy. This work breaks through the current impasse, presenting a principled and effective method called $\texttt{Elsa}$, which achieves extreme sparsity levels of up to 90% while retaining high model fidelity. This is done by identifying several limitations in current practice, all of which can be traced back to their reliance on a surrogate objective formulation. $\texttt{Elsa}$ tackles this issue directly and effectively via standard and well-established constrained optimization techniques based on ADMM. Our extensive experiments across a wide range of models and scales show that $\texttt{Elsa}$ achieves substantial improvements over existing methods; e.g., it achieves 7.8$\times$ lower perplexity than the best existing method on LLaMA-2-7B at 90% sparsity. Furthermore, we present $\texttt{Elsa}_{\text{-L}}$, a quantized variant that scales to extremely large models (27B), and establish its theoretical convergence guarantees. These results highlight meaningful progress in advancing the frontier of LLM sparsity, while suggesting that significant opportunities for further advancement may remain in directions that have so far attracted limited exploration.
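For readers unfamiliar with how ADMM handles a sparsity constraint, the sketch below applies it to a deliberately tiny problem: minimize $\frac{1}{2}\sum_i h_i (w_i - w^0_i)^2$ subject to at most $k$ nonzero weights. It is not $\texttt{Elsa}$; the diagonal quadratic objective, the penalty parameter, and every name here are assumptions chosen so that each step has a closed form.

```python
def project_top_k(v, k):
    """Keep the k largest-magnitude entries of v and zero out the rest."""
    order = sorted(range(len(v)), key=lambda i: abs(v[i]), reverse=True)
    keep = set(order[:k])
    return [v[i] if i in keep else 0.0 for i in range(len(v))]

def admm_prune(w0, h, k, rho=1.0, iters=100):
    """ADMM for min 0.5 * sum(h_i * (w_i - w0_i)^2) s.t. z = w, ||z||_0 <= k."""
    n = len(w0)
    z = [0.0] * n  # sparse copy of the weights
    u = [0.0] * n  # scaled dual variable
    for _ in range(iters):
        # w-update: closed-form minimizer of objective plus proximity term.
        w = [(h[i] * w0[i] + rho * (z[i] - u[i])) / (h[i] + rho) for i in range(n)]
        # z-update: Euclidean projection onto the sparsity constraint.
        z = project_top_k([w[i] + u[i] for i in range(n)], k)
        # Dual update: accumulate the disagreement between w and z.
        u = [u[i] + w[i] - z[i] for i in range(n)]
    return z

# Hypothetical dense weights with unit curvature; keep 2 of 4 (50% sparsity).
sparse = admm_prune(w0=[3.0, -0.5, 2.0, 0.1], h=[1.0, 1.0, 1.0, 1.0], k=2)
```

The constraint is enforced directly through the projection step rather than through a relaxed penalty; on this toy input the iterates settle on keeping the two largest weights.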