Goto

Collaborating Authors

 Optimization


Understanding and Exploiting Plasticity for Non-stationary Network Resource Adaptation

arXiv.org Artificial Intelligence

Adapting to non-stationary network conditions presents significant challenges for resource adaptation. However, current solutions primarily rely on stationary assumptions. While data-driven reinforcement learning approaches offer promising solutions for handling network dynamics, our systematic investigation reveals a critical limitation: neural networks suffer from plasticity loss, significantly impeding their ability to adapt to evolving network conditions. Through theoretical analysis of neural propagation mechanisms, we demonstrate that existing dormant neuron metrics inadequately characterize neural plasticity loss. To address this limitation, we have developed the Silent Neuron theory, which provides a more comprehensive framework for understanding plasticity degradation. Based on these theoretical insights, we propose the Reset Silent Neuron (ReSiN), which preserves neural plasticity through strategic neuron resets guided by both forward and backward propagation states. In our implementation of an adaptive video streaming system, ReSiN has shown significant improvements over existing solutions, achieving up to 168% higher bitrate and 108% better quality of experience (QoE) while maintaining comparable smoothness. Furthermore, ReSiN consistently outperforms in stationary environments, demonstrating its robust adaptability across different network conditions.


Artificial Protozoa Optimizer (APO): A novel bio-inspired metaheuristic algorithm for engineering optimization

arXiv.org Artificial Intelligence

This study proposes a novel artificial protozoa optimizer (APO) that is inspired by protozoa in nature. The APO mimics the survival mechanisms of protozoa by simulating their foraging, dormancy, and reproductive behaviors. The APO was mathematically modeled and implemented to perform the optimization processes of metaheuristic algorithms. The performance of the APO was verified via experimental simulations and compared with 32 state-of-the-art algorithms. Wilcoxon signed-rank test was performed for pairwise comparisons of the proposed APO with the state-of-the-art algorithms, and Friedman test was used for multiple comparisons. First, the APO was tested using 12 functions of the 2022 IEEE Congress on Evolutionary Computation benchmark. Considering practicality, the proposed APO was used to solve five popular engineering design problems in a continuous space with constraints. Moreover, the APO was applied to solve a multilevel image segmentation task in a discrete space with constraints. The experiments confirmed that the APO could provide highly competitive results for optimization problems. The source codes of Artificial Protozoa Optimizer are publicly available at https://seyedalimirjalili.com/projects and https://ww2.mathworks.cn/matlabcentral/fileexchange/162656-artificial-protozoa-optimizer.


Nonnegative Low-rank Matrix Recovery Can Have Spurious Local Minima

arXiv.org Machine Learning

The classical low-rank matrix recovery problem is well-known to exhibit \emph{benign nonconvexity} under the restricted isometry property (RIP): local optimization is guaranteed to converge to the global optimum, where the ground truth is recovered. We investigate whether benign nonconvexity continues to hold when the factor matrices are constrained to be elementwise nonnegative -- a common practical requirement. In the simple setting of a rank-1 nonnegative ground truth, we confirm that benign nonconvexity holds in the fully-observed case with RIP constant $ฮด=0$. Surprisingly, however, this property fails to extend to the partially-observed case with any arbitrarily small RIP constant $ฮด\to0^{+}$, irrespective of rank overparameterization. This finding exposes a critical theoretical gap: the continuity argument widely used to explain the empirical robustness of low-rank matrix recovery fundamentally breaks down once nonnegative constraints are imposed.


GeoERM: Geometry-Aware Multi-Task Representation Learning on Riemannian Manifolds

arXiv.org Machine Learning

Multi-Task Learning (MTL) seeks to boost statistical power and learning efficiency by discovering structure shared across related tasks. State-of-the-art MTL representation methods, however, usually treat the latent representation matrix as a point in ordinary Euclidean space, ignoring its often non-Euclidean geometry, thus sacrificing robustness when tasks are heterogeneous or even adversarial. We propose GeoERM, a geometry-aware MTL framework that embeds the shared representation on its natural Riemannian manifold and optimizes it via explicit manifold operations. Each training cycle performs (i) a Riemannian gradient step that respects the intrinsic curvature of the search space, followed by (ii) an efficient polar retraction to remain on the manifold, guaranteeing geometric fidelity at every iteration. The procedure applies to a broad class of matrix-factorized MTL models and retains the same per-iteration cost as Euclidean baselines. Across a set of synthetic experiments with task heterogeneity and on a wearable-sensor activity-recognition benchmark, GeoERM consistently improves estimation accuracy, reduces negative transfer, and remains stable under adversarial label noise, outperforming leading MTL and single-task alternatives.


Prediction Models That Learn to Avoid Missing Values

arXiv.org Artificial Intelligence

Handling missing values at test time is challenging for machine learning models, especially when aiming for both high accuracy and interpretability. Established approaches often add bias through imputation or excessive model complexity via missingness indicators. Moreover, either method can obscure interpretability, making it harder to understand how the model utilizes the observed variables in predictions. We propose missingness-avoiding (MA) machine learning, a general framework for training models to rarely require the values of missing (or imputed) features at test time. We create tailored MA learning algorithms for decision trees, tree ensembles, and sparse linear models by incorporating classifier-specific regularization terms in their learning objectives. The tree-based models leverage contextual missingness by reducing reliance on missing values based on the observed context. Experiments on real-world datasets demonstrate that MA-DT, MA-LASSO, MA-RF, and MA-GBT effectively reduce the reliance on features with missing values while maintaining predictive performance competitive with their unregularized counterparts. This shows that our framework gives practitioners a powerful tool to maintain interpretability in predictions with test-time missing values.


A Real-Time Control Barrier Function-Based Safety Filter for Motion Planning with Arbitrary Road Boundary Constraints

arXiv.org Artificial Intelligence

We present a real-time safety filter for motion planning, such as learning-based methods, using Control Barrier Functions (CBFs), which provides formal guarantees for collision avoidance with road boundaries. A key feature of our approach is its ability to directly incorporate road geometries of arbitrary shape without resorting to conservative overapproximations. We formulate the safety filter as a constrained optimization problem in the form of a Quadratic Program (QP). It achieves safety by making minimal, necessary adjustments to the control actions issued by the nominal motion planner. We validate our safety filter through extensive numerical experiments across a variety of traffic scenarios featuring complex roads. The results confirm its reliable safety and high computational efficiency (execution frequency up to 40 Hz). Code & Video Demo: github.com/bassamlab/SigmaRL


Smooth Integer Encoding via Integral Balance

arXiv.org Artificial Intelligence

ORCID: 0000-0002-5891-8119 April 28, 2025 Abstract We introduce a novel method for encoding integers using smooth real-valued functions whose integral properties implicitly reflect discrete quantities. In contrast to classical representations, where the integer appears as an explicit parameter, our approach encodes the number N N through the cumulative balance of a smooth function f N(t), constructed from localized Gaussian bumps with alternating and decaying coefficients. The total integral I ( N) converges to zero as N, and the integer can be recovered as the minimal point of near-cancellation. This method enables continuous and differentiable representations of discrete states, supports recovery through spline-based or analytical inversion, and extends naturally to multidimensional tuples ( N 1, N 2, .. . We analyze the structure and convergence of the encoding series, demonstrate numerical construction of the integral map I (N), and develop procedures for integer recovery via numerical inversion. The resulting framework opens a path toward embedding discrete logic within continuous optimization pipelines, machine learning architectures, and smooth symbolic computation. Numerical Analysis 1 Introduction Representing discrete quantities such as integers within continuous mathematical frameworks is a central challenge in optimization, numerical analysis, and machine learning. Traditional symbolic representations and modern soft relaxation techniques both face fundamental limitations: the former lack differentiability, while the latter introduce approximation errors and auxiliary complexities. In this work, we propose a novel method for encoding integers through smooth real-valued functions whose integral properties implicitly reflect discrete quantities.


Pickup & Delivery with Time Windows and Transfers: combining decomposition with metaheuristics

arXiv.org Artificial Intelligence

This paper examines the generalisation of the Pickup and Delivery Problem that allows mid-route load exchanges among vehicles and obeys strict time-windows at all locations. We propose a novel Logic-Based Benders Decomposition (LBBD) that improves optimality gaps for all benchmarks in the literature and scales up to handle larger ones. To tackle even larger instances, we introduce a refined Large Neighborhood Search (LNS) algorithm that improves the adaptability of LNS beyond case-specific configurations appearing in related literature. To bridge the gap in benchmark availability, we develop an instance generator that allows for extensive experimentation. For moderate datasets (25 and 50 requests), we evaluate the performance of both LBBD and LNS, the former being able to close the gap and the latter capable of providing near-optimal solutions. For larger instances (75 and 100 requests), we recreate indicative state-of-the-art metaheuristics to highlight the improvements introduced by our LNS refinements, while establishing its scalability.


Efficient Curvature-Aware Hypergradient Approximation for Bilevel Optimization

arXiv.org Artificial Intelligence

Bilevel optimization is a powerful tool for many machine learning problems, such as hyperparameter optimization and meta-learning. Estimating hypergradients (also known as implicit gradients) is crucial for developing gradient-based methods for bilevel optimization. In this work, we propose a computationally efficient technique for incorporating curvature information into the approximation of hypergradients and present a novel algorithmic framework based on the resulting enhanced hypergradient computation. We provide convergence rate guarantees for the proposed framework in both deterministic and stochastic scenarios, particularly showing improved computational complexity over popular gradient-based methods in the deterministic setting. This improvement in complexity arises from a careful exploitation of the hypergradient structure and the inexact Newton method. In addition to the theoretical speedup, numerical experiments demonstrate the significant practical performance benefits of incorporating curvature information.


Multi-Scale Graph Learning for Anti-Sparse Downscaling

arXiv.org Artificial Intelligence

Water temperature can vary substantially even across short distances within the same sub-watershed. Accurate prediction of stream water temperature at fine spatial resolutions (i.e., fine scales, $\leq$ 1 km) enables precise interventions to maintain water quality and protect aquatic habitats. Although spatiotemporal models have made substantial progress in spatially coarse time series modeling, challenges persist in predicting at fine spatial scales due to the lack of data at that scale.To address the problem of insufficient fine-scale data, we propose a Multi-Scale Graph Learning (MSGL) method. This method employs a multi-task learning framework where coarse-scale graph learning, bolstered by larger datasets, simultaneously enhances fine-scale graph learning. Although existing multi-scale or multi-resolution methods integrate data from different spatial scales, they often overlook the spatial correspondences across graph structures at various scales. To address this, our MSGL introduces an additional learning task, cross-scale interpolation learning, which leverages the hydrological connectedness of stream locations across coarse- and fine-scale graphs to establish cross-scale connections, thereby enhancing overall model performance. Furthermore, we have broken free from the mindset that multi-scale learning is limited to synchronous training by proposing an Asynchronous Multi-Scale Graph Learning method (ASYNC-MSGL). Extensive experiments demonstrate the state-of-the-art performance of our method for anti-sparse downscaling of daily stream temperatures in the Delaware River Basin, USA, highlighting its potential utility for water resources monitoring and management.