Optimization
Learning with Positive and Imperfect Unlabeled Data
Lee, Jane H., Mehrotra, Anay, Zampetakis, Manolis
We study the problem of learning binary classifiers from positive and unlabeled data when the unlabeled data distribution is shifted, which we call Positive and Imperfect Unlabeled (PIU) Learning. In the absence of covariate shifts, i.e., with perfect unlabeled data, Denis (1998) reduced this problem to learning under Massart noise; however, that reduction fails under even slight shifts. Our main results on PIU learning are the characterizations of the sample complexity of PIU learning and a computationally and sample-efficient algorithm achieving a misclassification error $\varepsilon$. We further show that our results lead to new algorithms for several related problems. 1. Learning from smooth distributions: We give algorithms that learn interesting concept classes from only positive samples under smooth feature distributions, bypassing known existing impossibility results and contributing to recent advances in smoothened learning (Haghtalab et al, J.ACM'24) (Chandrasekaran et al., COLT'24). 2. Learning with a list of unlabeled distributions: We design new algorithms that apply to a broad class of concept classes under the assumption that we are given a list of unlabeled distributions, one of which--unknown to the learner--is $O(1)$-close to the true feature distribution. 3. Estimation in the presence of unknown truncation: We give the first polynomial sample and time algorithm for estimating the parameters of an exponential family distribution from samples truncated to an unknown set approximable by polynomials in $L_1$-norm. This improves the algorithm by Lee et al. (FOCS'24) that requires approximation in $L_2$-norm. 4. Detecting truncation: We present new algorithms for detecting whether given samples have been truncated (or not) for a broad class of non-product distributions, including non-product distributions, improving the algorithm by De et al. (STOC'24).
Towards Weaker Variance Assumptions for Stochastic Optimization
Alacaoglu, Ahmet, Malitsky, Yura, Wright, Stephen J.
We revisit a classical assumption for analyzing stochastic gradient algorithms where the squared norm of the stochastic subgradient (or the variance for smooth problems) is allowed to grow as fast as the squared norm of the optimization variable. We contextualize this assumption in view of its inception in the 1960s, its seemingly independent appearance in the recent literature, its relationship to weakest-known variance assumptions for analyzing stochastic gradient algorithms, and its relevance in deterministic problems for non-Lipschitz nonsmooth convex optimization. We build on and extend a connection recently made between this assumption and the Halpern iteration. For convex nonsmooth, and potentially stochastic, optimization, we analyze horizon-free, anytime algorithms with last-iterate rates. For problems beyond simple constrained optimization, such as convex problems with functional constraints or regularized convex-concave min-max problems, we obtain rates for optimality measures that do not require boundedness of the feasible set.
Regularized infill criteria for multi-objective Bayesian optimization with application to aircraft design
Grapin, Robin, Diouane, Youssef, Morlier, Joseph, Bartoli, Nathalie, Lefebvre, Thierry, Saves, Paul, Bussemaker, Jasper
Bayesian optimization is an advanced tool to perform ecient global optimization It consists on enriching iteratively surrogate Kriging models of the objective and the constraints both supposed to be computationally expensive of the targeted optimization problem Nowadays efficient extensions of Bayesian optimization to solve expensive multiobjective problems are of high interest The proposed method in this paper extends the super efficient global optimization with mixture of experts SEGOMOE to solve constrained multiobjective problems To cope with the illposedness of the multiobjective inll criteria different enrichment procedures using regularization techniques are proposed The merit of the proposed approaches are shown on known multiobjective benchmark problems with and without constraints The proposed methods are then used to solve a biobjective application related to conceptual aircraft design with ve unknown design variables and three nonlinear inequality constraints The preliminary results show a reduction of the total cost in terms of function evaluations by a factor of 20 compared to the evolutionary algorithm NSGA-II.
Explainability and Continual Learning meet Federated Learning at the Network Edge
Tsouparopoulos, Thomas, Koutsopoulos, Iordanis
Explainability and Continual Learning meet Federated Learning at the Network Edge Thomas Tsouparopoulos and Iordanis Koutsopoulos Department of Informatics, Athens University of Economics and Business Athens, Greece (Invited paper) Abstract --As edge devices become more capable and pervasive in wireless networks, there is growing interest in leveraging their collective compute power for distributed learning. However, optimizing learning at the network edge entails unique challenges, particularly when moving beyond conventional settings and objectives. While Federated Learning (FL) has emerged as a key paradigm for distributed model training, critical challenges persist. First, existing approaches often overlook the tradeoff between predictive accuracy and interpretability. Second, they struggle to integrate inherently explainable models such as decision trees because their non-differentiable structure makes them not amenable to backpropagation-based training algorithms. Lastly, they lack meaningful mechanisms for continual Machine Learning (ML) model adaptation through Continual Learning (CL) in resource-limited environments. In this paper, we pave the way for a set of novel optimization problems that emerge in distributed learning at the network edge with wirelessly interconnected edge devices, and we identify key challenges and future directions. Specifically, we discuss how Multi-objective optimization (MOO) can be used to address the trade-off between predictive accuracy and explainability when using complex predictive models. Next, we discuss the implications of integrating inherently explainable tree-based models into distributed learning settings. Finally, we investigate how CL strategies can be effectively combined with FL to support adaptive, lifelong learning when limited-size buffers are used to store past data for retraining.
Accelerating Multi-Objective Collaborative Optimization of Doped Thermoelectric Materials via Artificial Intelligence
Zeng, Yuxuan, Xie, Wenhao, Cao, Wei, Peng, Tan, Hou, Yue, Wang, Ziyu, Shi, Jing
The thermoelectric performance of materials exhibits complex nonlinear dependencies on both elemental types and their proportions, rendering traditional trial-and-error approaches inefficient and time-consuming for material discovery. In this work, we present a deep learning model capable of accurately predicting thermoelectric properties of doped materials directly from their chemical formulas, achieving state-of-the-art performance. To enhance interpretability, we further incorporate sensitivity analysis techniques to elucidate how physical descriptors affect the thermoelectric figure of merit (zT). Moreover, we establish a coupled framework that integrates a surrogate model with a multi-objective genetic algorithm to efficiently explore the vast compositional space for high-performance candidates. Experimental validation confirms the discovery of a novel thermoelectric material with superior $zT$ values in the medium-temperature regime.
MacPilot is the magic wand for your Mac's hidden features
Your Mac is a marvel of modern engineering, offering a seamless blend of power and elegance. Yet, beneath its polished exterior lies a treasure trove of features that remain just out of reach for the average user. The MacPilot tool is designed to bridge this gap, granting you access to over 1,200 hidden functionalities without the need to delve into complex command lines or system files. Imagine being able to customize your Mac's Dock by adding spacers and smart stacks or choosing to display hidden files in Finder with ease. Perhaps you've wished to silence the startup chime for a more discreet boot-up or change the default format of your screenshots to better suit your workflow.
Surrogate-based optimization of system architectures subject to hidden constraints
Bussemaker, Jasper, Saves, Paul, Bartoli, Nathalie, Lefebvre, Thierry, Nagel, Bjรถrn
The exploration of novel architectures requires physics-based simulation due to a lack of prior experience to start from, which introduces two specific challenges for optimization algorithms: evaluations become more expensive (in time) and evaluations might fail. The former challenge is addressed by Surrogate-Based Optimization (SBO) algorithms, in particular Bayesian Optimization (BO) using Gaussian Process (GP) models. An overview is provided of how BO can deal with challenges specific to architecture optimization, such as design variable hierarchy and multiple objectives: specific measures include ensemble infills and a hierarchical sampling algorithm. Evaluations might fail due to non-convergence of underlying solvers or infeasible geometry in certain areas of the design space. Such failed evaluations, also known as hidden constraints, pose a particular challenge to SBO/BO, as the surrogate model cannot be trained on empty results. This work investigates various strategies for satisfying hidden constraints in BO algorithms. Three high-level strategies are identified: rejection of failed points from the training set, replacing failed points based on viable (non-failed) points, and predicting the failure region. Through investigations on a set of test problems including a jet engine architecture optimization problem, it is shown that best performance is achieved with a mixed-discrete GP to predict the Probability of Viability (PoV), and by ensuring selected infill points satisfy some minimum PoV threshold. This strategy is demonstrated by solving a jet engine architecture problem that features at 50% failure rate and could not previously be solved by a BO algorithm. The developed BO algorithm and used test problems are available in the open-source Python library SBArchOpt.
Bayesian optimization for mixed variables using an adaptive dimension reduction process: applications to aircraft design
Saves, Paul, Bartoli, Nathalie, Diouane, Youssef, Lefebvre, Thierry, Morlier, Joseph, David, Christophe, Van, Eric Nguyen, Defoort, Sรฉbastien
Multidisciplinary design optimization methods aim at adapting numerical optimization techniques to the design of engineering systems involving multiple disciplines. In this context, a large number of mixed continuous, integer and categorical variables might arise during the optimization process and practical applications involve a large number of design variables. Recently, there has been a growing interest in mixed variables constrained Bayesian optimization but most existing approaches severely increase the number of the hyperparameters related to the surrogate model. In this paper, we address this issue by constructing surrogate models using less hyperparameters. The reduction process is based on the partial least squares method. An adaptive procedure for choosing the number of hyperparameters is proposed. The performance of the proposed approach is confirmed on analytical tests as well as two real applications related to aircraft design. A significant improvement is obtained compared to genetic algorithms.
An Incremental Non-Linear Manifold Approximation Method
Hettige, Praveen T. W., Ong, Benjamin W.
Analyzing high-dimensional data presents challenges due to the "curse of dimensionality'', making computations intensive. Dimension reduction techniques, categorized as linear or non-linear, simplify such data. Non-linear methods are particularly essential for efficiently visualizing and processing complex data structures in interactive and graphical applications. This research develops an incremental non-linear dimension reduction method using the Geometric Multi-Resolution Analysis (GMRA) framework for streaming data. The proposed method enables real-time data analysis and visualization by incrementally updating the cluster map, PCA basis vectors, and wavelet coefficients. Numerical experiments show that the incremental GMRA accurately represents non-linear manifolds even with small initial samples and aligns closely with batch GMRA, demonstrating efficient updates and maintaining the multiscale structure. The findings highlight the potential of Incremental GMRA for real-time visualization and interactive graphics applications that require adaptive high-dimensional data representations.
Riemannian Optimization on Relaxed Indicator Matrix Manifold
Yuan, Jinghui, Xie, Fangyuan, Nie, Feiping, Li, Xuelong
The indicator matrix plays an important role in machine learning, but optimizing it is an NP-hard problem. We propose a new relaxation of the indicator matrix and prove that this relaxation forms a manifold, which we call the Relaxed Indicator Matrix Manifold (RIM manifold). Based on Riemannian geometry, we develop a Riemannian toolbox for optimization on the RIM manifold. Specifically, we provide several methods of Retraction, including a fast Retraction method to obtain geodesics. We point out that the RIM manifold is a generalization of the double stochastic manifold, and it is much faster than existing methods on the double stochastic manifold, which has a complexity of \( \mathcal{O}(n^3) \), while RIM manifold optimization is \( \mathcal{O}(n) \) and often yields better results. We conducted extensive experiments, including image denoising, with millions of variables to support our conclusion, and applied the RIM manifold to Ratio Cut, we provide a rigorous convergence proof and achieve clustering results that outperform the state-of-the-art methods. Our Code in \href{https://github.com/Yuan-Jinghui/Riemannian-Optimization-on-Relaxed-Indicator-Matrix-Manifold}{here}.