Goto

Collaborating Authors

 Search


Adaptive Stabilization Based on Machine Learning for Column Generation

arXiv.org Artificial Intelligence

Column generation (CG) is a well-established method for solving large-scale linear programs. It involves iteratively optimizing a subproblem containing a subset of columns and using its dual solution to generate new columns with negative reduced costs. This process continues until the dual values converge to the optimal dual solution to the original problem. A natural phenomenon in CG is the heavy oscillation of the dual values during iterations, which can lead to a substantial slowdown in the convergence rate. Stabilization techniques are devised to accelerate the convergence of dual values by using information beyond the state of the current subproblem. However, there remains a significant gap in obtaining more accurate dual values at an earlier stage. To further narrow this gap, this paper introduces a novel approach consisting of 1) a machine learning approach for accurate prediction of optimal dual solutions and 2) an adaptive stabilization technique that effectively capitalizes on accurate predictions. On the graph coloring problem, we show that our method achieves a significantly improved convergence rate compared to traditional methods.


Preparing for Black Swans: The Antifragility Imperative for Machine Learning

arXiv.org Artificial Intelligence

Operating safely and reliably despite continual distribution shifts is vital for high-stakes machine learning applications. This paper builds upon the transformative concept of ``antifragility'' introduced by (Taleb, 2014) as a constructive design paradigm to not just withstand but benefit from volatility. We formally define antifragility in the context of online decision making as dynamic regret's strictly concave response to environmental variability, revealing limitations of current approaches focused on resisting rather than benefiting from nonstationarity. Our contribution lies in proposing potential computational pathways for engineering antifragility, grounding the concept in online learning theory and drawing connections to recent advancements in areas such as meta-learning, safe exploration, continual learning, multi-objective/quality-diversity optimization, and foundation models. By identifying promising mechanisms and future research directions, we aim to put antifragility on a rigorous theoretical foundation in machine learning. We further emphasize the need for clear guidelines, risk assessment frameworks, and interdisciplinary collaboration to ensure responsible application.


Efficient Line Search Method Based on Regression and Uncertainty Quantification

arXiv.org Artificial Intelligence

Unconstrained optimization problems are typically solved using iterative methods, which often depend on line search techniques to determine optimal step lengths in each iteration. This paper introduces a novel line search approach. Traditional line search methods, aimed at determining optimal step lengths, often discard valuable data from the search process and focus on refining step length intervals. This paper proposes a more efficient method using Bayesian optimization, which utilizes all available data points, i.e., function values and gradients, to guide the search towards a potential global minimum. This new approach more effectively explores the search space, leading to better solution quality. It is also easy to implement and integrate into existing frameworks. Tested on the challenging CUTEst test set, it demonstrates superior performance compared to existing state-of-the-art methods, solving more problems to optimality with equivalent resource usage.


Uncertainty Distribution Assessment of Jiles-Atherton Parameter Estimation for Inrush Current Studies

arXiv.org Artificial Intelligence

Transformers are one of the key assets in AC distribution grids and renewable power integration. During transformer energization inrush currents appear, which lead to transformer degradation and can cause grid instability events. These inrush currents are a consequence of the transformer's magnetic core saturation during its connection to the grid. Transformer cores are normally modelled by the Jiles-Atherton (JA) model which contains five parameters. These parameters can be estimated by metaheuristic-based search algorithms. The parameter initialization of these algorithms plays an important role in the algorithm convergence. The most popular strategy used for JA parameter initialization is a random uniform distribution. However, techniques such as parameter initialization by Probability Density Functions (PDFs) have shown to improve accuracy over random methods. In this context, this research work presents a framework to assess the impact of different parameter initialization strategies on the performance of the JA parameter estimation for inrush current studies. Depending on available data and expert knowledge, uncertainty levels are modelled with different PDFs. Moreover, three different metaheuristic-search algorithms are employed on two different core materials and their accuracy and computational time are compared. Results show an improvement in the accuracy and computational time of the metaheuristic-based algorithms when PDF parameter initialization is used.


Moreau Envelope for Nonconvex Bi-Level Optimization: A Single-loop and Hessian-free Solution Strategy

arXiv.org Artificial Intelligence

This work focuses on addressing two major challenges in the context of large-scale nonconvex Bi-Level Optimization (BLO) problems, which are increasingly applied in machine learning due to their ability to model nested structures. These challenges involve ensuring computational efficiency and providing theoretical guarantees. While recent advances in scalable BLO algorithms have primarily relied on lower-level convexity simplification, our work specifically tackles large-scale BLO problems involving nonconvexity in both the upper and lower levels. We simultaneously address computational and theoretical challenges by introducing an innovative single-loop gradient-based algorithm, utilizing the Moreau envelope-based reformulation, and providing non-asymptotic convergence analysis for general nonconvex BLO problems. Notably, our algorithm relies solely on first-order gradient information, enhancing its practicality and efficiency, especially for large-scale BLO learning tasks. We validate our approach's effectiveness through experiments on various synthetic problems, two typical hyper-parameter learning tasks, and a real-world neural architecture search application, collectively demonstrating its superior performance.


SEEK: Semantic Reasoning for Object Goal Navigation in Real World Inspection Tasks

arXiv.org Artificial Intelligence

This paper addresses the problem of object-goal navigation in autonomous inspections in real-world environments. Object-goal navigation is crucial to enable effective inspections in various settings, often requiring the robot to identify the target object within a large search space. Current object inspection methods fall short of human efficiency because they typically cannot bootstrap prior and common sense knowledge as humans do. In this paper, we introduce a framework that enables robots to use semantic knowledge from prior spatial configurations of the environment and semantic common sense knowledge. We propose SEEK (Semantic Reasoning for Object Inspection Tasks) that combines semantic prior knowledge with the robot's observations to search for and navigate toward target objects more efficiently. SEEK maintains two representations: a Dynamic Scene Graph (DSG) and a Relational Semantic Network (RSN). The RSN is a compact and practical model that estimates the probability of finding the target object across spatial elements in the DSG. We propose a novel probabilistic planning framework to search for the object using relational semantic knowledge. Our simulation analyses demonstrate that SEEK outperforms the classical planning and Large Language Models (LLMs)-based methods that are examined in this study in terms of efficiency for object-goal inspection tasks. We validated our approach on a physical legged robot in urban environments, showcasing its practicality and effectiveness in real-world inspection scenarios.


Navigating Public Sentiment in the Circular Economy through Topic Modelling and Hyperparameter Optimisation

arXiv.org Artificial Intelligence

To advance the circular economy (CE), it is crucial to gain insights into the evolution of public sentiments, cognitive pathways of the masses concerning circular products and digital technology, and recognise the primary concerns. To achieve this, we collected data related to the CE from diverse platforms including Twitter, Reddit, and The Guardian. This comprehensive data collection spanned across three distinct strata of the public: the general public, professionals, and official sources. Subsequently, we utilised three topic models on the collected data. Topic modelling represents a type of data-driven and machine learning approach for text mining, capable of automatically categorising a large number of documents into distinct semantic groups. Simultaneously, these groups are described by topics, and these topics can aid in understanding the semantic content of documents at a high level. However, the performance of topic modelling may vary depending on different hyperparameter values. Therefore, in this study, we proposed a framework for topic modelling with hyperparameter optimisation for CE and conducted a series of systematic experiments to ensure that topic models are set with appropriate hyperparameters and to gain insights into the correlations between the CE and public opinion based on well-established models. The results of this study indicate that concerns about sustainability and economic impact persist across all three datasets. Official sources demonstrate a higher level of engagement with the application and regulation of CE. To the best of our knowledge, this study is pioneering in investigating various levels of public opinions concerning CE through topic modelling with the exploration of hyperparameter optimisation.


Generative Design through Quality-Diversity Data Synthesis and Language Models

arXiv.org Artificial Intelligence

Two fundamental challenges face generative models in engineering applications: the acquisition of high-performing, diverse datasets, and the adherence to precise constraints in generated designs. We propose a novel approach combining optimization, constraint satisfaction, and language models to tackle these challenges in architectural design. Our method uses Quality-Diversity (QD) to generate a diverse, high-performing dataset. We then fine-tune a language model with this dataset to generate high-level designs. These designs are then refined into detailed, constraint-compliant layouts using the Wave Function Collapse algorithm. Our system demonstrates reliable adherence to textual guidance, enabling the generation of layouts with targeted architectural and performance features. Crucially, our results indicate that data synthesized through the evolutionary search of QD not only improves overall model performance but is essential for the model's ability to closely adhere to textual guidance. This improvement underscores the pivotal role evolutionary computation can play in creating the datasets key to training generative models for design. Web article at https://tilegpt.github.io


EiG-Search: Generating Edge-Induced Subgraphs for GNN Explanation in Linear Time

arXiv.org Artificial Intelligence

Understanding and explaining the predictions of Graph Neural Networks (GNNs), is crucial for enhancing their safety and trustworthiness. Subgraph-level explanations are gaining attention for their intuitive appeal. However, most existing subgraph-level explainers face efficiency challenges in explaining GNNs due to complex search processes. The key challenge is to find a balance between intuitiveness and efficiency while ensuring transparency. Additionally, these explainers usually induce subgraphs by nodes, which may introduce less-intuitive disconnected nodes in the subgraph-level explanations or omit many important subgraph structures. In this paper, we reveal that inducing subgraph explanations by edges is more comprehensive than other subgraph inducing techniques. We also emphasize the need of determining the subgraph explanation size for each data instance, as different data instances may involve different important substructures. Building upon these considerations, we introduce a training-free approach, named EiG-Search. We employ an efficient linear-time search algorithm over the edge-induced subgraphs, where the edges are ranked by an enhanced gradient-based importance. We conduct extensive experiments on a total of seven datasets, demonstrating its superior performance and efficiency both quantitatively and qualitatively over the leading baselines.


Optimal Text-Based Time-Series Indices

arXiv.org Artificial Intelligence

This integration is typically done by (i) selecting, (ii) transforming, and (iii) aggregating textual content into a time-series representation (see Ardia et al., 2019; Algaba et al., 2020, for a general overview of these steps). While many studies have focused on steps (ii) and (iii)-- transforming and aggregating textual data into a quantitative measure such as sentiment (see e.g., Loughran and McDonald, 2014; Jegadeesh and Wu, 2013; Manela and Moreira, 2017)--the essential selection step (i), which usually relies on subjective ad-hoc rules, has not received much attention yet. We aim to fill this gap in this article by proposing an approach to construct text-based time-series indices optimally. Specifically, our algorithm determines which set of texts, among a large corpus, leads to a text-based index that is optimal for a specific objective--typically, an index that maximizes the contemporaneous relation or the predictive performance with respect to a target variable, such as inflation. Our methodology relies on binary selection matrices that, applied to the vocabulary of tokens, select the relevant texts in the corpus.