Materials
Use of Boosting Algorithms in Household-Level Poverty Measurement: A Machine Learning Approach to Predict and Classify Household Wealth Quintiles in the Philippines
This study assessed the effectiveness of machine learning models in predicting poverty levels in the Philippines using five boosting algorithms: Adaptive Boosting (AdaBoost), CatBoosting (CatBoost), Gradient Boosting Machine (GBM), Light Gradient Boosting Machine (LightGBM), and Extreme Gradient Boosting (XGBoost). CatBoost emerged as the superior model and achieved the highest scores across accuracy, precision, recall, and F1-score at 91 percent, while XGBoost and GBM followed closely with 89 percent and 88 percent respectively. Additionally, the research examined the computational efficiency of these models to analyze the balance between training time, testing speed, and model size factors crucial for real-world applications. Despite its longer training duration, CatBoost demonstrated high testing efficiency. These results indicate that machine learning can aid in poverty prediction and in the development of targeted policy interventions. Future studies should focus on incorporating a wider variety of data to enhance the predictive accuracy and policy utility of these models.
High-dimensional multidisciplinary design optimization for aircraft eco-design / Optimisation multi-disciplinaire en grande dimension pour l'\'eco-conception avion en avant-projet
The objective of this Philosophiae Doctor (Ph.D) thesis is to propose an efficient approach for optimizing a multidisciplinary black-box model when the optimization problem is constrained and involves a large number of mixed integer design variables (typically 100 variables). The targeted optimization approach, called EGO, is based on a sequential enrichment of an adaptive surrogate model and, in this context, GP surrogate models are one of the most widely used in engineering problems to approximate time-consuming high fidelity models. EGO is a heuristic BO method that performs well in terms of solution quality. However, like any other global optimization method, EGO suffers from the curse of dimensionality, meaning that its performance is satisfactory on lower dimensional problems, but deteriorates as the dimensionality of the optimization search space increases. For realistic aircraft design problems, the typical size of the design variables can even exceed 100 and, thus, trying to solve directly the problems using EGO is ruled out. The latter is especially true when the problems involve both continuous and categorical variables increasing even more the size of the search space. In this Ph.D thesis, effective parameterization tools are investigated, including techniques like partial least squares regression, to significantly reduce the number of design variables. Additionally, Bayesian optimization is adapted to handle discrete variables and high-dimensional spaces in order to reduce the number of evaluations when optimizing innovative aircraft concepts such as the "DRAGON" hybrid airplane to reduce their climate impact.
A Study on Unsupervised Anomaly Detection and Defect Localization using Generative Model in Ultrasonic Non-Destructive Testing
Ando, Yusaku, Nakajima, Miya, Saitoh, Takahiro, Kato, Tsuyoshi
In recent years, the deterioration of artificial materials used in structures has become a serious social issue, increasing the importance of inspections. Non-destructive testing is gaining increased demand due to its capability to inspect for defects and deterioration in structures while preserving their functionality. Among these, Laser Ultrasonic Visualization Testing (LUVT) stands out because it allows the visualization of ultrasonic propagation. This makes it visually straightforward to detect defects, thereby enhancing inspection efficiency. With the increasing number of the deterioration structures, challenges such as a shortage of inspectors and increased workload in non-destructive testing have become more apparent. Efforts to address these challenges include exploring automated inspection using machine learning. However, the lack of anomalous data with defects poses a barrier to improving the accuracy of automated inspection through machine learning. Therefore, in this study, we propose a method for automated LUVT inspection using an anomaly detection approach with a diffusion model that can be trained solely on negative examples (defect-free data). We experimentally confirmed that our proposed method improves defect detection and localization compared to general object detection algorithms used previously.
COLT: Towards Completeness-Oriented Tool Retrieval for Large Language Models
Qu, Changle, Dai, Sunhao, Wei, Xiaochi, Cai, Hengyi, Wang, Shuaiqiang, Yin, Dawei, Xu, Jun, Wen, Ji-Rong
Recently, the integration of external tools with Large Language Models (LLMs) has emerged as a promising approach to overcome the inherent constraints of their pre-training data. However, realworld applications often involve a diverse range of tools, making it infeasible to incorporate all tools directly into LLMs due to constraints on input length and response time. Therefore, to fully exploit the potential of tool-augmented LLMs, it is crucial to develop an effective tool retrieval system. Existing tool retrieval methods techniques mainly rely on semantic matching between user queries and tool descriptions, which often results in the selection of redundant tools. As a result, these methods fail to provide a complete set of diverse tools necessary for addressing the multifaceted problems encountered by LLMs. In this paper, we propose a novel modelagnostic COllaborative Learning-based Tool Retrieval approach, COLT, which captures not only the semantic similarities between user queries and tool descriptions but also takes into account the collaborative information of tools. Specifically, we first fine-tune the PLM-based retrieval models to capture the semantic relationships between queries and tools in the semantic learning stage. Subsequently, we construct three bipartite graphs among queries, scenes, and tools and introduce a dual-view graph collaborative learning framework to capture the intricate collaborative relationships among tools during the collaborative learning stage. Extensive experiments on both the open benchmark and the newly introduced ToolLens dataset show that COLT achieves superior performance. Notably, the performance of BERT-mini (11M) with our proposed model framework outperforms BERT-large (340M), which has 30 times more parameters. Additionally, we plan to publicly release the ToolLens dataset to support further research in tool retrieval.
Comparing remote sensing-based forest biomass mapping approaches using new forest inventory plots in contrasting forests in northeastern and southwestern China
Dong, Wenquan, Mitchard, Edward T. A., Chen, Yuwei, Chen, Man, Cao, Congfeng, Hu, Peilun, Xu, Cong, Hancock, Steven
Large-scale high spatial resolution aboveground biomass (AGB) maps play a crucial role in determining forest carbon stocks and how they are changing, which is instrumental in understanding the global carbon cycle, and implementing policy to mitigate climate change. The advent of the new space-borne LiDAR sensor, NASA's GEDI instrument, provides unparalleled possibilities for the accurate and unbiased estimation of forest AGB at high resolution, particularly in dense and tall forests, where Synthetic Aperture Radar (SAR) and passive optical data exhibit saturation. However, GEDI is a sampling instrument, collecting dispersed footprints, and its data must be combined with that from other continuous cover satellites to create high-resolution maps, using local machine learning methods. In this study, we developed local models to estimate forest AGB from GEDI L2A data, as the models used to create GEDI L4 AGB data incorporated minimal field data from China. We then applied LightGBM and random forest regression to generate wall-to-wall AGB maps at 25 m resolution, using extensive GEDI footprints as well as Sentinel-1 data, ALOS-2 PALSAR-2 and Sentinel-2 optical data. Through a 5-fold cross-validation, LightGBM demonstrated a slightly better performance than Random Forest across two contrasting regions. However, in both regions, the computation speed of LightGBM is substantially faster than that of the random forest model, requiring roughly one-third of the time to compute on the same hardware. Through the validation against field data, the 25 m resolution AGB maps generated using the local models developed in this study exhibited higher accuracy compared to the GEDI L4B AGB data. We found in both regions an increase in error as slope increased. The trained models were tested on nearby but different regions and exhibited good performance.
V-Zen: Efficient GUI Understanding and Precise Grounding With A Novel Multimodal LLM
Rahman, Abdur, Chawla, Rajat, Kumar, Muskaan, Datta, Arkajit, Jha, Adarsh, NS, Mukunda, Bhola, Ishaan
In the rapidly evolving landscape of AI research and application, Multimodal Large Language Models (MLLMs) have emerged as a transformative force, adept at interpreting and integrating information from diverse modalities such as text, images, and Graphical User Interfaces (GUIs). Despite these advancements, the nuanced interaction and understanding of GUIs pose a significant challenge, limiting the potential of existing models to enhance automation levels. To bridge this gap, this paper presents V-Zen, an innovative Multimodal Large Language Model (MLLM) meticulously crafted to revolutionise the domain of GUI understanding and grounding. Equipped with dual-resolution image encoders, V-Zen establishes new benchmarks in efficient grounding and next-action prediction, thereby laying the groundwork for self-operating computer systems. Complementing V-Zen is the GUIDE dataset, an extensive collection of real-world GUI elements and task-based sequences, serving as a catalyst for specialised fine-tuning. The successful integration of V-Zen and GUIDE marks the dawn of a new era in multimodal AI research, opening the door to intelligent, autonomous computing experiences. This paper extends an invitation to the research community to join this exciting journey, shaping the future of GUI automation. In the spirit of open science, our code, data, and model will be made publicly available, paving the way for multimodal dialogue scenarios with intricate and precise interactions.
Discovering deposition process regimes: leveraging unsupervised learning for process insights, surrogate modeling, and sensitivity analysis
Suntaxi, Geremy Loachamín, Papavasileiou, Paris, Koronaki, Eleni D., Giovanis, Dimitrios G., Gakis, Georgios, Aviziotis, Ioannis G., Kathrein, Martin, Pozzetti, Gabriele, Czettl, Christoph, Bordas, Stéphane P. A., Boudouvis, Andreas G.
This work introduces a comprehensive approach utilizing data-driven methods to elucidate the deposition process regimes in Chemical Vapor Deposition (CVD) reactors and the interplay of physical mechanism that dominate in each one of them. Through this work, we address three key objectives. Firstly, our methodology relies on process outcomes, derived by a detailed CFD model, to identify clusters of "outcomes" corresponding to distinct process regimes, wherein the relative influence of input variables undergoes notable shifts. This phenomenon is experimentally validated through Arrhenius plot analysis, affirming the efficacy of our approach. Secondly, we demonstrate the development of an efficient surrogate model, based on Polynomial Chaos Expansion (PCE), that maintains accuracy, facilitating streamlined computational analyses. Finally, as a result of PCE, sensitivity analysis is made possible by means of Sobol' indices, that quantify the impact of process inputs across identified regimes. The insights gained from our analysis contribute to the formulation of hypotheses regarding phenomena occurring beyond the transition regime. Notably, the significance of temperature even in the diffusion-limited regime, as evidenced by the Arrhenius plot, suggests activation of gas phase reactions at elevated temperatures. Importantly, our proposed methods yield insights that align with experimental observations and theoretical principles, aiding decision-making in process design and optimization. By circumventing the need for costly and time-consuming experiments, our approach offers a pragmatic pathway towards enhanced process efficiency. Moreover, this study underscores the potential of data-driven computational methods for innovating reactor design paradigms.
Design and fabrication of autonomous electronic lablets for chemical control
McCaskill, John S., Maeke, Thomas, Funke, Dominic, Mayr, Pierre, Sharma, Abhishek, Wagler, Patrick F., Oehm, Jürgen
The programmable investigation and control of chemical systems at the microscale has been an increasingly successful area in microsystem technology for over 25 years including our own work in lab-on-a-chip and microfluidics to approach electronic chemical cells [1-2]. These systems require and are limited by their physical connection (wires, tubes, pipetting) to the macroscopic control system, both for electrical and chemical interfacing. Wireless electronic systems, communicating using radio waves, although already advocated for smart dust [3-4] and implemented down to mm scales, are not yet effective at 100 µm scales and below, especially in aqueous solution where communication is damped, and also do not provide a solution for powering smart microscopic electronic particles in solution. Our approach is a novel and more chemically inspired one [5] - to take advantage of the mobility of microscopic particles which allows their docking to one another pairwise or to a smart microstructured surface (called the dock). It involves fully programmable CMOS electronic particles in contrast to other more restricted approaches such as plasmonic smart dust [6]. Electronic integration using CMOS has been optimized for high speed (GHz range) operation and high integration levels with feature sizes down to 30nm and below. However, for microscopic electronics, extremely low power operation is required (total average power, typically 1 nW for 1000s) by current microscopic charge storage limitations ( 2 µF using supercap technology), which is not consistent either with high frequency operation or the leakage currents associated with the finest scale transistors. Instead, low power operation has been achieved using 180nm technology and an especially designed slow clock [7] and custom transistor design. Electronic actuation of chemical reactions mostly requires switching of voltages on microelectrodes in aqueous solution, which typically have significant capacitances, as exploited in electrolyte capacitors.
Multi-purpose robot for rehabilitation of small diameter water pipes
Feiguel, Julien, NDiaye, Mouhamed, Chambaud, Pascal, Chambellan, Adrien, Blanc, Pierre, Bourgeois, Steve, Labarussiat, Lucas, Dubois, Clemence, Vigneron, Audrey, Desrez, Thomas, Riwan, Alain, Vienne, Caroline
Rehabilitating cast iron pipes through lining offers several advantages, including increased durability, reduced water leaks, and minimal disruption.This approach presents a cost effective and environmentally friendly solution by sealing cracks and joints, extending the pipeline's lifespan, and reducing water wastage, all while avoiding the need for trench excavation. However, due to the relining process, branch connections are sealed and need to be reestablished. To address the issue of rehabilitating small-diameter water pipes, we have designed a modular robot capable of traversing and working within 200 meter long, 100 mm diameter cast iron pipes. This robot is equipped with perception functions to detect, locate, and characterize the branch connections in cast iron pipes and relocate them after lining, as well as machining functions. A first prototype of this system has been developed and validated on an 8 meter long section, in a laboratory environment.
Higher-Rank Irreducible Cartesian Tensors for Equivariant Message Passing
Zaverkin, Viktor, Alesiani, Francesco, Maruyama, Takashi, Errica, Federico, Christiansen, Henrik, Takamoto, Makoto, Weber, Nicolas, Niepert, Mathias
The ability to perform fast and accurate atomistic simulations is crucial for advancing the chemical sciences. By learning from high-quality data, machine-learned interatomic potentials achieve accuracy on par with ab initio and first-principles methods at a fraction of their computational cost. The success of machine-learned interatomic potentials arises from integrating inductive biases such as equivariance to group actions on an atomic system, e.g., equivariance to rotations and reflections. In particular, the field has notably advanced with the emergence of equivariant message-passing architectures. Most of these models represent an atomic system using spherical tensors, tensor products of which require complicated numerical coefficients and can be computationally demanding. This work introduces higher-rank irreducible Cartesian tensors as an alternative to spherical tensors, addressing the above limitations. We integrate irreducible Cartesian tensor products into message-passing neural networks and prove the equivariance of the resulting layers. Through empirical evaluations on various benchmark data sets, we consistently observe on-par or better performance than that of state-of-the-art spherical models.