Materials
Approximated Orthogonal Projection Unit: Stabilizing Regression Network Training Using Natural Gradient
Wang, Shaoqi, Yang, Chunjie, Lou, Siwei
Neural networks (NN) are extensively studied in cutting-edge soft sensor models due to their feature extraction and function approximation capabilities. Current research into network-based methods primarily focuses on models' offline accuracy. Notably, in industrial soft sensor context, online optimizing stability and interpretability are prioritized, followed by accuracy. This requires a clearer understanding of network's training process. To bridge this gap, we propose a novel NN named the Approximated Orthogonal Projection Unit (AOPU) which has solid mathematical basis and presents superior training stability. AOPU truncates the gradient backpropagation at dual parameters, optimizes the trackable parameters updates, and enhances the robustness of training. We further prove that AOPU attains minimum variance estimation (MVE) in NN, wherein the truncated gradient approximates the natural gradient (NG). Empirical results on two chemical process datasets clearly show that AOPU outperforms other models in achieving stable convergence, marking a significant advancement in soft sensor field.
Enhancing Multivariate Time Series-based Solar Flare Prediction with Multifaceted Preprocessing and Contrastive Learning
EskandariNasab, MohammadReza, Hamdi, Shah Muhammad, Boubrahimi, Soukaina Filali
Accurate solar flare prediction is crucial due to the significant risks that intense solar flares pose to astronauts, space equipment, and satellite communication systems. Our research enhances solar flare prediction by utilizing advanced data preprocessing and classification methods on a multivariate time series-based dataset of photospheric magnetic field parameters. First, our study employs a novel preprocessing pipeline that includes missing value imputation, normalization, balanced sampling, near decision boundary sample removal, and feature selection to significantly boost prediction accuracy. Second, we integrate contrastive learning with a GRU regression model to develop a novel classifier, termed ContReg, which employs dual learning methodologies, thereby further enhancing prediction performance. To validate the effectiveness of our preprocessing pipeline, we compare and demonstrate the performance gain of each step, and to demonstrate the efficacy of the ContReg classifier, we compare its performance to that of sequence-based deep learning architectures, machine learning models, and findings from previous studies. Our results illustrate exceptional True Skill Statistic (TSS) scores, surpassing previous methods and highlighting the critical role of precise data preprocessing and classifier development in time series-based solar flare prediction.
Learning Ordering in Crystalline Materials with Symmetry-Aware Graph Neural Networks
Peng, Jiayu, Damewood, James, Karaguesian, Jessica, Lunger, Jaclyn R., Gรณmez-Bombarelli, Rafael
Graph convolutional neural networks (GCNNs) have become a machine learning workhorse for screening the chemical space of crystalline materials in fields such as catalysis and energy storage, by predicting properties from structures. Multicomponent materials, however, present a unique challenge since they can exhibit chemical (dis)order, where a given lattice structure can encompass a variety of elemental arrangements ranging from highly ordered structures to fully disordered solid solutions. Critically, properties like stability, strength, and catalytic performance depend not only on structures but also on orderings. To enable rigorous materials design, it is thus critical to ensure GCNNs are capable of distinguishing among atomic orderings. However, the ordering-aware capability of GCNNs has been poorly understood. Here, we benchmark various neural network architectures for capturing the ordering-dependent energetics of multicomponent materials in a custom-made dataset generated with high-throughput atomistic simulations. Conventional symmetry-invariant GCNNs were found unable to discern the structural difference between the diverse symmetrically inequivalent atomic orderings of the same material, while symmetry-equivariant model architectures could inherently preserve and differentiate the distinct crystallographic symmetries of various orderings.
ChemEval: A Comprehensive Multi-Level Chemical Evaluation for Large Language Models
Huang, Yuqing, Zhang, Rongyang, He, Xuesong, Zhi, Xuyang, Wang, Hao, Li, Xin, Xu, Feiyang, Liu, Deguang, Liang, Huadong, Li, Yi, Cui, Jian, Liu, Zimu, Wang, Shijin, Hu, Guoping, Liu, Guiquan, Liu, Qi, Lian, Defu, Chen, Enhong
There is a growing interest in the role that LLMs play in chemistry which lead to an increased focus on the development of LLMs benchmarks tailored to chemical domains to assess the performance of LLMs across a spectrum of chemical tasks varying in type and complexity. However, existing benchmarks in this domain fail to adequately meet the specific requirements of chemical research professionals. To this end, we propose \textbf{\textit{ChemEval}}, which provides a comprehensive assessment of the capabilities of LLMs across a wide range of chemical domain tasks. Specifically, ChemEval identified 4 crucial progressive levels in chemistry, assessing 12 dimensions of LLMs across 42 distinct chemical tasks which are informed by open-source data and the data meticulously crafted by chemical experts, ensuring that the tasks have practical value and can effectively evaluate the capabilities of LLMs. In the experiment, we evaluate 12 mainstream LLMs on ChemEval under zero-shot and few-shot learning contexts, which included carefully selected demonstration examples and carefully designed prompts. The results show that while general LLMs like GPT-4 and Claude-3.5 excel in literature understanding and instruction following, they fall short in tasks demanding advanced chemical knowledge. Conversely, specialized LLMs exhibit enhanced chemical competencies, albeit with reduced literary comprehension. This suggests that LLMs have significant potential for enhancement when tackling sophisticated tasks in the field of chemistry. We believe our work will facilitate the exploration of their potential to drive progress in chemistry. Our benchmark and analysis will be available at {\color{blue} \url{https://github.com/USTC-StarTeam/ChemEval}}.
Morphological Detection and Classification of Microplastics and Nanoplastics Emerged from Consumer Products by Deep Learning
Rezvani, Hadi, Zarrabi, Navid, Mehta, Ishaan, Kolios, Christopher, Jaafar, Hussein Ali, Kao, Cheng-Hao, Saeedi, Sajad, Yousefi, Nariman
For example, a U-Net [31] model can be used for some studies have utilized manually annotated images for deep semantic segmentation, and a Convolutional Neural Network learning applications involving microplastics, their datasets are (CNN) can then classify the segmented pixels, as demonstrated not publicly accessible [22], [23], [25]. Notably, there is only in [22], [24]. It is also possible to perform instance segmentation one other open-source Scanning Electron Microscopy (SEM) directly from the start. For instance, a Mask R-CNN dataset on microplastics, presented in [24], which categorizes model can simultaneously identify regions of interest, classify particles by shape (e.g., fragments, fibers, and beads) and each detected object, and generate a mask for each instance, features a more limited size distribution. These contributions as shown by [23]. Additionally, Faster R-CNN, primarily used not only address the urgent environmental issue of microplastic for object detection, has been applied to microscopic images to contamination but also set a new benchmark for detecting and classify microplastics into two polymer types [25]. Given the analyzing microplastics in aquatic environments, paving the nature of our dataset, where overlapping and crowded MNPs way for future innovations in the field.
RainbowSight: A Family of Generalizable, Curved, Camera-Based Tactile Sensors For Shape Reconstruction
Tippur, Megha H., Adelson, Edward H.
Camera-based tactile sensors can provide high resolution positional and local geometry information for robotic manipulation. Curved and rounded fingers are often advantageous, but it can be difficult to derive illumination systems that work well within curved geometries. To address this issue, we introduce RainbowSight, a family of curved, compact, camera-based tactile sensors which use addressable RGB LEDs illuminated in a novel rainbow spectrum pattern. In addition to being able to scale the illumination scheme to different sensor sizes and shapes to fit on a variety of end effector configurations, the sensors can be easily manufactured and require minimal optical tuning to obtain high resolution depth reconstructions of an object deforming the sensor's soft elastomer surface. Additionally, we show the advantages of our new hardware design and improvements in calibration methods for accurate depth map generation when compared to alternative lighting methods commonly implemented in previous camera-based tactile sensors. With these advancements, we make the integration of tactile sensors more accessible to roboticists by allowing them the flexibility to easily customize, fabricate, and calibrate camera-based tactile sensors to best fit the needs of their robotic systems.
A Deep Learning Approach for Pixel-level Material Classification via Hyperspectral Imaging
Sifnaios, Savvas, Arvanitakis, George, Konstantinidis, Fotios K., Tsimiklis, Georgios, Amditis, Angelos, Frangos, Panayiotis
Recent advancements in computer vision, particularly in detection, segmentation, and classification, have significantly impacted various domains. However, these advancements are tied to RGB-based systems, which are insufficient for applications in industries like waste sorting, pharmaceuticals, and defense, where advanced object characterization beyond shape or color is necessary. Hyperspectral (HS) imaging, capturing both spectral and spatial information, addresses these limitations and offers advantages over conventional technologies such as X-ray fluorescence and Raman spectroscopy, particularly in terms of speed, cost, and safety. This study evaluates the potential of combining HS imaging with deep learning for material characterization. The research involves: i) designing an experimental setup with HS camera, conveyor, and controlled lighting; ii) generating a multi-object dataset of various plastics (HDPE, PET, PP, PS) with semi-automated mask generation and Raman spectroscopy-based labeling; and iii) developing a deep learning model trained on HS images for pixel-level material classification. The model achieved 99.94\% classification accuracy, demonstrating robustness in color, size, and shape invariance, and effectively handling material overlap. Limitations, such as challenges with black objects, are also discussed. Extending computer vision beyond RGB to HS imaging proves feasible, overcoming major limitations of traditional methods and showing strong potential for future applications.
Hydrogen under Pressure as a Benchmark for Machine-Learning Interatomic Potentials
Bischoff, Thomas, Jรคckl, Bastian, Rupp, Matthias
Machine-learning interatomic potentials (MLPs) are fast, data-driven surrogate models of atomistic systems' potential energy surfaces that can accelerate ab-initio molecular dynamics (MD) simulations by several orders of magnitude. The performance of MLPs is commonly measured as the prediction error in energies and forces on data not used in their training. While low prediction errors on a test set are necessary, they do not guarantee good performance in MD simulations. The latter requires physically motivated performance measures obtained from running accelerated simulations. However, the adoption of such measures has been limited by the effort and domain knowledge required to calculate and interpret them. To overcome this limitation, we present a benchmark that automatically quantifies the performance of MLPs in MD simulations of a liquid-liquid phase transition in hydrogen under pressure, a challenging benchmark system. The benchmark's h-llpt-24 dataset provides reference geometries, energies, forces, and stresses from density functional theory MD simulations at different temperatures and mass densities. The benchmark's Python code automatically runs MLP-accelerated MD simulations and calculates, quantitatively compares and visualizes pressures, stable molecular fractions, diffusion coefficients, and radial distribution functions. Employing this benchmark, we show that several state-of-the-art MLPs fail to reproduce the liquid-liquid phase transition.
LLMs Still Can't Plan; Can LRMs? A Preliminary Evaluation of OpenAI's o1 on PlanBench
Valmeekam, Karthik, Stechly, Kaya, Kambhampati, Subbarao
The ability to plan a course of action that achieves a desired state of affairs has long been considered a core competence of intelligent agents and has been an integral part of AI research since its inception. With the advent of large language models (LLMs), there has been considerable interest in the question of whether or not they possess such planning abilities. PlanBench, an extensible benchmark we developed in 2022, soon after the release of GPT3, has remained an important tool for evaluating the planning abilities of LLMs. Despite the slew of new private and open source LLMs since GPT3, progress on this benchmark has been surprisingly slow. OpenAI claims that their recent o1 (Strawberry) model has been specifically constructed and trained to escape the normal limitations of autoregressive LLMs--making it a new kind of model: a Large Reasoning Model (LRM). Using this development as a catalyst, this paper takes a comprehensive look at how well current LLMs and new LRMs do on PlanBench. As we shall see, while o1's performance is a quantum improvement on the benchmark, outpacing the competition, it is still far from saturating it. This improvement also brings to the fore questions about accuracy, efficiency, and guarantees which must be considered before deploying such systems.
A generalizable framework for unlocking missing reactions in genome-scale metabolic networks using deep learning
Liu, Xiaoyi, Yang, Hongpeng, Ai, Chengwei, Dong, Ruihan, Ding, Yijie, Yuan, Qianqian, Tang, Jijun, Guo, Fei
Incomplete knowledge of metabolic processes hinders the accuracy of GEnome-scale Metabolic models (GEMs), which in turn impedes advancements in systems biology and metabolic engineering. Existing gap-filling methods typically rely on phenotypic data to minimize the disparity between computational predictions and experimental results. However, there is still a lack of an automatic and precise gap-filling method for initial state GEMs before experimental data and annotated genomes become available. In this study, we introduce CLOSEgaps, a deep learning-driven tool that addresses the gap-filling issue by modeling it as a hyperedge prediction problem within GEMs. Specifically, CLOSEgaps maps metabolic networks as hypergraphs and learns their hyper-topology features to identify missing reactions and gaps by leveraging hypothetical reactions. This innovative approach allows for the characterization and curation of both known and hypothetical reactions within metabolic networks. Extensive results demonstrate that CLOSEgaps accurately gap-filling over 96% of artificially introduced gaps for various GEMs. Furthermore, CLOSEgaps enhances phenotypic predictions for 24 GEMs and also finds a notable improvement in producing four crucial metabolites (Lactate, Ethanol, Propionate, and Succinate) in two organisms. As a broadly applicable solution for any GEM, CLOSEgaps represents a promising model to automate the gap-filling process and uncover missing connections between reactions and observed metabolic phenotypes.