Materials
Stoichiometry Representation Learning with Polymorphic Crystal Structures
Lee, Namkyeong, Noh, Heewoong, Na, Gyoung S., Fu, Tianfan, Sun, Jimeng, Park, Chanyoung
Despite the recent success of machine learning (ML) in materials science, its success heavily relies on the structural description of crystals, which is itself computationally demanding and occasionally unattainable. Stoichiometry descriptors, which reveal the ratio between the elements that form a compound without any structural information, offer an alternative. However, learning representations of stoichiometry is not trivial because of polymorphism, i.e., a single stoichiometry can exist in multiple structural forms due to the flexibility of atomic arrangements, which induces uncertainty in the representation. To this end, we propose PolySRL, which learns a probabilistic representation of stoichiometry by utilizing readily available structural information, and whose uncertainty reveals the polymorphic structures of a stoichiometry. Extensive experiments on sixteen datasets demonstrate the superiority of PolySRL, and analysis of the uncertainties sheds light on the applicability of PolySRL in real-world material discovery.
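As a minimal sketch of why polymorphism makes stoichiometry ambiguous, the snippet below reduces an atom-count dictionary to element ratios. The function name `stoichiometry` and the atom counts are illustrative assumptions, not part of PolySRL; the point is only that two different cells with the same element ratio collapse to one descriptor.

```python
from fractions import Fraction


def stoichiometry(composition):
    """Map element -> atom count to element -> exact fraction of all atoms."""
    total = sum(composition.values())
    return {element: Fraction(count, total) for element, count in composition.items()}


# Illustrative atom counts for two TiO2 cells of different size.
# Structural information (positions, lattice) is deliberately absent.
cell_a = {"Ti": 2, "O": 4}
cell_b = {"Ti": 4, "O": 8}

# Both cells reduce to the same ratios, so a descriptor built from ratios
# alone cannot tell the underlying structures apart.
same_descriptor = stoichiometry(cell_a) == stoichiometry(cell_b)
```

Because the descriptor is identical for both inputs, any deterministic embedding of it must assign both structures the same vector, which is the motivation the abstract gives for a probabilistic representation.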
Equivariant Neural Operator Learning with Graphon Convolution
We propose a general architecture that combines the coefficient learning scheme with a residual operator layer for learning mappings between continuous functions in the 3D Euclidean space. Our proposed model is guaranteed to achieve SE(3)-equivariance by design. From the graph spectrum view, our method can be interpreted as convolution on graphons (dense graphs with infinitely many nodes), which we term InfGCN. By leveraging both the continuous graphon structure and the discrete graph structure of the input data, our model can effectively capture the geometric information while preserving equivariance. Through extensive experiments on large-scale electron density datasets, we observed that our model significantly outperformed the current state-of-the-art architectures. Multiple ablation studies were also carried out to demonstrate the effectiveness of the proposed architecture.
Efficient End-to-End Visual Document Understanding with Rationale Distillation
Zhu, Wang, Agarwal, Alekh, Joshi, Mandar, Jia, Robin, Thomason, Jesse, Toutanova, Kristina
Understanding visually situated language requires recognizing text and visual elements, and interpreting complex layouts. State-of-the-art methods commonly use specialized pre-processing tools, such as optical character recognition (OCR) systems, that map document image inputs to extracted information in the space of textual tokens, and sometimes also employ large language models (LLMs) to reason in text token space. However, the gains from external tools and LLMs come at the cost of increased computational and engineering complexity. In this paper, we ask whether small pretrained image-to-text models can learn selective text or layout recognition and reasoning as an intermediate inference step in an end-to-end model for pixel-level visual language understanding. We incorporate the outputs of such OCR tools, LLMs, and larger multimodal models as intermediate "rationales" on training data, and train a small student model to predict both rationales and answers for input questions based on those training examples. A student model based on Pix2Struct (282M parameters) achieves consistent improvements on three visual document understanding benchmarks representing infographics, scanned documents, and figures, with improvements of more than 4% absolute over a comparable Pix2Struct model that predicts answers directly.
Rethinking Fano's Inequality in Ensemble Learning
Morishita, Terufumi, Morio, Gaku, Horiguchi, Shota, Ozaki, Hiroaki, Nukaga, Nobuo
We propose a fundamental theory on ensemble learning that answers the central question: what factors make an ensemble system good or bad? Previous studies used a variant of Fano's inequality of information theory and derived a lower bound of the classification error rate on the basis of the accuracy and diversity of models. We revisit the original Fano's inequality and argue that the studies did not take into account the information lost when multiple model predictions are combined into a final prediction. To address this issue, we generalize the previous theory to incorporate the information loss, which we name
The central question of ensemble learning has been: what factors make an ensemble system good or bad? It has been widely believed that accurate and diverse models lead to better performance for ensemble systems. Guided by this intuition, many heuristical metrics have been proposed to measure accuracy and diversity (Kohavi et al., 1996; Skalak et al., 1996; Cunningham & Carney, 2000; Shipp & Kuncheva, 2002). However, these metrics lack theoretical grounding, and indeed, Kuncheva & Whitaker (2003) empirically showed that there are no connections between the metrics and system performance through a broad range of experiments. Turning to theoretical viewpoints, Geman et al. (1992) decomposed the squared error loss used in regression tasks into the bias and covariance of models. Bias here corresponds to accuracy and covariance diversity.
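For reference, the textbook form of Fano's inequality that this entry builds on can be written as follows. The symbols are assumptions chosen here, not the paper's notation: $Y$ is a label taking values in a finite set $\mathcal{Y}$, $X$ is whatever is observed, $\hat{Y}$ is any estimate computed from $X$ (so $Y \to X \to \hat{Y}$ forms a Markov chain), $P_e = \Pr(\hat{Y} \neq Y)$, and $H_b$ is the binary entropy function.

```latex
% Fano's inequality (textbook form, entropies in bits)
H(Y \mid X) \;\le\; H(Y \mid \hat{Y}) \;\le\; H_b(P_e) + P_e \log_2\bigl(|\mathcal{Y}| - 1\bigr)

% Weakened form, using H_b(P_e) \le 1 and \log_2(|\mathcal{Y}| - 1) \le \log_2 |\mathcal{Y}|:
P_e \;\ge\; \frac{H(Y \mid X) - 1}{\log_2 |\mathcal{Y}|}
```

The first inequality is the data-processing step: an estimate $\hat{Y}$ computed from $X$ cannot retain more information about $Y$ than $X$ itself does. The gap between $H(Y \mid X)$ and $H(Y \mid \hat{Y})$ is one way to picture information being lost when observations are reduced to a single prediction.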
Accelerating material discovery with a threshold-driven hybrid acquisition policy-based Bayesian optimization
Raihan, Ahmed Shoyeb, Khosravi, Hamed, Das, Srinjoy, Ahmed, Imtiaz
Advancements in materials play a crucial role in technological progress. However, the process of discovering and developing materials with desired properties is often impeded by substantial experimental costs, extensive resource utilization, and lengthy development periods. To address these challenges, modern approaches often employ machine learning (ML) techniques such as Bayesian Optimization (BO), which streamline the search for optimal materials by iteratively selecting experiments that are most likely to yield beneficial results. However, traditional BO methods, while beneficial, often struggle to balance the trade-off between exploration and exploitation, leading to sub-optimal performance in material discovery processes. This paper introduces a novel Threshold-Driven UCB-EI Bayesian Optimization (TDUE-BO) method, which dynamically integrates the strengths of the Upper Confidence Bound (UCB) and Expected Improvement (EI) acquisition functions to optimize the material discovery process. Unlike classical BO, our method focuses on efficiently navigating the high-dimensional material design space (MDS). TDUE-BO begins with an exploration-focused UCB approach, ensuring a comprehensive initial sweep of the MDS. As the model gains confidence, indicated by reduced uncertainty, it transitions to the more exploitative EI method, focusing on promising areas identified earlier. The UCB-to-EI switching policy, guided by continuous monitoring of the model uncertainty at each step of sequential sampling, navigates the MDS more efficiently while ensuring rapid convergence. The effectiveness of TDUE-BO is demonstrated through its application to three different material datasets, showing significantly better approximation and optimization performance than the EI- and UCB-based BO methods in terms of RMSE scores and convergence efficiency, respectively.
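The UCB-then-EI switching idea described above can be sketched as follows. This is not the authors' implementation: the switching statistic (mean posterior standard deviation over the candidates), the `threshold` value, the `kappa` and `xi` parameters, and all function names are assumptions made for illustration, and the surrogate model that would supply `means` and `stds` is left out. Maximization is assumed.

```python
import math


def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def ucb(mean, std, kappa=2.0):
    """Upper Confidence Bound: rewards high predicted value and high uncertainty."""
    return mean + kappa * std


def expected_improvement(mean, std, best, xi=0.0):
    """Expected Improvement over the best observed value (maximization)."""
    gain = mean - best - xi
    if std <= 0.0:
        return max(gain, 0.0)
    z = gain / std
    return gain * normal_cdf(z) + std * normal_pdf(z)


def select_next(means, stds, best, threshold):
    """Pick the next candidate index and report which policy was used.

    Hypothetical rule: use UCB while the average posterior std is above
    `threshold`, then switch to EI once the model is more certain.
    """
    average_std = sum(stds) / len(stds)
    if average_std > threshold:
        policy = "UCB"
        scores = [ucb(m, s) for m, s in zip(means, stds)]
    else:
        policy = "EI"
        scores = [expected_improvement(m, s, best) for m, s in zip(means, stds)]
    index = max(range(len(scores)), key=scores.__getitem__)
    return index, policy
```

With high uncertainty, for example `select_next([0.0, 1.0, 0.5], [2.0, 0.1, 1.0], best=0.9, threshold=0.5)`, the sketch explores the most uncertain candidate via UCB; once the standard deviations shrink, the same call with small `stds` falls through to EI and favors the candidate whose mean exceeds the incumbent.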
Physics-Enhanced Multi-fidelity Learning for Optical Surface Imprint
Human fingerprints serve as a unique and powerful characteristic of each person, from which police can establish identity. Similarly, many natural bodies and their intrinsic mechanical qualities can be uniquely identified from surface characteristics. To measure the elasto-plastic properties of a material, a formally sharp indenter is pushed into the measured body under constant force and then retracted, leaving a unique residual imprint whose size ranges from several micrometers down to nanometers. However, one great challenge is how to map the optical image of this residual imprint to the mechanical properties actually wanted, i.e., the tensile force curve. In this paper, we propose a novel method that uses multi-fidelity neural networks (MFNN) to solve this inverse problem. We first actively train the NN model on pure simulation data, and then bridge the sim-to-real gap via transfer learning. The most innovative part is that we use the NN to uncover the unknown physics and also embed the known physics into the transfer learning framework, thus greatly improving model stability and reducing the data requirement. This work serves as an example of applying machine learning to real experimental research, especially under the constraints of limited data and varying fidelity.
GistScore: Learning Better Representations for In-Context Example Selection with Gist Bottlenecks
Gupta, Shivanshu, Rosenbaum, Clemens, Elenberg, Ethan R.
Large language models (LLMs) have the ability to perform in-context learning (ICL) of new tasks by conditioning on prompts comprising a few task examples. This work studies the problem of selecting the best examples from a candidate pool to improve ICL performance on a given test input. Existing approaches either require training with feedback from a much larger LLM or are computationally expensive. We propose a novel metric, GistScore, based on Example Gisting, a novel approach for training example retrievers for ICL using an attention bottleneck via Gisting, a recent technique for compressing task instructions. To trade off performance against ease of use, we experiment with both fine-tuning gist models on each dataset and multi-task training a single model on a large collection of datasets. On 21 diverse datasets spanning 9 tasks, we show that our fine-tuned models get state-of-the-art ICL performance with 20% absolute average gain over off-the-shelf retrievers and 7% over the best prior methods. Our multi-task model generalizes well out-of-the-box to new task categories, datasets, and prompt templates with retrieval speeds that are consistently thousands of times faster than the best prior training-free method.
Identifying the Key Attributes in an Unlabeled Event Log for Automated Process Discovery
Toyoda, Kentaroh, Ying, Rachel Gan Kai, Zhang, Allan NengSheng, Siew, Tan Puay
Process mining discovers and analyzes a process model from historical event logs. Prior-art methods use the key attributes of case-id, activity, and timestamp hidden in an event log as clues to discover a process model. However, a user needs to specify them manually, and this can be an exhaustive task. In this paper, we propose a two-stage key attribute identification method to avoid such a manual investigation, which is a step toward fully automated process discovery. One of the challenges is how to avoid exhaustive computation due to combinatorial explosion. To address it, we narrow down the candidates for each key attribute by using supervised machine learning in the first stage, and identify the best combination of the key attributes by discovering process models and evaluating them in the second stage. The computational complexity is thereby reduced from $\mathcal{O}(N^3)$ to $\mathcal{O}(k^3)$, where $N$ is the number of columns and $k$ is the number of candidates we keep in the first stage, and usually $k$ is much smaller than $N$. We evaluated our method on 14 open datasets and showed that it could identify the key attributes even with $k = 2$ in about 20 seconds for many datasets.
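The reduction from $\mathcal{O}(N^3)$ to $\mathcal{O}(k^3)$ can be illustrated by counting the attribute assignments the second stage would have to evaluate. The sketch below is an assumption-laden illustration, not the paper's code: the column names are made up, the first-stage classifier is replaced by hand-written candidate lists, and the process-model discovery and scoring step is omitted.

```python
from itertools import permutations, product


def exhaustive_assignments(columns):
    """All ordered choices of distinct columns for (case_id, activity, timestamp)."""
    return list(permutations(columns, 3))


def pruned_assignments(case_id_candidates, activity_candidates, timestamp_candidates):
    """Combine per-attribute shortlists, skipping reuse of the same column."""
    return [
        triple
        for triple in product(case_id_candidates, activity_candidates, timestamp_candidates)
        if len(set(triple)) == 3
    ]


# Hypothetical event log with N = 10 columns.
columns = [f"col{i}" for i in range(10)]
n_exhaustive = len(exhaustive_assignments(columns))  # 10 * 9 * 8 = 720

# Hypothetical first-stage output with k = 2 candidates per key attribute.
n_pruned = len(pruned_assignments(["col0", "col1"], ["col2", "col3"], ["col4", "col5"]))  # 2 ** 3 = 8
```

In this toy setting the second stage evaluates 8 assignments instead of 720, and the gap widens cubically as the number of columns grows while `k` stays fixed.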
Artificial Intelligence for Science in Quantum, Atomistic, and Continuum Systems
Zhang, Xuan, Wang, Limei, Helwig, Jacob, Luo, Youzhi, Fu, Cong, Xie, Yaochen, Liu, Meng, Lin, Yuchao, Xu, Zhao, Yan, Keqiang, Adams, Keir, Weiler, Maurice, Li, Xiner, Fu, Tianfan, Wang, Yucheng, Yu, Haiyang, Xie, YuQing, Fu, Xiang, Strasser, Alex, Xu, Shenglong, Liu, Yi, Du, Yuanqi, Saxton, Alexandra, Ling, Hongyi, Lawrence, Hannah, Stärk, Hannes, Gui, Shurui, Edwards, Carl, Gao, Nicholas, Ladera, Adriana, Wu, Tailin, Hofgard, Elyssa F., Tehrani, Aria Mansouri, Wang, Rui, Daigavane, Ameya, Bohde, Montgomery, Kurtin, Jerry, Huang, Qian, Phung, Tuong, Xu, Minkai, Joshi, Chaitanya K., Mathis, Simon V., Azizzadenesheli, Kamyar, Fang, Ada, Aspuru-Guzik, Alán, Bekkers, Erik, Bronstein, Michael, Zitnik, Marinka, Anandkumar, Anima, Ermon, Stefano, Liò, Pietro, Yu, Rose, Günnemann, Stephan, Leskovec, Jure, Ji, Heng, Sun, Jimeng, Barzilay, Regina, Jaakkola, Tommi, Coley, Connor W., Qian, Xiaoning, Qian, Xiaofeng, Smidt, Tess, Ji, Shuiwang
Advances in artificial intelligence (AI) are fueling a new paradigm of discoveries in natural sciences. Today, AI has started to advance natural sciences by improving, accelerating, and enabling our understanding of natural phenomena at a wide range of spatial and temporal scales, giving rise to a new area of research known as AI for science (AI4Science). Being an emerging research paradigm, AI4Science is unique in that it is an enormous and highly interdisciplinary area. Thus, a unified and technical treatment of this field is needed yet challenging. This work aims to provide a technically thorough account of a subarea of AI4Science; namely, AI for quantum, atomistic, and continuum systems. These areas aim at understanding the physical world from the subatomic (wavefunctions and electron density), atomic (molecules, proteins, materials, and interactions), to macro (fluids, climate, and subsurface) scales and form an important subarea of AI4Science. A unique advantage of focusing on these areas is that they largely share a common set of challenges, thereby allowing a unified and foundational treatment. A key common challenge is how to capture physics first principles, especially symmetries, in natural systems by deep learning methods. We provide an in-depth yet intuitive account of techniques to achieve equivariance to symmetry transformations. We also discuss other common technical challenges, including explainability, out-of-distribution generalization, knowledge transfer with foundation and large language models, and uncertainty quantification. To facilitate learning and education, we provide categorized lists of resources that we found to be useful. We strive to be thorough and unified and hope this initial effort may trigger more community interests and efforts to further advance AI4Science.
Enhancing Emergency Decision-making with Knowledge Graphs and Large Language Models
Chen, Minze, Tao, Zhenxiang, Tang, Weitong, Qin, Tingxin, Yang, Rui, Zhu, Chunli
Emergency management urgently requires comprehensive knowledge that is highly likely to exceed any individual's cognitive scope. Therefore, artificial intelligence (AI)-supported decision-making in such circumstances is of vital importance. Recently emerging large language models (LLMs) provide a new direction for enhancing targeted machine intelligence. However, using LLMs directly would inevitably introduce unreliable output owing to their inherent issues of hallucination and poor reasoning skills. In this work, we develop a system called Enhancing Emergency decision-making with Knowledge Graph and LLM (E-KELL), which provides evidence-based decision-making in various emergency stages. The study constructs a structured emergency knowledge graph and guides LLMs to reason over it via a prompt chain. In real-world evaluations, E-KELL receives scores of 9.06, 9.09, 9.03, and 9.09 in comprehensibility, accuracy, conciseness, and instructiveness from a group of emergency commanders and firefighters, demonstrating a significant improvement across various situations compared to baseline models. This work introduces a novel approach to providing reliable emergency decision support.