Materials
Energy Rank Alignment: Using Preference Optimization to Search Chemical Space at Scale
Chennakesavalu, Shriram, Hu, Frank, Ibarraran, Sebastian, Rotskoff, Grant M.
Searching through chemical space is an exceptionally challenging problem because the number of possible molecules grows combinatorially with the number of atoms. Large, autoregressive models trained on databases of chemical compounds have yielded powerful generators, but we still lack robust strategies for generating molecules with desired properties. This molecular search problem closely resembles the "alignment" problem for large language models, though for many chemical tasks we have a specific and easily evaluable reward function. Here, we introduce an algorithm called energy rank alignment (ERA) that leverages an explicit reward function to produce a gradient-based objective that we use to optimize autoregressive policies. We show theoretically that this algorithm is closely related to proximal policy optimization (PPO) and direct preference optimization (DPO), but has a minimizer that converges to an ideal Gibbs-Boltzmann distribution with the reward playing the role of an energy function. Furthermore, this algorithm is highly scalable, does not require reinforcement learning, and performs well relative to DPO when the number of preference observations per pairing is small. We deploy this approach to align molecular transformers to generate molecules with externally specified properties and find that it does so robustly, searching through diverse parts of chemical space. While our focus here is on chemical search, we also obtain excellent results on an AI supervised task for LLM alignment, showing that the method is scalable and general.
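A minimal sketch of the Gibbs-Boltzmann distribution that the abstract names as the target of ERA, written for a small discrete set of candidates. The function name, the inverse temperature `beta`, and the sign convention (lower energy means higher probability) are assumptions made for illustration; the abstract does not give the exact form used in the paper, which may include further terms such as regularization toward a reference policy.

```python
import math

def gibbs_boltzmann(energies, beta=1.0):
    """Return probabilities p_i proportional to exp(-beta * E_i).

    `energies` and `beta` are illustrative names, not taken from the paper.
    """
    # Shift by the minimum energy so the exponentials cannot overflow;
    # the shift cancels in the normalization.
    e_min = min(energies)
    weights = [math.exp(-beta * (e - e_min)) for e in energies]
    z = sum(weights)  # partition function (up to the constant shift)
    return [w / z for w in weights]

# Hypothetical energies for three candidate molecules.
energies = [0.0, 1.0, 2.0]
probs = gibbs_boltzmann(energies, beta=1.0)
uniform = gibbs_boltzmann(energies, beta=0.0)
```

With `beta = 0` every candidate is equally likely; larger `beta` concentrates probability on the lowest-energy candidates.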
GPT-4 Jailbreaks Itself with Near-Perfect Success Using Self-Explanation
Ramesh, Govind, Dou, Yao, Xu, Wei
Research on jailbreaking has been valuable for testing and understanding the safety and security issues of large language models (LLMs). In this paper, we introduce Iterative Refinement Induced Self-Jailbreak (IRIS), a novel approach that leverages the reflective capabilities of LLMs for jailbreaking with only black-box access. Unlike previous methods, IRIS simplifies the jailbreaking process by using a single model as both the attacker and target. This method first iteratively refines adversarial prompts through self-explanation, which is crucial for ensuring that even well-aligned LLMs obey adversarial instructions. IRIS then rates and enhances the output given the refined prompt to increase its harmfulness. We find IRIS achieves jailbreak success rates of 98% on GPT-4 and 92% on GPT-4 Turbo in under 7 queries. It significantly outperforms prior approaches in automatic, black-box and interpretable jailbreaking, while requiring substantially fewer queries, thereby establishing a new standard for interpretable jailbreaking methods.
Inverse Design of Metal-Organic Frameworks Using Quantum Natural Language Processing
In this study, we explore the potential of using quantum natural language processing (QNLP) to inverse design metal-organic frameworks (MOFs) with targeted properties. Specifically, by analyzing 150 hypothetical MOF structures consisting of 10 metal nodes and 15 organic ligands, we categorize these structures into four distinct classes for pore volume and $H_{2}$ uptake values. We then compare various QNLP models (i.e. the bag-of-words, DisCoCat (Distributional Compositional Categorical), and sequence-based models) to identify the most effective approach to process the MOF dataset. Using a classical simulator provided by IBM Qiskit, the bag-of-words model is identified as the best-performing model, achieving validation accuracies of 85.7% and 86.7% for binary classification tasks on pore volume and $H_{2}$ uptake, respectively. Further, we developed multi-class classification models tailored to the probabilistic nature of quantum circuits, with average test accuracies of 88.4% and 80.7% across different classes for the pore volume and $H_{2}$ uptake datasets, respectively. Finally, generating MOFs with target properties achieved accuracies of 93.5% for pore volume and 89% for $H_{2}$ uptake. Although our investigation covers only a fraction of the vast MOF search space, it marks a promising first step towards using quantum computing for materials design, offering a new perspective through which to explore the complex landscape of MOFs.
Revolutionizing Process Mining: A Novel Architecture for ChatGPT Integration and Enhanced User Experience through Optimized Prompt Engineering
Kermani, Mehrdad Agha Mohammad Ali, Seddighi, Hamid Reza, Maghsoudi, Mehrdad
In the rapidly evolving field of business process management, there is a growing need for analytical tools that can transform complex data into actionable insights. This research introduces a novel approach by integrating Large Language Models (LLMs), such as ChatGPT, into process mining tools, making process analytics more accessible to a wider audience. The study aims to investigate how ChatGPT enhances analytical capabilities, improves user experience, increases accessibility, and optimizes the architectural frameworks of process mining tools. The key innovation of this research lies in developing a tailored prompt engineering strategy for each process mining submodule, ensuring that the AI-generated outputs are accurate and relevant to the context. The integration architecture follows an Extract, Transform, Load (ETL) process, which includes various process mining engine modules and utilizes zero-shot and optimized prompt engineering techniques. ChatGPT is connected via APIs and receives structured outputs from the process mining modules, enabling conversational interactions. To validate the effectiveness of this approach, the researchers used data from 17 companies that employ BehfaLab's Process Mining Tool. The results showed significant improvements in user experience, with an expert panel rating 72% of the results as "Good". This research contributes to the advancement of business process analysis methodologies by combining process mining with artificial intelligence. Future research directions include further optimization of prompt engineering, exploration of integration with other AI technologies, and assessment of scalability across various business environments. This study paves the way for continuous innovation at the intersection of process mining and artificial intelligence, promising to revolutionize the way businesses analyze and optimize their processes.
Navigating Public Sentiment in the Circular Economy through Topic Modelling and Hyperparameter Optimisation
Song, Junhao, Yuan, Yingfang, Chang, Kaiwen, Xu, Bing, Xuan, Jin, Pang, Wei
To advance the circular economy (CE), it is crucial to gain insights into the evolution of public sentiment and the cognitive pathways of the masses concerning circular products and digital technology, and to recognise their primary concerns. To achieve this, we collected data related to the CE from diverse platforms including Twitter, Reddit, and The Guardian. This comprehensive data collection spanned three distinct strata of the public: the general public, professionals, and official sources. Subsequently, we applied three topic models to the collected data. Topic modelling is a data-driven machine learning approach to text mining, capable of automatically categorising a large number of documents into distinct semantic groups. These groups are described by topics, which aid in understanding the semantic content of documents at a high level. However, the performance of topic modelling may vary depending on hyperparameter values. Therefore, in this study, we proposed a framework for topic modelling with hyperparameter optimisation for CE and conducted a series of systematic experiments to ensure that topic models are set with appropriate hyperparameters and to gain insights into the correlations between the CE and public opinion based on well-established models. The results of this study indicate that concerns about sustainability and economic impact persist across all three datasets. Official sources demonstrate a higher level of engagement with the application and regulation of CE. To the best of our knowledge, this study is the first to investigate various levels of public opinion concerning CE through topic modelling with hyperparameter optimisation.
Implementing a GRU Neural Network for Flood Prediction in Ashland City, Tennessee
Fordjour, George K., Kalyanapu, Alfred J.
Ashland City, Tennessee, located within the Lower Cumberland Sycamore watershed, is highly susceptible to flooding due to increased upstream water levels. This study aimed to develop a robust flood prediction model for the city, utilizing water level data at 30-minute intervals from ten USGS gauge stations within the watershed. A Gated Recurrent Unit (GRU) network, known for its ability to effectively process sequential time-series data, was used. The model was trained, validated, and tested using a year-long dataset (January 2021-January 2022), and its performance was evaluated using statistical metrics including Nash-Sutcliffe Efficiency (NSE), Root Mean Squared Error (RMSE), Percent Bias (PBIAS), Mean Absolute Error (MAE), and Coefficient of Determination (R^2). The results demonstrated a high level of accuracy, with the model explaining 98.2% of the variance in the data. Despite minor discrepancies between predicted and observed values, the GRU model proved to be an effective tool for flood prediction in Ashland City, with potential applications for enhancing disaster preparedness and response efforts.
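The evaluation metrics listed above have standard textbook definitions, sketched below in plain Python. The function names and the sample water levels are made up for illustration and are not the study's data. Two conventions are assumed: PBIAS is computed as 100 * sum(obs - sim) / sum(obs), so positive values indicate underestimation, and R^2 is taken as the squared Pearson correlation. The abstract does not state which conventions the authors used.

```python
import math

def nse(obs, sim):
    """Nash-Sutcliffe Efficiency: 1 is a perfect fit, 0 matches the mean of obs."""
    mean_obs = sum(obs) / len(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - num / den

def rmse(obs, sim):
    """Root Mean Squared Error, in the units of the data."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))

def mae(obs, sim):
    """Mean Absolute Error, in the units of the data."""
    return sum(abs(o - s) for o, s in zip(obs, sim)) / len(obs)

def pbias(obs, sim):
    """Percent Bias (assumed convention: positive means underestimation)."""
    return 100.0 * sum(o - s for o, s in zip(obs, sim)) / sum(obs)

def r_squared(obs, sim):
    """Coefficient of determination as the squared Pearson correlation."""
    mo, ms = sum(obs) / len(obs), sum(sim) / len(sim)
    cov = sum((o - mo) * (s - ms) for o, s in zip(obs, sim))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_s = sum((s - ms) ** 2 for s in sim)
    return cov * cov / (var_o * var_s)

# Hypothetical observed and simulated water levels.
observed = [1.0, 2.0, 3.0, 4.0]
simulated = [1.1, 1.9, 3.2, 3.8]
scores = {
    "NSE": nse(observed, simulated),
    "RMSE": rmse(observed, simulated),
    "MAE": mae(observed, simulated),
    "PBIAS": pbias(observed, simulated),
    "R2": r_squared(observed, simulated),
}
```

NSE and R^2 are dimensionless, RMSE and MAE carry the units of the water level, and PBIAS is a percentage, so the five numbers are read on different scales.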
A Gaussian Process Model for Ordinal Data with Applications to Chemoinformatics
Gosnell, Arron, Evangelou, Evangelos
With the proliferation of screening tools for chemical testing, it is now possible to create vast databases of chemicals easily. However, rigorous statistical methodologies employed to analyse these databases are in their infancy, and further development to facilitate chemical discovery is imperative. In this paper, we present conditional Gaussian process models to predict ordinal outcomes from chemical experiments, where the inputs are chemical compounds. We implement the Tanimoto distance, a metric on the chemical space, within the covariance of the Gaussian processes to capture correlated effects in the chemical space. A novel aspect of our model is that the kernel contains a scaling parameter, a feature not previously examined in the literature, that controls the strength of the correlation between elements of the chemical space. Using molecular fingerprints, a numerical representation of a compound's location within the chemical space, we show that accounting for correlation amongst chemical compounds improves predictive performance over the uncorrelated model, where effects are assumed to be independent. Moreover, we present a genetic algorithm for the facilitation of chemical discovery and identification of important features to the compound's efficacy. A simulation study is conducted to demonstrate the suitability of the proposed methods. Our proposed methods are demonstrated on a hazard classification problem of organic solvents.
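A minimal sketch of the Tanimoto distance on binary molecular fingerprints, where each fingerprint is represented as the set of indices of its "on" bits. The function names and the example fingerprints are hypothetical. How the authors embed this distance and their scaling parameter in the Gaussian process covariance is not specified in the abstract, so no kernel is shown here.

```python
def tanimoto_similarity(fp_a, fp_b):
    """Tanimoto (Jaccard) similarity: |A & B| / |A | B| for sets of on-bits."""
    a, b = set(fp_a), set(fp_b)
    union = len(a | b)
    if union == 0:
        # Assumed convention: two empty fingerprints count as identical.
        return 1.0
    return len(a & b) / union

def tanimoto_distance(fp_a, fp_b):
    """Tanimoto distance, 1 - similarity; 0 for identical fingerprints."""
    return 1.0 - tanimoto_similarity(fp_a, fp_b)

# Hypothetical fingerprints given as indices of bits that are set.
fp1 = {1, 2, 3, 4}
fp2 = {3, 4, 5, 6}
fp3 = {5, 6, 7, 8}

d12 = tanimoto_distance(fp1, fp2)  # 2 shared bits out of 6 distinct bits
d23 = tanimoto_distance(fp2, fp3)
d13 = tanimoto_distance(fp1, fp3)  # no shared bits
```

Because this distance satisfies the triangle inequality on binary fingerprints, it behaves as a proper metric on the chemical space, as the abstract states.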
UniCorn: A Unified Contrastive Learning Approach for Multi-view Molecular Representation Learning
Feng, Shikun, Ni, Yuyan, Li, Minghao, Huang, Yanwen, Ma, Zhi-Ming, Ma, Wei-Ying, Lan, Yanyan
Recently, a noticeable trend has emerged in developing pre-trained foundation models in the domains of computer vision (CV) and natural language processing (NLP). However, molecular pre-training still lacks a universal model that applies effectively across categories of molecular tasks, since prevalent pre-training methods are each effective only for specific types of downstream tasks. Furthermore, the lack of a deep understanding of existing pre-training methods, including 2D graph masking, 2D-3D contrastive learning, and 3D denoising, hampers the advancement of molecular foundation models. In this work, we provide a unified understanding of existing pre-training methods through the lens of contrastive learning. Under this view, the methods differ in which views of molecules they cluster, and each choice is shown to benefit specific downstream tasks. To achieve a complete and general-purpose molecular representation, we propose a novel pre-training framework, named UniCorn, that inherits the merits of the three methods, depicting molecular views at three different levels. State-of-the-art performance across quantum, physicochemical, and biological tasks, along with a comprehensive ablation study, validates the universality and effectiveness of UniCorn.
Deep Learning in Earthquake Engineering: A Comprehensive Review
This article surveys the growing interest in utilizing Deep Learning (DL) as a powerful tool to address challenging problems in earthquake engineering. Despite decades of advancement in domain knowledge, issues such as uncertainty in earthquake occurrence, unpredictable seismic loads, nonlinear structural responses, and community engagement remain difficult to tackle using domain-specific methods. DL offers promising solutions by leveraging its data-driven capacity for nonlinear mapping, sequential data modeling, automatic feature extraction, dimensionality reduction, optimal decision-making, etc. However, the literature lacks a comprehensive review that systematically covers a consistent scope intersecting DL and earthquake engineering. To bridge the gap, the article first discusses methodological advances to elucidate various applicable DL techniques, such as multi-layer perceptron (MLP), convolutional neural network (CNN), recurrent neural network (RNN), generative adversarial network (GAN), autoencoder (AE), transfer learning (TL), reinforcement learning (RL), and graph neural network (GNN). A thorough research landscape is then disclosed by exploring various DL applications across different research topics, including vision-based seismic damage assessment and structural characterization, seismic demand and damage state prediction, seismic response history prediction, regional seismic risk assessment and community resilience, ground motion (GM) for engineering use, seismic response control, and the inverse problem of system/damage identification. Suitable DL techniques for each research topic are identified, emphasizing the preeminence of CNN for vision-based tasks, RNN for sequential data, RL for community resilience, and unsupervised learning for GM analysis. 
The article also discusses opportunities and challenges for leveraging DL in earthquake engineering research and practice, highlighting the need for open-access multimodal big data and efforts to enhance model interpretability and incorporate physics information into DL. Finally, the paper advocates for DL applications to further advance the research frontier of uncertainty quantification in performance-based earthquake engineering.
Response Matching for generating materials and molecules
Machine learning has recently emerged as a powerful tool for generating new molecular and material structures. The success of state-of-the-art models stems from their ability to incorporate physical symmetries, such as translation, rotation, and periodicity. Here, we present a novel generative method called Response Matching (RM), which leverages the fact that each stable material or molecule exists at the minimum of its potential energy surface. Consequently, any perturbation induces a response in energy and stress, driving the structure back to equilibrium. Matching to such response is closely related to score matching in diffusion models. By employing the combination of a machine learning interatomic potential and random structure search as the denoising model, RM exploits the locality of atomic interactions, and inherently respects permutation, translation, rotation, and periodic invariances. RM is the first model to handle both molecules and bulk materials under the same framework. We demonstrate the efficiency and generalization of RM across three systems: a small organic molecular dataset, stable crystals from the Materials Project, and one-shot learning on a single diamond configuration.
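The stated link between restoring responses and score matching can be illustrated with a toy case; this is not the RM method itself. For a harmonic energy well E(x) = (k/2)|x - x0|^2, the restoring force -k(x - x0) has the same form as the score of a Gaussian centred at x0 with variance sigma^2, which is -(x - x0)/sigma^2, and the two coincide when k = 1/sigma^2. All names and numbers below are assumptions chosen for the illustration.

```python
def harmonic_force(x, x0, k):
    """Restoring force -dE/dx for E(x) = 0.5 * k * |x - x0|^2."""
    return [-k * (xi - x0i) for xi, x0i in zip(x, x0)]

def gaussian_score(x, x0, sigma):
    """Score (gradient of the log-density) of an isotropic Gaussian N(x0, sigma^2 I)."""
    return [-(xi - x0i) / sigma ** 2 for xi, x0i in zip(x, x0)]

# Hypothetical equilibrium position and a perturbed copy of it.
equilibrium = [0.0, 1.0, 2.0]
perturbed = [0.3, 0.8, 2.5]
sigma = 0.5
k = 1.0 / sigma ** 2  # stiffness chosen so that force and score coincide

force = harmonic_force(perturbed, equilibrium, k)
score = gaussian_score(perturbed, equilibrium, sigma)

# The force opposes the displacement, so it drives the structure back to equilibrium.
displacement = [p - e for p, e in zip(perturbed, equilibrium)]
restoring = sum(f * d for f, d in zip(force, displacement)) < 0
```

Real potential energy surfaces are only approximately harmonic near a minimum, so the correspondence holds exactly only in this idealized setting.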