Materials
Symmetry Adapted Residual Neural Network Diabatization: Conical Intersections in Aniline Photodissociation
We present a symmetry adapted residual neural network (SAResNet) diabatization method to construct quasi-diabatic Hamiltonians that accurately represent ab initio adiabatic energies, energy gradients, and nonadiabatic couplings for moderate-sized systems. Our symmetry adapted neural network inherits from the pioneering symmetry adapted polynomial and fundamental invariant neural network diabatization methods to exploit the power of neural networks along with the transparent symmetry adaptation of polynomials for both symmetric and asymmetric irreducible representations. In addition, our symmetry adaptation provides a unified framework for symmetry adapted polynomials and symmetry adapted neural networks, enabling the adoption of the residual neural network architecture, which is a powerful descendant of the pioneering feedforward neural network. Our SAResNet is applied to construct the full 36-dimensional coupled diabatic potential energy surfaces for aniline N-H bond photodissociation, with 2,269 data points, 32,640 trainable parameters, and a root mean square deviation of 190 cm⁻¹ in energy. In addition to the experimentally observed ππ* and πRydberg/πσ* states, a higher state (HOMO - 1 π to Rydberg/σ* excitation) is found to introduce an induced geometric phase effect and thus to participate indirectly in the photodissociation process.
Performance Evaluation of Deep Learning Models for Water Quality Index Prediction: A Comparative Study of LSTM, TCN, ANN, and MLP
Ismail, Muhammad, Abbas, Farkhanda, Shah, Shahid Munir, Aljawarneh, Mahmoud, Dhomeja, Lachhman Das, Abbas, Fazila, Shoaib, Muhammad, Alrefaei, Abdulwahed Fahad, Albeshr, Mohammed Fahad
Increased population, urbanization, adoption of modern lifestyles, and congested population structures pose problems of sewage disposal and pollution of surface waters such as lakes. Natural water also becomes polluted through weathering of rocks, seepage from soils, mining processes, etc. [1]. Water quality assessment evaluates the quality of water based on multiple parameters such as temperature, electrical conductivity, nitrate, phosphorus, potassium, dissolved oxygen, etc. The Water Quality Index (WQI) aggregates data from these parameters into a single number that is helpful for water quality assessment [2]. It facilitates a thorough judgment of water conditions in an environment and directs resource management strategies along with an appropriate treatment plan [3-5]. Traditionally, WQI is estimated using different mathematical procedures [6]. Recently, however, Machine Learning (ML) methods have been used for more feasible and cost-effective estimation [7]. Because they handle complex data patterns robustly, these methods have become a viable paradigm for improved predictions.
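To make the aggregation step concrete, the following is a minimal sketch of one common way to combine per-parameter scores into a single index: a weighted arithmetic mean. The abstract does not say which WQI formulation the authors use, so the function name `weighted_wqi`, the 0-100 sub-index scale, and the example weights are all assumptions chosen for illustration.

```python
def weighted_wqi(sub_indices, weights):
    """Weighted arithmetic mean of per-parameter sub-index scores.

    Both arguments are dicts keyed by parameter name. The sub-index scale
    (assumed 0-100 here) and the weights are illustrative, not from the paper.
    """
    if sub_indices.keys() != weights.keys():
        raise ValueError("sub_indices and weights must cover the same parameters")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sum = sum(sub_indices[name] * weights[name] for name in sub_indices)
    return weighted_sum / total_weight


# Hypothetical sub-index scores and weights for three of the listed parameters.
example_scores = {"dissolved_oxygen": 80.0, "nitrate": 60.0, "electrical_conductivity": 70.0}
example_weights = {"dissolved_oxygen": 0.5, "nitrate": 0.3, "electrical_conductivity": 0.2}
wqi = weighted_wqi(example_scores, example_weights)
print(f"WQI = {wqi:.1f}")
```

An ML model for WQI prediction would typically learn to map the raw parameter measurements to a value like this one, which is why the index definition matters for interpreting reported errors.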
What Features in Prompts Jailbreak LLMs? Investigating the Mechanisms Behind Attacks
Kirch, Nathalie Maria, Field, Severin, Casper, Stephen
While 'jailbreaks' have been central to research on the safety and reliability of LLMs (large language models), the underlying mechanisms behind these attacks are not well understood. Some prior works have used linear methods to analyze jailbreak prompts or model refusal. Here, however, we compare linear and nonlinear methods to study the features in prompts that contribute to successful jailbreaks. We do this by probing for jailbreak success based only on the portions of the latent representations corresponding to prompt tokens. First, we introduce a dataset of 10,800 jailbreak attempts from 35 attack methods. We then show that different jailbreaking methods work via different nonlinear features in prompts. Specifically, we find that while probes can distinguish between successful and unsuccessful jailbreaking prompts with a high degree of accuracy, they often transfer poorly to held-out attack methods. We also show that nonlinear probes can be used to mechanistically jailbreak the LLM by guiding the design of adversarial latent perturbations. These mechanistic jailbreaks are able to jailbreak Gemma-7B-IT more reliably than 34 of the 35 techniques that it was trained on. Ultimately, our results suggest that jailbreaks cannot be thoroughly understood in terms of universal or linear prompt features alone.
MolCap-Arena: A Comprehensive Captioning Benchmark on Language-Enhanced Molecular Property Prediction
Edwards, Carl, Lu, Ziqing, Hajiramezanali, Ehsan, Biancalani, Tommaso, Ji, Heng, Scalia, Gabriele
Bridging biomolecular modeling with natural language information, particularly through large language models (LLMs), has recently emerged as a promising interdisciplinary research area. LLMs, having been trained on large corpora of scientific documents, demonstrate significant potential in understanding and reasoning about biomolecules by providing enriched contextual and domain knowledge. However, the extent to which LLM-driven insights can improve performance on complex predictive tasks (e.g., toxicity) remains unclear. Further, the extent to which relevant knowledge can be extracted from LLMs also remains unknown. In this study, we present Molecule Caption Arena: the first comprehensive benchmark of LLM-augmented molecular property prediction. We evaluate over twenty LLMs, including both general-purpose and domain-specific molecule captioners, across diverse prediction tasks. To this end, we introduce a novel, battle-based rating system. Our findings confirm the ability of LLM-extracted knowledge to enhance state-of-the-art molecular representations, with notable model-, prompt-, and dataset-specific variations. Code, resources, and data are available at github.com/Genentech/molcap-arena.
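The abstract does not describe how the battle-based rating system works. As a rough illustration of the general idea of rating from pairwise "battles", the sketch below uses a standard Elo update; this is an assumption swapped in for illustration, not the authors' method, and the captioner names, starting rating of 1000, and K-factor of 32 are hypothetical.

```python
def expected_score(rating_a, rating_b):
    """Elo-model probability that A beats B, given current ratings."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(ratings, winner, loser, k=32.0):
    """Apply one Elo update in place after `winner` beats `loser`."""
    exp_win = expected_score(ratings[winner], ratings[loser])
    delta = k * (1.0 - exp_win)  # larger gain for an unexpected win
    ratings[winner] += delta
    ratings[loser] -= delta
    return ratings


# Two hypothetical captioners start level; one "battle" is decided, for
# example, by which caption yields the lower downstream prediction error.
ratings = {"captioner_a": 1000.0, "captioner_b": 1000.0}
update_elo(ratings, "captioner_a", "captioner_b")
print(ratings)
```

A pairwise scheme of this kind lets many captioners be ranked on a common scale without every model being compared on every dataset, which may be one reason to prefer it over averaging raw metrics.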
Biohybrid Microrobots Based on Jellyfish Stinging Capsules and Janus Particles for In Vitro Deep-Tissue Drug Penetration
Park, Sinwook, Barak, Noga, Lotan, Tamar, Yossifon, Gilad
Microrobots engineered from self-propelling active particles extend the reach of robotic operations to submillimeter dimensions and are becoming increasingly relevant for various tasks, such as manipulation of micro/nanoscale cargo, particularly targeted drug delivery. However, achieving deep-tissue penetration and drug delivery remains a challenge. This work developed a novel biohybrid microrobot consisting of jellyfish stinging capsules, which act as natural nanoinjectors for efficient penetration and delivery, assembled onto an active Janus particle (JP). While microrobot transport and navigation were externally controlled by magnetic field-induced rolling, capsule loading onto the JP surface was controlled by an electric field. Following precise navigation of the biohybrid microrobots to the vicinity of target tissues, the capsules were activated by a specific enzyme introduced to the solution, which then triggered tubule ejection and release of the preloaded molecules. Use of such microrobots to penetrate targeted cancer spheroids and live Caenorhabditis elegans and to deliver the preloaded drug/toxin to them was demonstrated in vitro. The findings offer insights for future development of bio-inspired microrobots capable of deep penetration and drug delivery. Future directions may involve encapsulation of various drugs within different capsule types for enhanced versatility. This study may also inspire in vivo applications involving deep-tissue drug delivery.
Expert-level protocol translation for self-driving labs
Shi, Yu-Zhe, Meng, Fanxu, Hou, Haofei, Bi, Zhangqian, Xu, Qiao, Ruan, Lecheng, Wang, Qining
Recent developments in Artificial Intelligence (AI) models have propelled their application in scientific discovery, but the validation and exploration of these discoveries require subsequent empirical experimentation. The concept of self-driving laboratories promises to automate and thus boost the experimental process following AI-driven discoveries. However, translating experimental protocols, originally crafted for human comprehension, into formats interpretable by machines presents significant challenges. Within the context of a specific expert domain, these include the need for structured rather than natural language, the need for explicit rather than tacit knowledge, and the preservation of causality and consistency throughout protocol steps. Presently, protocol translation predominantly requires the manual, labor-intensive involvement of domain experts and information technology specialists, which makes the process slow. To address these issues, we propose a framework that automates the protocol translation process through a three-stage workflow, which incrementally constructs Protocol Dependence Graphs (PDGs) that are structured at the syntax level, completed at the semantics level, and linked at the execution level. Quantitative and qualitative evaluations have demonstrated performance on par with that of human experts, underscoring the framework's potential to significantly expedite and democratize the process of scientific discovery by elevating the automation capabilities within self-driving laboratories.
Explainable few-shot learning workflow for detecting invasive and exotic tree species
Gevaert, Caroline M., Pedro, Alexandra Aguiar, Ku, Ou, Cheng, Hao, Chandramouli, Pranav, Javan, Farzaneh Dadrass, Nattino, Francesco, Georgievska, Sonja
Deep Learning methods are notorious for relying on extensive labeled datasets to train and assess their performance. This can cause difficulties in practical situations where models must be trained for new applications for which very little data is available. While few-shot learning algorithms can address the first problem, they still lack sufficient explanations for the results. This research tackles both challenges by proposing an explainable few-shot learning workflow for detecting invasive and exotic tree species in the Atlantic Forest of Brazil using Unmanned Aerial Vehicle (UAV) images. By integrating a Siamese network with explainable AI (XAI), the workflow enables the classification of tree species with minimal labeled data while providing visual, case-based explanations for the predictions. Results demonstrate the effectiveness of the proposed workflow in identifying new tree species, even in data-scarce conditions. With a lightweight backbone, e.g., MobileNet, it achieves an F1-score of 0.86 in 3-shot learning, outperforming a shallow CNN. A set of explanation metrics, i.e., correctness, continuity, and contrastivity, accompanied by visual cases, provides further insights about the prediction results. This approach opens new avenues for using AI and UAVs in forest management and biodiversity conservation, particularly concerning rare or under-studied species.
Scalable AI Framework for Defect Detection in Metal Additive Manufacturing
Phan, Duy Nhat, Jha, Sushant, Mavo, James P., Lanigan, Erin L., Nguyen, Linh, Poudel, Lokendra, Bhowmik, Rahul
Additive Manufacturing (AM) is transforming the manufacturing sector by enabling efficient production of intricately designed products and small-batch components. However, metal parts produced via AM can include flaws that cause inferior mechanical properties, including reduced fatigue response, yield strength, and fracture toughness. To address this issue, we leverage convolutional neural networks (CNN) to analyze thermal images of printed layers, automatically identifying anomalies that impact these properties. We also investigate various synthetic data generation techniques to address limited and imbalanced AM training data. Our models' defect detection capabilities were assessed using images of Nickel alloy 718 layers produced on a laser powder bed fusion AM machine and synthetic datasets with and without added noise. Our results show significant accuracy improvements with synthetic data, emphasizing the importance of expanding training sets for reliable defect detection. Specifically, Generative Adversarial Networks (GAN)-generated datasets streamlined data preparation by eliminating human intervention while maintaining high performance, thereby enhancing defect detection capabilities. Additionally, our denoising approach effectively improves image quality, ensuring reliable defect detection. Finally, our work integrates these models in the CLoud ADditive MAnufacturing (CLADMA) module, a user-friendly interface, to enhance their accessibility and practicality for AM applications. This integration supports broader adoption and practical implementation of advanced defect detection in AM processes.
A KAN-based Interpretable Framework for Process-Informed Prediction of Global Warming Potential
Lee, Jaewook, Sun, Xinyang, Errington, Ethan, Guo, Miao
Accurate prediction of Global Warming Potential (GWP) is essential for assessing the environmental impact of chemical processes and materials. Traditional GWP prediction models rely predominantly on molecular structure, overlooking critical process-related information. In this study, we present an integrative GWP prediction model that combines molecular descriptors (MACCS keys and Mordred descriptors) with process information (process title, description, and location) to improve predictive accuracy and interpretability. Using a deep neural network (DNN) model, we achieved an R-squared of 86% on test data with Mordred descriptors, process location, and description information, a 25-percentage-point improvement over the previous benchmark of 61%. XAI analysis further highlighted the significant role of process title embeddings in enhancing model predictions. To enhance interpretability, we employed a Kolmogorov-Arnold Network (KAN) to derive a symbolic formula for GWP prediction that captures key molecular and process features. This formula provides a transparent, interpretable alternative to black-box models and enables users to gain insights into the molecular and process factors influencing GWP. Error analysis showed that the model performs reliably in densely populated data ranges, with increased uncertainty for higher GWP values. This analysis allows users to manage prediction uncertainty effectively, supporting data-driven decision-making in chemical and process design. Our results suggest that integrating both molecular and process-level information in GWP prediction models yields substantial gains in accuracy and interpretability, offering a valuable tool for sustainability assessments. Future work may extend this approach to additional environmental impact categories and refine the model to further enhance its predictive reliability.
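For readers less familiar with the reported metric, the sketch below computes R-squared from its standard definition, 1 - SS_res / SS_tot, and works through the arithmetic behind the comparison of 86% with 61%. The toy values in `y_true` and `y_pred` are made up for illustration and are not data from the study.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when y_true is constant")
    return 1.0 - ss_res / ss_tot


# Toy values, not from the paper.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.2, 3.8]
toy_r2 = r_squared(y_true, y_pred)

# The two figures quoted in the abstract, compared two ways.
new_r2, old_r2 = 0.86, 0.61
absolute_gain_points = (new_r2 - old_r2) * 100    # percentage points
relative_gain_percent = (new_r2 - old_r2) / old_r2 * 100
print(f"toy R^2 = {toy_r2:.2f}")
print(f"gain = {absolute_gain_points:.0f} points, {relative_gain_percent:.0f}% relative")
```

The gap between the two reported values is 25 percentage points in absolute terms, which corresponds to a relative gain of roughly 41% over the 61% baseline.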
Reinforcement Learning Gradients as Vitamin for Online Finetuning Decision Transformers
Yan, Kai, Schwing, Alexander G., Wang, Yu-Xiong
Decision Transformers have recently emerged as a new and compelling paradigm for offline Reinforcement Learning (RL), completing a trajectory in an autoregressive way. While improvements have been made to overcome initial shortcomings, online finetuning of decision transformers has been surprisingly under-explored. The widely adopted state-of-the-art Online Decision Transformer (ODT) still struggles when pretrained with low-reward offline data. In this paper, we theoretically analyze the online finetuning of the decision transformer, showing that a commonly used Return-To-Go (RTG) that is far from the expected return hampers the online finetuning process. This problem, however, is well addressed by the value function and advantage of standard RL algorithms. As suggested by our analysis, our experiments show that simply adding TD3 gradients to the finetuning process of ODT effectively improves its online finetuning performance, especially if ODT is pretrained with low-reward offline data. These findings provide new directions to further improve decision transformers.
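As background for the RTG quantity discussed above, here is a minimal sketch of how return-to-go is usually computed from a logged trajectory: at each step it is the sum of the rewards from that step onward. The helper name `returns_to_go` and the optional `gamma` argument are illustrative assumptions; this is not the authors' code.

```python
def returns_to_go(rewards, gamma=1.0):
    """Return-to-go at each timestep: the (optionally discounted) sum of
    rewards from that step to the end of the trajectory.

    gamma=1.0 gives the undiscounted sum; gamma is an illustrative option.
    """
    rtg = [0.0] * len(rewards)
    running = 0.0
    # Sweep backwards so each step reuses the suffix sum already computed.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        rtg[t] = running
    return rtg


# Hypothetical three-step trajectory.
trajectory_rewards = [1.0, 0.0, 2.0]
rtg = returns_to_go(trajectory_rewards)
print(rtg)  # [3.0, 2.0, 2.0]
```

During online rollouts the RTG fed to the model is a chosen target rather than a value computed from logged rewards, so it can sit far from the return the current policy actually achieves. That gap is the situation the abstract identifies as harmful to finetuning.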