Wang, Xiaonan
MoMa: A Modular Deep Learning Framework for Material Property Prediction
Wang, Botian, Ouyang, Yawen, Li, Yaohui, Wang, Yiqun, Cui, Haorui, Zhang, Jianbing, Wang, Xiaonan, Ma, Wei-Ying, Zhou, Hao
Deep learning methods for material property prediction have been widely explored to advance materials discovery. However, the prevailing pre-train-then-fine-tune paradigm often fails to address the inherent diversity and disparity of material tasks. To overcome these challenges, we introduce MoMa, a Modular framework for Materials that first trains specialized modules across a wide range of tasks and then adaptively composes synergistic modules tailored to each downstream scenario. Evaluation across 17 datasets demonstrates the superiority of MoMa, with a substantial 14% average improvement over the strongest baseline. Few-shot and continual learning experiments further highlight MoMa's potential for real-world applications. Pioneering a new paradigm of modular material learning, MoMa will be open-sourced to foster broader community collaboration.
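The abstract does not spell out MoMa's module or composition interfaces, so the following is only a minimal sketch of the train-specialized-modules-then-compose idea; all class and function names (PropertyModule, compose) are invented for illustration and are not MoMa's actual API.

```python
# Illustrative sketch only: module and composition names are hypothetical.
import torch
import torch.nn as nn

class PropertyModule(nn.Module):
    """A small task-specialized block trained on one upstream material task."""
    def __init__(self, dim: int):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, x):
        return self.net(x)

def compose(modules, weights, x):
    """Adaptively combine module outputs with task-specific weights
    (one plausible reading of 'composing synergistic modules')."""
    w = torch.softmax(weights, dim=0)
    return sum(wi * m(x) for wi, m in zip(w, modules))

# Downstream use: freeze the upstream modules and learn only the mixing
# weights plus a lightweight prediction head on the target dataset.
modules = [PropertyModule(64) for _ in range(4)]
weights = nn.Parameter(torch.zeros(len(modules)))
head = nn.Linear(64, 1)
x = torch.randn(8, 64)                      # batch of material representations
y_hat = head(compose(modules, weights, x))  # predicted property values
```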
ErgoChat: a Visual Query System for the Ergonomic Risk Assessment of Construction Workers
Fan, Chao, Mei, Qipei, Wang, Xiaonan, Li, Xinming
In the construction sector, workers often endure prolonged periods of high-intensity physical work and extended tool use, resulting in injuries and illnesses primarily linked to postural ergonomic risks, a longstanding and predominant health concern. To mitigate these risks, researchers have applied various technological methods to identify the ergonomic risks that construction workers face. However, traditional ergonomic risk assessment (ERA) techniques do not offer interactive feedback. Rapidly developing vision-language models (VLMs), capable of generating textual descriptions or answering questions about ergonomic risks based on image inputs, have not yet received widespread attention. This research introduces an interactive visual query system tailored to assess the postural ergonomic risks of construction workers. The system's capabilities include visual question answering (VQA), which responds to visual queries regarding workers' exposure to postural ergonomic risks, and image captioning (IC), which generates textual descriptions of these risks from images. Additionally, this study proposes a dataset designed for training and testing such methodologies. Systematic testing indicates that the VQA functionality achieves an accuracy of 96.5%. Moreover, evaluations using nine metrics for IC and assessments from human experts indicate that the proposed approach surpasses a method with the same architecture trained solely on generic datasets. This study sets a new direction for future developments in interactive ERA using generative artificial intelligence (AI) technologies.
Keywords: Generative Artificial Intelligence; Vision-Language Model; Large Language Model; Ergonomic Risk Assessment; Construction Safety
1 Introduction
Prompt and effective identification and mitigation of workplace hazards are essential for maintaining safety, health, and productivity within the work environment. In the construction industry, workers are often subject to conditions that require awkward body postures, repetitive motions, and intense physical effort, which can detrimentally impact their health [1]. Such conditions in construction tasks often lead to work-related musculoskeletal disorders (WMSDs). Statistics from the United States Bureau of Labor Statistics show that injuries and illnesses caused by WMSDs in the construction industry ranked fifth among all industries; moreover, in the same year, WMSDs represented 30% of all occupational injuries and illnesses [1]. According to the Association of Workers' Compensation Boards of Canada, the manufacturing and construction sectors reported the second- and third-highest rates of lost-time injury claims in 2021, representing 13.6% and 10.4% of claims, respectively [2]. The European Agency for Safety and Health at Work indicated that the construction and manufacturing sectors reported the highest sick-leave rates due to WMSDs [3].
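As an illustration of the visual-query interaction pattern described in the abstract above (not ErgoChat's own fine-tuned model or dataset), an off-the-shelf VLM such as BLIP can be asked an ergonomic question about a site photo; the image path and question below are made up.

```python
# Generic VQA query with an off-the-shelf VLM; this is a stand-in for the
# paper's purpose-trained system, used only to show the interaction pattern.
from PIL import Image
from transformers import BlipProcessor, BlipForQuestionAnswering

processor = BlipProcessor.from_pretrained("Salesforce/blip-vqa-base")
model = BlipForQuestionAnswering.from_pretrained("Salesforce/blip-vqa-base")

image = Image.open("worker_on_site.jpg")  # hypothetical construction-site photo
question = "Is the worker's trunk bent forward?"

inputs = processor(image, question, return_tensors="pt")
out = model.generate(**inputs)
print(processor.decode(out[0], skip_special_tokens=True))
```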
KULTURE Bench: A Benchmark for Assessing Language Model in Korean Cultural Context
Wang, Xiaonan, Yeo, Jinyoung, Lim, Joon-Ho, Kim, Hansaem
Large language models have exhibited significant enhancements in performance across various tasks. However, the complexity of their evaluation increases as these models generate more fluent and coherent content. Current multilingual benchmarks often use translated English versions, which may incorporate Western cultural biases and therefore fail to accurately assess other languages and cultures. To address this research gap, we introduce KULTURE Bench, an evaluation framework specifically designed for Korean culture that features datasets of cultural news, idioms, and poetry. It is designed to assess language models' cultural comprehension and reasoning capabilities at the word, sentence, and paragraph levels. Using KULTURE Bench, we assessed the capabilities of models trained with different language corpora and analyzed the results comprehensively. The results show that there is still significant room for improvement in the models' understanding of texts related to the deeper aspects of Korean culture.
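The benchmark's exact item format and evaluation harness are not given here; assuming multiple-choice items at the word, sentence, or paragraph level, a scoring loop might look like the sketch below, where query_model is a placeholder for whichever LLM is under evaluation.

```python
# Minimal scoring sketch, not the official KULTURE Bench harness.
from typing import Callable

def accuracy(items: list[dict], query_model: Callable[[str], str]) -> float:
    """items: [{'question': ..., 'choices': [...], 'answer': 'B'}, ...] (assumed format)."""
    correct = 0
    for item in items:
        prompt = (
            item["question"] + "\n"
            + "\n".join(f"{label}. {text}" for label, text in zip("ABCD", item["choices"]))
            + "\nAnswer with a single letter."
        )
        prediction = query_model(prompt).strip()[:1].upper()
        correct += prediction == item["answer"]
    return correct / len(items)
```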
Integrating Chemistry Knowledge in Large Language Models via Prompt Engineering
Liu, Hongxuan, Yin, Haoyu, Luo, Zhiyao, Wang, Xiaonan
This paper presents a study on the integration of domain-specific knowledge in prompt engineering to enhance the performance of large language models (LLMs) in scientific domains. A benchmark dataset is curated to encapsulate the intricate physical-chemical properties of small molecules and their druggability for pharmacology, alongside the functional attributes of enzymes and crystal materials, underscoring the relevance and applicability across biological and chemical domains. The proposed domain-knowledge embedded prompt engineering method outperforms traditional prompt engineering strategies on various metrics, including capability, accuracy, F1 score, and hallucination drop. The effectiveness of the method is demonstrated through case studies on complex materials and molecules, including the MacMillan catalyst, paclitaxel, and lithium cobalt oxide. The results suggest that domain-knowledge prompts can guide LLMs to generate more accurate and relevant responses, highlighting the potential of LLMs as powerful tools for scientific discovery and innovation when equipped with domain-specific prompts. The study also discusses limitations and future directions for domain-specific prompt engineering development.
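The paper's exact prompt templates are not reproduced in this abstract; the sketch below merely contrasts a plain prompt with a domain-knowledge embedded prompt, using a couple of well-known facts about paclitaxel as the injected knowledge.

```python
# Illustrative contrast between a plain prompt and a domain-knowledge
# embedded prompt; templates and knowledge source are assumptions, not
# the paper's own implementation.
def plain_prompt(molecule: str, question: str) -> str:
    return f"Answer the following question about {molecule}: {question}"

def domain_prompt(molecule: str, question: str, knowledge: dict) -> str:
    facts = "\n".join(f"- {k}: {v}" for k, v in knowledge.items())
    return (
        f"You are a chemistry assistant. Known properties of {molecule}:\n"
        f"{facts}\n"
        f"Using only chemically consistent reasoning, answer: {question}"
    )

knowledge = {  # example facts for paclitaxel
    "molecular formula": "C47H51NO14",
    "drug class": "taxane chemotherapy agent",
}
print(domain_prompt("paclitaxel", "What is its mechanism of action?", knowledge))
```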
An Artificial Intelligence (AI) workflow for catalyst design and optimization
Lai, Nung Siong, Tew, Yi Shen, Zhong, Xialin, Yin, Jun, Li, Jiali, Yan, Binhang, Wang, Xiaonan
In the pursuit of novel catalyst development to address pressing environmental concerns and energy demand, conventional design and optimization methods often fall short due to the complexity and vastness of the catalyst parameter space. The advent of Machine Learning (ML) has ushered in a new era in the field of catalyst optimization, offering potential solutions to the shortcomings of traditional techniques. However, existing methods fail to effectively harness the wealth of information contained within the burgeoning body of scientific literature on catalyst synthesis. To address this gap, this study proposes an innovative Artificial Intelligence (AI) workflow that integrates Large Language Models (LLMs), Bayesian optimization, and an active learning loop to expedite and enhance catalyst optimization. Our methodology combines advanced language understanding with robust optimization strategies, effectively translating knowledge extracted from diverse literature into actionable parameters for practical experimentation and optimization. In this article, we demonstrate the application of this AI workflow in the optimization of catalyst synthesis for ammonia production. The results underscore the workflow's ability to streamline the catalyst development process, offering a swift, resource-efficient, and high-precision alternative to conventional methods.
Keywords: Catalysts; Large Language Models; Active Learning; Bayesian Optimization; Ammonia Synthesis
1. Introduction
The development of novel catalysts to address increasing energy demand and consumption has become an urgent task in the realm of renewable energy. This surge is driven not only by escalating demands from applications in process optimization, yield improvement, and energy saving, but also by a heightened awareness of environmental issues, particularly the increase in carbon dioxide emissions. Several optimization strategies are conventionally employed to identify the optimal set of condition parameters and thereby enhance the performance of the catalyst. The 'One Factor At a Time' (OFAT) method is frequently employed as an alternative technique for chemical process optimization and comprehension. While these conventional optimization methods and their advancements have undeniably made significant contributions to the field, certain gaps persist that limit their full potential in optimizing catalyst synthesis. The predominant reliance on the empirical knowledge and intuition of seasoned chemists, while invaluable, is not systematically scalable and transferable. Techniques like OFAT and Design of Experiments (DoE), though statistically rigorous, are often unable to keep pace with the sheer complexity and vastness of the catalyst parameter space, leaving much of it unexplored and underutilized.
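To make the Bayesian-optimization and active-learning portion of such a workflow concrete, here is a minimal sketch (not the authors' implementation): a Gaussian-process surrogate with an expected-improvement acquisition proposes the next synthesis conditions, run_experiment stands in for an actual catalyst test, and the parameter bounds are illustrative placeholders for ranges that an LLM might extract from the literature.

```python
# Minimal Bayesian-optimization / active-learning loop; all parameter names,
# bounds, and the objective are illustrative assumptions.
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def run_experiment(x):  # hypothetical: [temperature (K), metal loading (wt%)]
    return -((x[0] - 700) / 100) ** 2 - (x[1] - 2.0) ** 2 + np.random.normal(0, 0.05)

rng = np.random.default_rng(0)
bounds = np.array([[500, 900], [0.5, 5.0]])               # stand-in literature ranges
X = rng.uniform(bounds[:, 0], bounds[:, 1], size=(5, 2))  # initial design
y = np.array([run_experiment(x) for x in X])

gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
for _ in range(10):                                   # active learning iterations
    gp.fit(X, y)
    cand = rng.uniform(bounds[:, 0], bounds[:, 1], size=(256, 2))
    mu, sigma = gp.predict(cand, return_std=True)
    improve = mu - y.max()
    z = improve / np.maximum(sigma, 1e-9)
    ei = improve * norm.cdf(z) + sigma * norm.pdf(z)  # expected improvement
    x_next = cand[np.argmax(ei)]                      # next conditions to test
    X = np.vstack([X, x_next])
    y = np.append(y, run_experiment(x_next))

print("best conditions so far:", X[np.argmax(y)], "objective:", y.max())
```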
A Heterogeneous Parallel Non-von Neumann Architecture System for Accurate and Efficient Machine Learning Molecular Dynamics
Zhao, Zhuoying, Tan, Ziling, Mo, Pinghui, Wang, Xiaonan, Zhao, Dan, Zhang, Xin, Tao, Ming, Liu, Jie
This paper proposes a special-purpose system to achieve high-accuracy and high-efficiency machine learning (ML) molecular dynamics (MD) calculations. The system consists of a field-programmable gate array (FPGA) and an application-specific integrated circuit (ASIC) working in heterogeneous parallel. Specifically, a multiplication-less neural network (NN) is deployed on the non-von Neumann (NvN) ASIC (SilTerra 180 nm process) to evaluate atomic forces, which is the most computationally expensive part of MD. All other MD calculations are performed on the FPGA (Xilinx XC7Z100). It is shown that, to achieve similar accuracy, the proposed NvN-based system built on a low-end fabrication technology (180 nm) is 1.6x faster and 10^2-10^3x more energy efficient than state-of-the-art vN-based ML MD running on graphics processing units (GPUs) fabricated with much more advanced technology (12 nm), indicating the superiority of the proposed NvN-based heterogeneous parallel architecture.
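The abstract does not detail how multiplications are removed from the deployed network; one common multiplication-free scheme is the AdderNet-style layer sketched below, which scores inputs against weights with a negative L1 distance using only additions and absolute values. Whether the ASIC network uses exactly this scheme is an assumption.

```python
# Sketch of a multiplication-free (adder-style) layer; illustrative only.
import numpy as np

def adder_layer(x: np.ndarray, w: np.ndarray) -> np.ndarray:
    """x: (batch, in_features), w: (out_features, in_features).
    Each output is -sum_j |x_j - w_ij|, computed with additions and
    absolute values only, so no hardware multipliers are needed."""
    return -np.abs(x[:, None, :] - w[None, :, :]).sum(axis=-1)

x = np.random.randn(4, 8)       # e.g. local atomic-environment descriptors
w = np.random.randn(16, 8)
print(adder_layer(x, w).shape)  # (4, 16)
```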