Goto

Collaborating Authors: Feng, Nan


Predicting performance-related properties of refrigerant based on tailored small-molecule functional group contribution

arXiv.org Artificial Intelligence

As current group contribution (GC) methods are mostly proposed for molecules spanning a wide size range, applying them to property prediction of small refrigerant molecules can lead to unacceptable errors. Consequently, for the design of novel refrigerants and refrigeration systems, tailoring GC-based models specifically fitted to refrigerant molecules is of great interest. In this work, databases of potential refrigerant molecules are first collected, focusing on five key properties related to the operational efficiency of refrigeration systems: normal boiling point, critical temperature, critical pressure, enthalpy of vaporization, and acentric factor. Based on tailored small-molecule groups, the GC method is combined with machine learning (ML) to model these performance-related properties. Following the development of the GC-ML models, their performance is analyzed to highlight the potential group-to-property contributions. Additionally, the refrigerant property databases are extended internally and externally, and examples based on the extended data are presented to highlight the significance of the developed models.
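To make the GC-ML idea concrete, below is a minimal sketch (not the paper's actual model or group scheme) of a group-contribution feature pipeline: occurrence counts of small-molecule functional groups are computed from SMILES with RDKit and fed to a scikit-learn regressor. The SMARTS group definitions and the property targets are illustrative placeholders, not values from the paper's databases.

```python
# Minimal group-contribution + ML sketch (illustrative; not the paper's tailored groups).
import numpy as np
from rdkit import Chem
from sklearn.ensemble import RandomForestRegressor

# Hypothetical small-molecule group definitions (SMARTS patterns are illustrative).
GROUPS = {
    "CH3": "[CX4H3]",
    "CH2": "[CX4H2]",
    "CH":  "[CX4H1]",
    "F":   "[F]",
    "Cl":  "[Cl]",
    "C=C": "[CX3]=[CX3]",
}

def group_counts(smiles: str) -> np.ndarray:
    """Vector of occurrence counts for each group in one molecule."""
    mol = Chem.MolFromSmiles(smiles)
    return np.array(
        [len(mol.GetSubstructMatches(Chem.MolFromSmarts(p))) for p in GROUPS.values()],
        dtype=float,
    )

# Example refrigerant SMILES (R-32, R-134a, R-1234yf, R-290).
smiles_list = ["FCF", "FCC(F)(F)F", "C=C(F)C(F)(F)F", "CCC"]
X = np.vstack([group_counts(s) for s in smiles_list])

# Placeholder targets: replace with measured properties (e.g., critical temperature in K)
# from a real refrigerant database.
y = np.random.default_rng(0).uniform(300.0, 400.0, size=len(smiles_list))

model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)
print(model.predict(X[:1]))  # predicted property for the first molecule
```

The same group-count feature matrix could be paired with other regressors; the key design choice in a GC-ML approach is the group vocabulary, which here would be tailored to small refrigerant molecules rather than a generic wide size range.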


Dr.E Bridges Graphs with Large Language Models through Words

arXiv.org Artificial Intelligence

Significant efforts have been directed toward integrating powerful Large Language Models (LLMs) with diverse modalities, particularly the fusion of vision, language, and audio data. However, graph-structured data, inherently rich in structural and domain-specific knowledge, have not yet been gracefully adapted to LLMs. Existing methods either describe the graph with raw text, losing graph structural information, or feed Graph Neural Network (GNN) embeddings directly into the LLM at the cost of losing semantic representation. To bridge this gap, we introduce an innovative, end-to-end modality-aligning framework equipped with a pretrained Dual-Residual Vector Quantized-Variational AutoEncoder (Dr.E). This framework is specifically designed to facilitate token-level alignment with LLMs, enabling an effective translation of the intrinsic "language" of graphs into comprehensible natural language. Our experimental evaluations on standard GNN node classification tasks demonstrate competitive performance against other state-of-the-art approaches. Additionally, our framework ensures interpretability, efficiency, and robustness, with its effectiveness further validated under both fine-tuning and few-shot settings. This study marks the first successful endeavor to achieve token-level alignment between GNNs and LLMs.
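As a rough illustration of the core mechanism behind VQ-based token alignment, the sketch below shows a generic vector-quantization layer in PyTorch: continuous GNN node embeddings are snapped to their nearest codebook entries, yielding discrete token ids, with the standard VQ-VAE commitment loss and straight-through gradient. This is a minimal generic VQ building block under stated assumptions, not the authors' Dr.E architecture (which adds dual residual paths and LLM-side alignment); the embedding sizes and codebook size are arbitrary.

```python
# Generic vector-quantization sketch (illustrative; not the Dr.E implementation).
import torch
import torch.nn as nn
import torch.nn.functional as F

class VectorQuantizer(nn.Module):
    """Nearest-neighbour codebook lookup with a straight-through gradient,
    the standard VQ-VAE-style building block for turning embeddings into tokens."""
    def __init__(self, num_codes: int, dim: int, beta: float = 0.25):
        super().__init__()
        self.codebook = nn.Embedding(num_codes, dim)
        nn.init.uniform_(self.codebook.weight, -1.0 / num_codes, 1.0 / num_codes)
        self.beta = beta

    def forward(self, z):  # z: (batch, dim) continuous embeddings, e.g. GNN outputs
        # Squared L2 distance from each embedding to every codebook vector.
        d = (z.pow(2).sum(1, keepdim=True)
             - 2 * z @ self.codebook.weight.t()
             + self.codebook.weight.pow(2).sum(1))
        codes = d.argmin(dim=1)        # discrete token ids
        z_q = self.codebook(codes)     # quantized embeddings
        # Codebook + commitment losses (standard VQ-VAE objective).
        loss = F.mse_loss(z_q, z.detach()) + self.beta * F.mse_loss(z, z_q.detach())
        # Straight-through estimator: copy gradients from z_q back to z.
        z_q = z + (z_q - z).detach()
        return z_q, codes, loss

# Usage: quantize placeholder "GNN" node embeddings into word-like token ids.
gnn_embeddings = torch.randn(8, 64)          # stand-in for real GNN node outputs
vq = VectorQuantizer(num_codes=512, dim=64)
quantized, token_ids, vq_loss = vq(gnn_embeddings)
print(token_ids.shape, vq_loss.item())
```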