Liu, Siyuan
Beyond Atoms: Enhancing Molecular Pretrained Representations with 3D Space Modeling
Lu, Shuqi, Ji, Xiaohong, Zhang, Bohang, Yao, Lin, Liu, Siyuan, Gao, Zhifeng, Zhang, Linfeng, Ke, Guolin
Molecular pretrained representations (MPR) have emerged as a powerful approach for addressing the challenge of limited supervised data in applications such as drug discovery and material design. While early MPR methods relied on 1D sequences and 2D graphs, recent advancements have incorporated 3D conformational information to capture rich atomic interactions. However, these prior models treat molecules merely as discrete atom sets, overlooking the space surrounding them. We argue from a physical perspective that modeling only these discrete points is insufficient. We first present a simple yet insightful observation: naively adding randomly sampled virtual points beyond atoms can surprisingly enhance MPR performance. In light of this, we propose a principled framework that incorporates the entire 3D space spanned by molecules. We implement the framework via a novel Transformer-based architecture, dubbed SpaceFormer, with three key components: (1) grid-based space discretization; (2) grid sampling/merging; and (3) efficient 3D positional encoding. Extensive experiments show that SpaceFormer significantly outperforms previous 3D MPR models across various downstream tasks with limited data, validating the benefit of leveraging the additional 3D space beyond atoms in MPR models.
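As a rough illustration of what "grid-based space discretization" could mean, the sketch below bins atoms into cubic cells and enumerates the empty cells inside the bounding box they span. It is not SpaceFormer's implementation: the function names `discretize` and `bounding_cells`, the cell size, and the toy geometry are all assumptions made for this example.

```python
import math
from collections import defaultdict


def discretize(atoms, cell_size=1.0):
    """Map each atom (element, (x, y, z)) to an integer grid cell.

    Returns a dict: cell index (i, j, k) -> list of elements in that cell.
    Hypothetical helper, not taken from the paper.
    """
    cells = defaultdict(list)
    for element, (x, y, z) in atoms:
        idx = (
            math.floor(x / cell_size),
            math.floor(y / cell_size),
            math.floor(z / cell_size),
        )
        cells[idx].append(element)
    return dict(cells)


def bounding_cells(cells):
    """List every cell in the axis-aligned box spanned by the occupied cells."""
    lows = [min(c[d] for c in cells) for d in range(3)]
    highs = [max(c[d] for c in cells) for d in range(3)]
    return [
        (i, j, k)
        for i in range(lows[0], highs[0] + 1)
        for j in range(lows[1], highs[1] + 1)
        for k in range(lows[2], highs[2] + 1)
    ]


# Toy, water-like geometry (coordinates are illustrative only).
atoms = [
    ("O", (0.0, 0.0, 0.0)),
    ("H", (0.96, 0.0, 0.0)),
    ("H", (-0.24, 0.93, 0.0)),
]
occupied = discretize(atoms, cell_size=0.5)
all_cells = bounding_cells(occupied)
# Cells with no atom: the "space beyond atoms" in this toy setting.
empty = [c for c in all_cells if c not in occupied]
```

In this toy setting the empty cells are the part of the input that an atom-only model would never see; how such cells are sampled, merged, and encoded is specific to the paper and is not reproduced here.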
ReLEP: A Novel Framework for Real-world Long-horizon Embodied Planning
Liu, Siyuan, Du, Jiawei, Xiang, Sicheng, Wang, Zibo, Luo, Dingsheng
Real-world long-horizon embodied planning underpins embodied AI. To accomplish long-horizon tasks, agents need to decompose abstract instructions into detailed steps. Prior works mostly rely on GPT-4V for task decomposition into predefined actions, which limits task diversity due to GPT-4V's finite understanding of larger skillsets. Therefore, we present ReLEP, a groundbreaking framework for Real-world Long-horizon Embodied Planning, which can accomplish a wide range of daily tasks. At its core lies a fine-tuned large vision language model that formulates plans as sequences of skill functions according to the input instruction and scene image. These functions are selected from a carefully designed skill library. ReLEP is also equipped with a Memory module for plan and status recall, and a Robot Configuration module for versatility across robot types. In addition, we propose a semi-automatic data generation pipeline to tackle dataset scarcity. Real-world off-line experiments across eight daily embodied tasks demonstrate that ReLEP is able to accomplish long-horizon embodied tasks and outperforms other state-of-the-art baseline methods.
pFLFE: Cross-silo Personalized Federated Learning via Feature Enhancement on Medical Image Segmentation
Xie, Luyuan, Lin, Manqing, Liu, Siyuan, Xu, ChenMing, Luan, Tianyu, Li, Cong, Fang, Yuejian, Shen, Qingni, Wu, Zhonghai
In medical image segmentation, personalized cross-silo federated learning (FL) is becoming popular for utilizing varied data across healthcare settings to overcome data scarcity and privacy concerns. However, existing methods often suffer from client drift, leading to inconsistent performance and delayed training. We propose a new framework, Personalized Federated Learning via Feature Enhancement (pFLFE), designed to mitigate these challenges. pFLFE consists of two main stages: feature enhancement and supervised learning. The first stage improves differentiation between foreground and background features, and the second uses these enhanced features for learning from segmentation masks. We also design an alternative training approach that requires fewer communication rounds without compromising segmentation quality, even with limited communication resources. Through experiments on three medical segmentation tasks, we demonstrate that pFLFE outperforms the state-of-the-art methods.
Self-Consistency Training for Density-Functional-Theory Hamiltonian Prediction
Zhang, He, Liu, Chang, Wang, Zun, Wei, Xinran, Liu, Siyuan, Zheng, Nanning, Shao, Bin, Liu, Tie-Yan
Predicting the mean-field Hamiltonian matrix in density functional theory is a fundamental formulation to leverage machine learning for solving molecular science problems. Yet, its applicability is limited by insufficient labeled data for training. In this work, we highlight that Hamiltonian prediction possesses a self-consistency principle, based on which we propose self-consistency training, an exact training method that does not require labeled data. It distinguishes the task from predicting other molecular properties by the following benefits: (1) it enables the model to be trained on a large amount of unlabeled data, hence addresses the data scarcity challenge and enhances generalization; (2) it is more efficient than running DFT to generate labels for supervised training, since it amortizes DFT calculation over a set of queries. We empirically demonstrate the better generalization in data-scarce and out-of-distribution scenarios, and the better efficiency over DFT labeling. These benefits push forward the applicability of Hamiltonian prediction to an ever-larger scale.
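The self-consistency principle mentioned above can be written out as follows. This is a sketch under stated assumptions rather than the paper's exact notation: the symbols $\hat{H}_\theta$, $S$, $C$, $\epsilon$, and the squared Frobenius-norm loss are chosen here for illustration.

```latex
% Generalized eigenproblem solved with the predicted Hamiltonian
% (S is the overlap matrix of the atomic basis for molecule M):
\hat{H}_\theta(M)\, C = S\, C\, \epsilon

% A self-consistent prediction reproduces itself when the Hamiltonian is
% rebuilt from its own orbitals C, which suggests a label-free loss:
\mathcal{L}_{\mathrm{sc}}(\theta) =
  \left\lVert \hat{H}_\theta(M) - H_M\!\left(C\right) \right\rVert_F^2
```

Here $H_M(C)$ denotes the DFT Hamiltonian constructed from the orbital coefficients $C$. No reference Hamiltonian appears in the loss, which is why training of this kind can use unlabeled molecular structures.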
End-to-End Crystal Structure Prediction from Powder X-Ray Diffraction
Lai, Qingsi, Yao, Lin, Gao, Zhifeng, Liu, Siyuan, Wang, Hongshuai, Lu, Shuqi, He, Di, Wang, Liwei, Wang, Cheng, Ke, Guolin
Powder X-ray diffraction (PXRD) is a crucial means for crystal structure determination. Such determination often involves external database matching to find a structural analogue and Rietveld refinement to obtain finer structure. However, databases may be incomplete and Rietveld refinement often requires intensive trial-and-error efforts from trained experimentalists, which remains ineffective in practice. To settle these issues, we propose XtalNet, the first end-to-end deep learning-based framework capable of ab initio generation of crystal structures that accurately match given PXRD patterns. The model employs contrastive learning and diffusion-based conditional generation to enable the simultaneous execution of two tasks: crystal structure retrieval based on PXRD patterns and conditional structure generation. To validate the effectiveness of XtalNet, we curate a much more challenging and practical dataset, hMOF-100. XtalNet performs well on this dataset, reaching a 96.3% top-10 hit ratio on the database retrieval task and a 95.0% top-10 match rate on the ranked structure generation task.
Reconstruction of Cortical Surfaces with Spherical Topology from Infant Brain MRI via Recurrent Deformation Learning
Chen, Xiaoyang, Zhao, Junjie, Liu, Siyuan, Ahmad, Sahar, Yap, Pew-Thian
Cortical surface reconstruction (CSR) from MRI is key to investigating brain structure and function. While recent deep learning approaches have significantly improved the speed of CSR, a substantial amount of runtime is still needed to map the cortex to a topologically-correct spherical manifold to facilitate downstream geometric analyses. Moreover, this mapping is possible only if the topology of the surface mesh is homotopic to a sphere. Here, we present a method that performs CSR and spherical mapping simultaneously and efficiently, within seconds. Our approach seamlessly connects two sub-networks for white and pial surface generation. Residual diffeomorphic deformations are learned iteratively to gradually warp a spherical template mesh to the white and pial surfaces while preserving mesh topology and uniformity. The one-to-one vertex correspondence between the template sphere and the cortical surfaces allows easy and direct mapping of geometric features like convexity and curvature to the sphere for visualization and downstream processing. We demonstrate the efficacy of our approach on infant brain MRI, which poses significant challenges to CSR due to tissue contrast changes associated with rapid brain development during the first postnatal year. Performance evaluation based on a dataset of infants from 0 to 12 months demonstrates that our method substantially enhances mesh regularity and reduces geometric errors, outperforming state-of-the-art deep learning approaches, all while maintaining high computational efficiency.
M-OFDFT: Overcoming the Barrier of Orbital-Free Density Functional Theory for Molecular Systems Using Deep Learning
Zhang, He, Liu, Siyuan, You, Jiacheng, Liu, Chang, Zheng, Shuxin, Lu, Ziheng, Wang, Tong, Zheng, Nanning, Shao, Bin
Orbital-free density functional theory (OFDFT) is a quantum chemistry formulation that has a lower cost scaling than the prevailing Kohn-Sham DFT, which is increasingly desired for contemporary molecular research. However, its accuracy is limited by the kinetic energy density functional, which is notoriously hard to approximate for non-periodic molecular systems. In this work, we propose M-OFDFT, an OFDFT approach capable of solving molecular systems using a deep-learning functional model. We build the essential nonlocality into the model, which is made affordable by the concise density representation as expansion coefficients under an atomic basis. With techniques to address unconventional learning challenges therein, M-OFDFT achieves accuracy comparable to Kohn-Sham DFT on a wide range of molecules untouched by OFDFT before. More attractively, M-OFDFT extrapolates well to molecules much larger than those in training, which unleashes the appealing scaling for studying large molecules including proteins, representing an advancement of the accuracy-efficiency trade-off frontier in quantum chemistry.
Controller Synthesis of Collaborative Signal Temporal Logic Tasks for Multi-Agent Systems via Assume-Guarantee Contracts
Liu, Siyuan, Saoud, Adnane, Dimarogonas, Dimos V.
This paper considers the problem of controller synthesis of signal temporal logic (STL) specifications for large-scale multi-agent systems, where the agents are dynamically coupled and subject to collaborative tasks. A compositional framework based on continuous-time assume-guarantee contracts is developed to break the complex and large synthesis problem into subproblems of manageable sizes. We first show how to formulate the collaborative STL tasks as assume-guarantee contracts by leveraging the idea of funnel-based control. The concept of contracts is used to establish our compositionality result, which allows us to guarantee the satisfaction of a global contract by the multi-agent system when all agents satisfy their local contracts. Then, a closed-form continuous-time feedback controller is designed to enforce local contracts over the agents in a distributed manner, which further guarantees the global task satisfaction based on the compositionality result. Finally, the effectiveness of our results is demonstrated by two numerical examples.
Addressing Class Variable Imbalance in Federated Semi-supervised Learning
Dong, Zehui, Liu, Wenjing, Liu, Siyuan, Chen, Xingzhi
Federated Semi-supervised Learning (FSSL) combines techniques from federated and semi-supervised learning to improve the accuracy and performance of models in a distributed environment by using a small fraction of labeled data and a large amount of unlabeled data. Without the need to centralize all data in one place for training, it collects model updates after devices train models locally, and thus can protect the privacy of user data. However, during the federated training process, some devices fail to collect enough data for local training, while new devices are added to the group training. This leads to an unbalanced global data distribution and thus affects the performance of the global model training. Most current research focuses on class imbalance with a fixed number of classes, while little attention is paid to data imbalance with a variable number of classes. Therefore, in this paper, we propose Federated Semi-supervised Learning for Class Variable Imbalance (FCVI) to solve class variable imbalance. The class-variable learning algorithm is used to mitigate the data imbalance caused by changes in the number of classes. Our scheme is shown to perform significantly better than baseline methods while maintaining client privacy.
Brain Tissue Segmentation Across the Human Lifespan via Supervised Contrastive Learning
Chen, Xiaoyang, Wu, Jinjian, Lyu, Wenjiao, Zou, Yicheng, Thung, Kim-Han, Liu, Siyuan, Wu, Ye, Ahmad, Sahar, Yap, Pew-Thian
Automatic segmentation of brain MR images into white matter (WM), gray matter (GM), and cerebrospinal fluid (CSF) is critical for tissue volumetric analysis and cortical surface reconstruction. Due to dramatic structural and appearance changes associated with developmental and aging processes, existing brain tissue segmentation methods are only viable for specific age groups. Consequently, methods developed for one age group may fail for another. In this paper, we make the first attempt to segment brain tissues across the entire human lifespan (0-100 years of age) using a unified deep learning model. To overcome the challenges related to structural variability underpinned by biological processes, intensity inhomogeneity, motion artifacts, scanner-induced differences, and acquisition protocols, we propose to use contrastive learning to improve the quality of feature representations in a latent space for effective lifespan tissue segmentation. We compared our approach with commonly used segmentation methods on a large-scale dataset of 2,464 MR images. Experimental results show that our model accurately segments brain tissues across the lifespan and outperforms existing methods.