AITopics | Liu, Tie-Yan

Collaborating Authors

Liu, Tie-Yan

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

NatureLM: Deciphering the Language of Nature for Scientific Discovery

Xia, Yingce, Jin, Peiran, Xie, Shufang, He, Liang, Cao, Chuan, Luo, Renqian, Liu, Guoqing, Wang, Yue, Liu, Zequn, Chen, Yuan-Jyue, Guo, Zekun, Bai, Yeqi, Deng, Pan, Min, Yaosen, Lu, Ziheng, Hao, Hongxia, Yang, Han, Li, Jielan, Liu, Chang, Zhang, Jia, Zhu, Jianwei, Wu, Kehan, Zhang, Wei, Gao, Kaiyuan, Pei, Qizhi, Wang, Qian, Liu, Xixian, Li, Yanting, Zhu, Houtian, Lu, Yeqing, Ma, Mingqian, Wang, Zun, Xie, Tian, Maziarz, Krzysztof, Segler, Marwin, Yang, Zhao, Chen, Zilong, Shi, Yu, Zheng, Shuxin, Wu, Lijun, Hu, Chen, Dai, Peggy, Liu, Tie-Yan, Liu, Haiguang, Qin, Tao

arXiv.org Artificial IntelligenceFeb-11-2025

Foundation models have revolutionized natural language processing and artificial intelligence, significantly enhancing how machines comprehend and generate human languages. Inspired by the success of these foundation models, researchers have developed foundation models for individual scientific domains, including small molecules, materials, proteins, DNA, and RNA. However, these models are typically trained in isolation, lacking the ability to integrate across different scientific domains. Recognizing that entities within these domains can all be represented as sequences, which together form the "language of nature", we introduce Nature Language Model (briefly, NatureLM), a sequence-based science foundation model designed for scientific discovery. Pre-trained with data from multiple scientific domains, NatureLM offers a unified, versatile model that enables various applications including: (i) generating and optimizing small molecules, proteins, RNA, and materials using text instructions; (ii) cross-domain generation/design, such as protein-to-molecule and protein-to-RNA generation; and (iii) achieving state-of-the-art performance in tasks like SMILES-to-IUPAC translation and retrosynthesis on USPTO-50k. NatureLM offers a promising generalist approach for various scientific tasks, including drug discovery (hit generation/optimization, ADMET optimization, synthesis), novel material design, and the development of therapeutic proteins or nucleotides. We have developed NatureLM models in different sizes (1 billion, 8 billion, and 46.7 billion parameters) and observed a clear improvement in performance as the model size increases.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2502.07527

Country: North America > United States (1.00)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.67)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Media (0.92)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Scientific Discovery (0.84)

Add feedback

Bridging Geometric States via Geometric Diffusion Bridge

Luo, Shengjie, Xu, Yixian, He, Di, Zheng, Shuxin, Liu, Tie-Yan, Wang, Liwei

arXiv.org Machine LearningOct-31-2024

The accurate prediction of geometric state evolution in complex systems is critical for advancing scientific domains such as quantum chemistry and material modeling. Traditional experimental and computational methods face challenges in terms of environmental constraints and computational demands, while current deep learning approaches still fall short in terms of precision and generality. In this work, we introduce the Geometric Diffusion Bridge (GDB), a novel generative modeling framework that accurately bridges initial and target geometric states. GDB leverages a probabilistic approach to evolve geometric state distributions, employing an equivariant diffusion bridge derived by a modified version of Doob's $h$-transform for connecting geometric states. This tailored diffusion process is anchored by initial and target geometric states as fixed endpoints and governed by equivariant transition kernels. Moreover, trajectory data can be seamlessly leveraged in our GDB framework by using a chain of equivariant diffusion bridges, providing a more detailed and accurate characterization of evolution dynamics. Theoretically, we conduct a thorough examination to confirm our framework's ability to preserve joint distributions of geometric states and capability to completely model the underlying dynamics inducing trajectory distributions with negligible error. Experimental evaluations across various real-world scenarios show that GDB surpasses existing state-of-the-art approaches, opening up a new pathway for accurately bridging geometric states and tackling crucial scientific challenges with improved accuracy and applicability.

artificial intelligence, geometric state, machine learning, (13 more...)

arXiv.org Machine Learning

2410.2422

Country:

North America > United States (0.45)
Europe > United Kingdom > England (0.14)

Genre:

Research Report > Promising Solution (0.34)
Overview > Innovation (0.34)

Industry: Energy (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

SFM-Protein: Integrative Co-evolutionary Pre-training for Advanced Protein Sequence Representation

He, Liang, Jin, Peiran, Min, Yaosen, Xie, Shufang, Wu, Lijun, Qin, Tao, Liang, Xiaozhuan, Gao, Kaiyuan, Jiang, Yuliang, Liu, Tie-Yan

arXiv.org Artificial IntelligenceOct-31-2024

Proteins, essential to biological systems, perform functions intricately linked to their three-dimensional structures. Understanding the relationship between protein structures and their amino acid sequences remains a core challenge in protein modeling. While traditional protein foundation models benefit from pre-training on vast unlabeled datasets, they often struggle to capture critical co-evolutionary information, which evolutionary-based methods excel at. In this study, we introduce a novel pre-training strategy for protein foundation models that emphasizes the interactions among amino acid residues to enhance the extraction of both short-range and long-range co-evolutionary features from sequence data. Trained on a large-scale protein sequence dataset, our model demonstrates superior generalization ability, outperforming established baselines of similar size, including the ESM model, across diverse downstream tasks. Experimental results confirm the model's effectiveness in integrating co-evolutionary information, marking a significant step forward in protein sequence-based modeling.

artificial intelligence, machine learning, sequence, (15 more...)

arXiv.org Artificial Intelligence

2410.24022

Genre: Research Report > New Finding (0.68)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Physical Consistency Bridges Heterogeneous Data in Molecular Multi-Task Learning

Ren, Yuxuan, Zheng, Dihan, Liu, Chang, Jin, Peiran, Shi, Yu, Huang, Lin, He, Jiyan, Luo, Shengjie, Qin, Tao, Liu, Tie-Yan

arXiv.org Artificial IntelligenceOct-13-2024

In recent years, machine learning has demonstrated impressive capability in handling molecular science tasks. To support various molecular properties at scale, machine learning models are trained in the multi-task learning paradigm. Nevertheless, data of different molecular properties are often not aligned: some quantities, e.g. equilibrium structure, demand more cost to compute than others, e.g. energy, so their data are often generated by cheaper computational methods at the cost of lower accuracy, which cannot be directly overcome through multi-task learning. Moreover, it is not straightforward to leverage abundant data of other tasks to benefit a particular task. To handle such data heterogeneity challenges, we exploit the specialty of molecular tasks that there are physical laws connecting them, and design consistency training approaches that allow different tasks to exchange information directly so as to improve one another. Particularly, we demonstrate that the more accurate energy data can improve the accuracy of structure prediction. We also find that consistency training can directly leverage force and off-equilibrium structure data to improve structure prediction, demonstrating a broad capability for integrating heterogeneous data.

artificial intelligence, machine learning, prediction, (15 more...)

arXiv.org Artificial Intelligence

2410.10118

Country: North America > United States (0.28)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Self-Consistency Training for Density-Functional-Theory Hamiltonian Prediction

Zhang, He, Liu, Chang, Wang, Zun, Wei, Xinran, Liu, Siyuan, Zheng, Nanning, Shao, Bin, Liu, Tie-Yan

arXiv.org Artificial IntelligenceJun-5-2024

Predicting the mean-field Hamiltonian matrix in density functional theory is a fundamental formulation to leverage machine learning for solving molecular science problems. Yet, its applicability is limited by insufficient labeled data for training. In this work, we highlight that Hamiltonian prediction possesses a self-consistency principle, based on which we propose self-consistency training, an exact training method that does not require labeled data. It distinguishes the task from predicting other molecular properties by the following benefits: (1) it enables the model to be trained on a large amount of unlabeled data, hence addresses the data scarcity challenge and enhances generalization; (2) it is more efficient than running DFT to generate labels for supervised training, since it amortizes DFT calculation over a set of queries. We empirically demonstrate the better generalization in data-scarce and out-of-distribution scenarios, and the better efficiency over DFT labeling. These benefits push forward the applicability of Hamiltonian prediction to an ever-larger scale.

artificial intelligence, machine learning, self-consistency training, (18 more...)

arXiv.org Artificial Intelligence

2403.0956

Country:

North America > United States (0.45)
Asia > China (0.28)
Europe > Austria > Vienna (0.14)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine (0.46)
Government > Regional Government > North America Government > United States Government (0.45)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

LordNet: An Efficient Neural Network for Learning to Solve Parametric Partial Differential Equations without Simulated Data

Huang, Xinquan, Shi, Wenlei, Gao, Xiaotian, Wei, Xinran, Zhang, Jia, Bian, Jiang, Yang, Mao, Liu, Tie-Yan

arXiv.org Artificial IntelligenceMay-7-2024

Neural operators, as a powerful approximation to the non-linear operators between infinite-dimensional function spaces, have proved to be promising in accelerating the solution of partial differential equations (PDE). However, it requires a large amount of simulated data, which can be costly to collect. This can be avoided by learning physics from the physics-constrained loss, which we refer to it as mean squared residual (MSR) loss constructed by the discretized PDE. We investigate the physical information in the MSR loss, which we called long-range entanglements, and identify the challenge that the neural network requires the capacity to model the long-range entanglements in the spatial domain of the PDE, whose patterns vary in different PDEs. To tackle the challenge, we propose LordNet, a tunable and efficient neural network for modeling various entanglements. Inspired by the traditional solvers, LordNet models the long-range entanglements with a series of matrix multiplications, which can be seen as the low-rank approximation to the general fully-connected layers and extracts the dominant pattern with reduced computational cost. The experiments on solving Poisson's equation and (2D and 3D) Navier-Stokes equation demonstrate that the long-range entanglements from the MSR loss can be well modeled by the LordNet, yielding better accuracy and generalization ability than other neural networks. The results show that the Lordnet can be $40\times$ faster than traditional PDE solvers. In addition, LordNet outperforms other modern neural network architectures in accuracy and efficiency with the smallest parameter size.

artificial intelligence, deep learning, machine learning, (17 more...)

arXiv.org Artificial Intelligence

doi: 10.1016/j.neunet.2024.106354

2206.09418

Genre: Research Report > New Finding (0.48)

Industry: Energy > Oil & Gas > Upstream (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

FABind: Fast and Accurate Protein-Ligand Binding

Pei, Qizhi, Gao, Kaiyuan, Wu, Lijun, Zhu, Jinhua, Xia, Yingce, Xie, Shufang, Qin, Tao, He, Kun, Liu, Tie-Yan, Yan, Rui

arXiv.org Artificial IntelligenceJan-8-2024

Modeling the interaction between proteins and ligands and accurately predicting their binding structures is a critical yet challenging task in drug discovery. Recent advancements in deep learning have shown promise in addressing this challenge, with sampling-based and regression-based methods emerging as two prominent approaches. However, these methods have notable limitations. Sampling-based methods often suffer from low efficiency due to the need for generating multiple candidate structures for selection. On the other hand, regression-based methods offer fast predictions but may experience decreased accuracy. Additionally, the variation in protein sizes often requires external modules for selecting suitable binding pockets, further impacting efficiency. In this work, we propose $\mathbf{FABind}$, an end-to-end model that combines pocket prediction and docking to achieve accurate and fast protein-ligand binding. $\mathbf{FABind}$ incorporates a unique ligand-informed pocket prediction module, which is also leveraged for docking pose estimation. The model further enhances the docking process by incrementally integrating the predicted pocket to optimize protein-ligand binding, reducing discrepancies between training and inference. Through extensive experiments on benchmark datasets, our proposed $\mathbf{FABind}$ demonstrates strong advantages in terms of effectiveness and efficiency compared to existing methods. Our code is available at https://github.com/QizhiPei/FABind

artificial intelligence, machine learning, prediction, (17 more...)

arXiv.org Artificial Intelligence

2310.06763

Country:

Asia > China (0.14)
North America > United States (0.14)

Genre: Research Report (0.82)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)

Add feedback

SoftCorrect: Error Correction with Soft Detection for Automatic Speech Recognition

Leng, Yichong, Tan, Xu, Liu, Wenjie, Song, Kaitao, Wang, Rui, Li, Xiang-Yang, Qin, Tao, Lin, Edward, Liu, Tie-Yan

arXiv.org Artificial IntelligenceDec-20-2023

Error correction in automatic speech recognition (ASR) aims to correct those incorrect words in sentences generated by ASR models. Since recent ASR models usually have low word error rate (WER), to avoid affecting originally correct tokens, error correction models should only modify incorrect words, and therefore detecting incorrect words is important for error correction. Previous works on error correction either implicitly detect error words through target-source attention or CTC (connectionist temporal classification) loss, or explicitly locate specific deletion/substitution/insertion errors. However, implicit error detection does not provide clear signal about which tokens are incorrect and explicit error detection suffers from low detection accuracy. In this paper, we propose SoftCorrect with a soft error detection mechanism to avoid the limitations of both explicit and implicit error detection. Specifically, we first detect whether a token is correct or not through a probability produced by a dedicatedly designed language model, and then design a constrained CTC loss that only duplicates the detected incorrect tokens to let the decoder focus on the correction of error tokens. Compared with implicit error detection with CTC loss, SoftCorrect provides explicit signal about which words are incorrect and thus does not need to duplicate every token but only incorrect tokens; compared with explicit error detection, SoftCorrect does not detect specific deletion/substitution/insertion errors but just leaves it to CTC loss. Experiments on AISHELL-1 and Aidatatang datasets show that SoftCorrect achieves 26.1% and 9.4% CER reduction respectively, outperforming previous works by a large margin, while still enjoying fast speed of parallel generation.

data quality, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2212.01039

Country: Asia > China (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Data Science > Data Quality > Data Cleaning (1.00)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)

Add feedback

Monte Carlo Neural PDE Solver for Learning PDEs via Probabilistic Representation

Zhang, Rui, Meng, Qi, Zhu, Rongchan, Wang, Yue, Shi, Wenlei, Zhang, Shihua, Ma, Zhi-Ming, Liu, Tie-Yan

arXiv.org Artificial IntelligenceNov-23-2023

In scenarios with limited available or high-quality data, training the function-to-function neural PDE solver in an unsupervised manner is essential. However, the efficiency and accuracy of existing methods are constrained by the properties of numerical algorithms, such as finite difference and pseudo-spectral methods, integrated during the training stage. These methods necessitate careful spatiotemporal discretization to achieve reasonable accuracy, leading to significant computational challenges and inaccurate simulations, particularly in cases with substantial spatiotemporal variations. To address these limitations, we propose the Monte Carlo Neural PDE Solver (MCNP Solver) for training unsupervised neural solvers via the PDEs' probabilistic representation, which regards macroscopic phenomena as ensembles of random particles. Compared to other unsupervised methods, MCNP Solver naturally inherits the advantages of the Monte Carlo method, which is robust against spatiotemporal variations and can tolerate coarse step size. In simulating the random walk of particles, we employ Heun's method for the convection process and calculate the expectation via the probability density function of neighbouring grid points during the diffusion process. These techniques enhance accuracy and circumvent the computational memory and time issues associated with Monte Carlo sampling, offering an improvement over traditional Monte Carlo methods. Our numerical experiments on convection-diffusion, Allen-Cahn, and Navier-Stokes equations demonstrate significant improvements in accuracy and efficiency compared to other unsupervised baselines. The source code will be publicly available at: https://github.com/optray/MCNP.

artificial intelligence, deep learning, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2302.05104

Country:

Asia > China (0.14)
Europe > Germany (0.14)

Genre: Research Report (0.81)

Technology:

Information Technology > Mathematics of Computing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

SSM-DTA: Breaking the Barriers of Data Scarcity in Drug-Target Affinity Prediction

Pei, Qizhi, Wu, Lijun, Zhu, Jinhua, Xia, Yingce, Xie, Shufang, Qin, Tao, Liu, Haiguang, Liu, Tie-Yan, Yan, Rui

arXiv.org Artificial IntelligenceOct-17-2023

Accurate prediction of Drug-Target Affinity (DTA) is of vital importance in early-stage drug discovery, facilitating the identification of drugs that can effectively interact with specific targets and regulate their activities. While wet experiments remain the most reliable method, they are time-consuming and resource-intensive, resulting in limited data availability that poses challenges for deep learning approaches. Existing methods have primarily focused on developing techniques based on the available DTA data, without adequately addressing the data scarcity issue. To overcome this challenge, we present the SSM-DTA framework, which incorporates three simple yet highly effective strategies: (1) A multi-task training approach that combines DTA prediction with masked language modeling (MLM) using paired drug-target data. (2) A semi-supervised training method that leverages large-scale unpaired molecules and proteins to enhance drug and target representations. This approach differs from previous methods that only employed molecules or proteins in pre-training. (3) The integration of a lightweight cross-attention module to improve the interaction between drugs and targets, further enhancing prediction accuracy. Through extensive experiments on benchmark datasets such as BindingDB, DAVIS, and KIBA, we demonstrate the superior performance of our framework. Additionally, we conduct case studies on specific drug-target binding activities, virtual screening experiments, drug feature visualizations, and real-world applications, all of which showcase the significant potential of our work. In conclusion, our proposed SSM-DTA framework addresses the data limitation challenge in DTA prediction and yields promising results, paving the way for more efficient and accurate drug discovery processes. Our code is available at $\href{https://github.com/QizhiPei/SSM-DTA}{Github}$.

machine learning, natural language, prediction, (20 more...)

arXiv.org Artificial Intelligence

2206.09818

Country: Asia > China > Anhui Province (0.14)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback