
DeepAtom: A Framework for Protein-Ligand Binding Affinity Prediction

The cornerstone of computational drug design is the calculation of binding affinity between two biological counterparts, especially a chemical compound, i.e., a ligand, and a protein. Predicting the strength of protein-ligand binding with reasonable accuracy is critical for drug discovery. In this paper, we propose a data-driven framework named DeepAtom to accurately predict protein-ligand binding affinity. With a 3D Convolutional Neural Network (3D-CNN) architecture, DeepAtom can automatically extract binding-related atomic interaction patterns from the voxelized complex structure. Compared with other CNN-based approaches, our lightweight model design effectively improves the model's representational capacity, even with the limited training data available. With validation experiments on the PDBbind v.2016 benchmark and the independent Astex Diverse Set, we demonstrate that DeepAtom, which depends less on feature engineering, consistently outperforms the other state-of-the-art scoring methods. We also compile and propose a new benchmark dataset to further improve model performance. With the new dataset as training input, DeepAtom achieves Pearson's R=0.83 and RMSE=1.23 pK units on the PDBbind v.2016 core set. These promising results demonstrate that DeepAtom models can potentially be adopted in computational drug development protocols such as molecular docking and virtual screening.
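
The two metrics quoted in this abstract, Pearson's R and RMSE in pK units, can be computed as in the minimal sketch below. The function names and the sample affinity values are hypothetical illustrations, not DeepAtom's evaluation code or data.

```python
import math


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    # Covariance and variances are left unnormalised; the 1/n factors cancel.
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


def rmse(y_true, y_pred):
    """Root-mean-square error, in the same units as the inputs."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


if __name__ == "__main__":
    # Made-up experimental and predicted affinities in pK units.
    experimental = [4.0, 5.5, 6.2, 7.8, 9.1]
    predicted = [4.6, 5.0, 6.9, 7.1, 8.5]
    print("Pearson R:", round(pearson_r(experimental, predicted), 3))
    print("RMSE (pK):", round(rmse(experimental, predicted), 3))
```

Pearson's R measures how well the predictions track the ranking and linear trend of the experimental values, while RMSE measures the absolute error, so the two are usually reported together.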

Learning from Docked Ligands: Ligand-Based Features Rescue Structure-Based Scoring Functions When Trained on Docked Poses


Machine learning scoring functions for protein–ligand binding affinity have been found to consistently outperform classical scoring functions when trained and tested on crystal structures of bound protein–ligand complexes. However, it is less clear how these methods perform when applied to docked poses of complexes. We explore how the use of docked rather than crystallographic poses for both training and testing affects the performance of machine learning scoring functions. Using the PDBbind Core Sets as benchmarks, we show that the performance of a structure-based machine learning scoring function trained and tested on docked poses is lower than that of the same scoring function trained and tested on crystallographic poses. We construct a hybrid scoring function by combining both structure-based and ligand-based features, and show that its ability to predict binding affinity using docked poses is comparable to that of purely structure-based scoring functions trained and tested on crystal poses.

APObind: A Dataset of Ligand Unbound Protein Conformations for Machine Learning Applications in De Novo Drug Design

... methods that perform important tasks related to drug design such as receptor binding site detection, small molecule docking and binding affinity prediction. However, these methods are usually trained on only ligand-bound (or holo) conformations of the protein and therefore are not guaranteed to perform well when the protein structure is in its native unbound conformation (or apo), which is usually the conformation available for a newly identified receptor. A primary reason for this is that the local structure of the binding site usually changes upon ligand binding. To facilitate solutions for this problem, we propose a dataset called APObind that aims to provide apo conformations of proteins present in the PDBbind dataset, a popular dataset used in drug design. Furthermore, ...

A drawback of these methods, however, is that they tend not to generalise well to data that does not resemble the data distribution used for training. The viability of such models therefore depends on well-curated training data that translates well into real-world applications. Deep learning models pertaining to SBDD workflows are usually trained on datasets containing 3D structures of protein-ligand complexes (Batool et al., 2019). PDBbind (Wang et al., 2005) is a predominantly used dataset that provides experimental binding affinity values for protein-ligand co-crystal structures present in the Protein Data Bank (PDB) (Berman et al., 2000). Deep learning architectures usually use voxelized (Jiménez et al., 2018) or graph-like representations (Son & Kim, 2021) of the 3D structures present in PDBbind for computation to get benchmark performances.

Development and evaluation of a deep learning model for protein-ligand binding affinity prediction

Structure-based ligand discovery is one of the most successful approaches for augmenting the drug discovery process. Currently, there is a notable shift towards machine learning (ML) methodologies to aid such procedures. Deep learning has recently gained considerable attention as it allows the model to "learn" to extract features that are relevant for the task at hand. We have developed a novel deep neural network that estimates the binding affinity of ligand-receptor complexes. The complex is represented as a 3D grid, and the model uses 3D convolution to produce a feature map of this representation, treating the atoms of both proteins and ligands in the same manner. Our network was tested on the CASF "scoring power" benchmark and the Astex Diverse Set and outperformed classical scoring functions. The model, together with usage instructions and examples, is available as a git repository at
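
The 3D grid representation mentioned in this abstract can be illustrated with the minimal sketch below, which bins atoms into voxels with one channel per element. The `voxelize` function, the 20 Å box, the 1 Å resolution, the three-element channel list, and the sparse dictionary layout are all assumptions made for illustration; they are not the paper's actual featurization.

```python
import math


def voxelize(atoms, box_size=20.0, resolution=1.0, channels=("C", "N", "O")):
    """Bin atoms into a cubic grid centred at the origin.

    atoms: iterable of (element, x, y, z) tuples in angstroms. Protein and
    ligand atoms go through the same code path, mirroring the idea of
    treating both in the same manner.
    Returns (grid, n): grid is a sparse dict mapping
    (channel_index, i, j, k) -> atom count, and n is the voxel count per axis.
    """
    n = int(round(box_size / resolution))
    half = box_size / 2.0
    grid = {}
    for element, x, y, z in atoms:
        if element not in channels:
            continue  # elements without a channel are ignored in this sketch
        idx = tuple(int(math.floor((c + half) / resolution)) for c in (x, y, z))
        if all(0 <= i < n for i in idx):  # drop atoms outside the box
            key = (channels.index(element),) + idx
            grid[key] = grid.get(key, 0) + 1
    return grid, n


if __name__ == "__main__":
    # Hypothetical coordinates, already centred on the binding site.
    complex_atoms = [("C", 0.0, 0.0, 0.0), ("O", 1.2, 0.3, -0.4), ("N", -3.1, 2.0, 5.5)]
    grid, n = voxelize(complex_atoms)
    print(n, sorted(grid.items()))
```

A 3D-CNN would normally consume a dense tensor of shape (channels, n, n, n); the sparse dictionary here only keeps the example dependency-free.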

InteractionNet: Modeling and Explaining of Noncovalent Protein-Ligand Interactions with Noncovalent Graph Neural Network and Layer-Wise Relevance Propagation

Expanding the scope of graph-based, deep-learning models to noncovalent protein-ligand interactions has earned increasing attention in structure-based drug design. Modeling protein-ligand interactions with graph neural networks (GNNs) has been hampered by difficulties in converting protein-ligand complex structures into a graph representation, and has left open the question of whether the trained models properly learn the appropriate noncovalent interactions. Here, we proposed a GNN architecture, denoted InteractionNet, which learns two separate molecular graphs, one covalent and one noncovalent, through distinct convolution layers. We also analyzed the InteractionNet model with an explainability technique, layer-wise relevance propagation, to examine the chemical relevance of the model's predictions. Separating the covalent and noncovalent convolutional steps made it possible to evaluate the contribution of each step independently and to analyze the graph-building strategy for noncovalent interactions. We applied InteractionNet to the prediction of protein-ligand binding affinity and showed that our model successfully captured the noncovalent interactions, in terms of both predictive performance and the chemical relevance of its interpretation.
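
The idea of keeping covalent and noncovalent connectivity in two separate graphs can be sketched as below. This is a minimal illustration under stated assumptions: the `build_graphs` function, the explicit bond list, and the 5 Å protein-ligand distance cutoff are hypothetical choices, not the graph-building strategy the paper actually uses.

```python
import math


def build_graphs(coords, bonds, is_ligand, cutoff=5.0):
    """Build two adjacency matrices over the same set of atoms.

    coords:    list of (x, y, z) positions in angstroms
    bonds:     list of (i, j) index pairs for covalent bonds
    is_ligand: list of booleans, True for ligand atoms, False for protein atoms
    cutoff:    assumed distance threshold for a noncovalent edge
    """
    n = len(coords)
    covalent = [[0] * n for _ in range(n)]
    noncovalent = [[0] * n for _ in range(n)]
    for i, j in bonds:
        covalent[i][j] = covalent[j][i] = 1
    for i in range(n):
        for j in range(i + 1, n):
            # Assumption: only protein-ligand pairs within the cutoff are linked.
            crosses = is_ligand[i] != is_ligand[j]
            if crosses and math.dist(coords[i], coords[j]) <= cutoff:
                noncovalent[i][j] = noncovalent[j][i] = 1
    return covalent, noncovalent


if __name__ == "__main__":
    # Hypothetical four-atom system: two ligand atoms, two protein atoms.
    coords = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (4.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
    cov, noncov = build_graphs(coords, [(0, 1), (2, 3)], [True, True, False, False])
    print(cov)
    print(noncov)
```

Each matrix could then feed its own convolution layers, which is what allows the contribution of each step to be examined independently.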