Lim, Jaechang
C3Net: interatomic potential neural network for prediction of physicochemical properties in heterogenous systems
Lee, Sehan, Lim, Jaechang, Kim, Woo Youn
Understanding the interactions of a solute with its environment is of fundamental importance in chemistry and biology. In this work, we propose a deep neural network architecture for embedding atom types in their molecular context and for an interatomic potential that follows fundamental physical laws. The architecture is applied to predict physicochemical properties in heterogeneous systems, including solvation in diverse solvents, 1-octanol-water partitioning, and PAMPA, with a single set of network weights. We show that our architecture generalizes well across these physicochemical properties and outperforms state-of-the-art approaches based on quantum mechanics and neural networks in the task of solvation free energy prediction. The interatomic potentials at each atom in a solute obtained from the model allow quantitative analysis of the physicochemical properties at atomic resolution, consistent with chemical and physical reasoning. The software is available at https://github.com/SehanLee/C3Net.
PIGNet2: A Versatile Deep Learning-based Protein-Ligand Interaction Prediction Model for Binding Affinity Scoring and Virtual Screening
Moon, Seokhyun, Hwang, Sang-Yeon, Lim, Jaechang, Kim, Woo Youn
Prediction of protein-ligand interactions (PLI) plays a crucial role in drug discovery as it guides the identification and optimization of molecules that effectively bind to target proteins. Despite remarkable advances in deep learning-based PLI prediction, the development of a versatile model capable of accurately scoring binding affinity and conducting efficient virtual screening remains a challenge. The main obstacle is the scarcity of experimental structure-affinity data, which limits the generalization ability of existing models. Here, we address this challenge by introducing a novel data augmentation strategy combined with a physics-informed graph neural network. The model showed significant improvements in both scoring and screening, outperforming task-specific deep learning models in various tests including derivative benchmarks, and notably achieving results comparable to the state-of-the-art performance based on distance likelihood learning. This demonstrates the potential of this approach for drug discovery.
Scaffold-based molecular design using graph generative model
Lim, Jaechang, Hwang, Sang-Yeon, Kim, Seungsu, Moon, Seokhyun, Kim, Woo Youn
Searching for new molecules in areas like drug discovery often starts from the core structures of candidate molecules, which are then modified to optimize the properties of interest. This practice calls for a strategy of designing molecules that retain a particular scaffold as a substructure. To this end, our present work proposes a scaffold-based molecular generative model. The model generates molecular graphs by extending the graph of a scaffold through sequential additions of vertices and edges. In contrast to previous related models, our model guarantees that the generated molecules retain the given scaffold. Our evaluation of the model using unseen scaffolds showed that the validity, uniqueness, and novelty of the generated molecules were as high as those obtained with seen scaffolds. This confirms that the model can generalize the learned chemical rules of adding atoms and bonds rather than simply memorizing the mapping from scaffolds to molecules during learning. Furthermore, despite the constraint of fixing core structures, our model could simultaneously control multiple molecular properties when generating new molecules.
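The add-only growth described in this abstract can be illustrated with a minimal sketch. It is not the authors' model: a random choice stands in for the learned policy, no valence or chemistry checks are made, and the names `extend_scaffold` and `retains_scaffold` are hypothetical. It only shows why a graph grown purely by adding vertices and edges keeps the scaffold as a subgraph by construction.

```python
import random


def extend_scaffold(scaffold_atoms, scaffold_bonds, n_steps, seed=0):
    """Grow a graph from a non-empty scaffold using add-only operations.

    A random policy replaces the learned model (an assumption of this sketch).
    """
    rng = random.Random(seed)
    atoms = list(scaffold_atoms)                      # index -> element symbol
    bonds = {frozenset(b) for b in scaffold_bonds}    # undirected edges
    for _ in range(n_steps):
        if rng.random() < 0.7 or len(atoms) < 2:
            # Add a new vertex and bond it to a randomly chosen existing atom.
            anchor = rng.randrange(len(atoms))
            atoms.append(rng.choice(["C", "N", "O"]))
            bonds.add(frozenset((anchor, len(atoms) - 1)))
        else:
            # Add an edge between two distinct existing atoms.
            i, j = rng.sample(range(len(atoms)), 2)
            bonds.add(frozenset((i, j)))
    return atoms, bonds


def retains_scaffold(atoms, bonds, scaffold_atoms, scaffold_bonds):
    """Check that the scaffold atoms and bonds are still present, unchanged."""
    same_atoms = atoms[:len(scaffold_atoms)] == list(scaffold_atoms)
    same_bonds = all(frozenset(b) in bonds for b in scaffold_bonds)
    return same_atoms and same_bonds


# A six-membered carbon ring as an example scaffold.
ring_atoms = ["C"] * 6
ring_bonds = [(i, (i + 1) % 6) for i in range(6)]
new_atoms, new_bonds = extend_scaffold(ring_atoms, ring_bonds, n_steps=5)
print(retains_scaffold(new_atoms, new_bonds, ring_atoms, ring_bonds))
```

Because nothing is ever deleted or relabeled, the check holds for any seed and any number of steps; a real model would additionally enforce chemical validity at each addition.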
Predicting drug-target interaction using 3D structure-embedded graph representations from graph neural networks
Lim, Jaechang, Ryu, Seongok, Park, Kyubyong, Choe, Yo Joong, Ham, Jiyeon, Kim, Woo Youn
Accurate prediction of drug-target interaction (DTI) is essential for in silico drug design. For this purpose, we propose a novel approach for predicting DTI using a graph neural network (GNN) that directly incorporates the 3D structure of a protein-ligand complex. We also apply a distance-aware graph attention algorithm with gate augmentation to increase the performance of our model. As a result, our model shows better performance than docking and other deep learning methods for both virtual screening and pose prediction. In addition, our model can reproduce the natural population distribution of active molecules and inactive molecules.
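The two ingredients named in this abstract, distance awareness and gating, can be sketched as follows. This is an illustration under stated assumptions, not the published implementation: the Gaussian-shaped distance weight, the cutoff, the parameter values, and the function names are all hypothetical, and the gate weights would be learned in a real model.

```python
import math


def distance_weight(d, mu=4.0, sigma=1.0, cutoff=5.0):
    """Map an interatomic distance to an edge weight (assumed Gaussian form).

    Pairs beyond the cutoff get no edge. mu, sigma and cutoff are
    illustrative values, not taken from the paper.
    """
    if d > cutoff:
        return 0.0
    return math.exp(-((d - mu) ** 2) / sigma)


def gated_update(h_old, h_new, gate_weights, gate_bias=0.0):
    """Blend old and new node features with a scalar sigmoid gate.

    The gate is computed from the concatenation [h_old, h_new].
    """
    concat = list(h_old) + list(h_new)
    logit = sum(w * x for w, x in zip(gate_weights, concat)) + gate_bias
    g = 1.0 / (1.0 + math.exp(-logit))
    blended = [g * a + (1.0 - g) * b for a, b in zip(h_old, h_new)]
    return blended, g


print(distance_weight(4.0), distance_weight(6.0))
print(gated_update([1.0, 0.0], [0.0, 1.0], gate_weights=[0.0] * 4))
```

The gate lets each node decide how much of its previous representation to keep, which is one common way to stabilize stacked attention layers.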
Molecular generative model based on conditional variational autoencoder for de novo molecular design
Lim, Jaechang, Ryu, Seongok, Kim, Jin Woo, Kim, Woo Youn
We propose a molecular generative model based on the conditional variational autoencoder for de novo molecular design. It is specialized to control multiple molecular properties simultaneously by imposing them on a latent space. As a proof of concept, we demonstrate that it can be used to generate drug-like molecules with five target properties. We were also able to adjust a single property without changing the others and to manipulate it beyond the range of the dataset.
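A minimal sketch of the conditioning idea follows, assuming the property vector is concatenated to the latent vector before decoding (a common conditional VAE choice; the abstract does not spell out the mechanism). The function names and the five placeholder property values are hypothetical, and no encoder or decoder network is included.

```python
import math
import random


def sample_latent(mu, log_var, rng):
    """Reparameterization: z = mu + sigma * eps, with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def decoder_input(z, condition):
    """Concatenate the latent vector with the target-property vector."""
    return list(z) + list(condition)


def with_property(condition, index, value):
    """Return a copy of the condition vector with one property changed."""
    changed = list(condition)
    changed[index] = value
    return changed


rng = random.Random(0)
z = sample_latent(mu=[0.0, 0.0], log_var=[0.0, 0.0], rng=rng)
props = [0.1, 0.1, 0.1, 0.1, 0.1]          # five normalized target properties
base = decoder_input(z, props)
shifted = decoder_input(z, with_property(props, 2, 0.9))
# Only the third property differs; the latent part is identical.
print([i for i, (a, b) in enumerate(zip(base, shifted)) if a != b])
```

Keeping `z` fixed while editing one entry of the condition vector mirrors the abstract's claim of adjusting a single property without changing the others.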
Deeply learning molecular structure-property relationships using graph attention neural network
Ryu, Seongok, Lim, Jaechang, Kim, Woo Youn
Molecular structure-property relationships are the key to molecular engineering for materials and drug discovery. The rise of deep learning offers a new viable way to elucidate structure-property relationships directly from chemical data. Here we show that graph attention networks can greatly improve the performance of deep learning for chemistry. The attention mechanism makes it possible to distinguish atoms in different environments and thus to extract the important structural features that determine target properties. We demonstrated that our model can detect appropriate features for molecular polarity, solubility, and energy. Interestingly, it identified two distinct parts of molecules as essential structural features for high photovoltaic efficiency, which coincided with the regions of the donor and acceptor orbitals in charge-transfer excitations, respectively. As a result, it could accurately predict molecular properties. Moreover, the resultant latent space was well organized such that molecules with similar properties were closely located, which is critical for successful molecular engineering.
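How attention can weight atoms by their environment is sketched below. This is a simplified illustration, not the paper's architecture: it uses an unparameterized dot-product score with a softmax over each atom's neighbors (plus itself), whereas a trained graph attention network would use learned weight matrices. The name `attention_update` and the toy features are hypothetical.

```python
import math


def attention_update(features, neighbors):
    """One attention-weighted aggregation step over a molecular graph.

    features:  list of per-atom feature vectors
    neighbors: list of neighbor index lists, one per atom
    Returns the updated features and the attention weights per atom.
    """
    updated, weights = [], []
    for i, h_i in enumerate(features):
        nbrs = list(neighbors[i]) + [i]               # include a self-loop
        scores = [sum(a * b for a, b in zip(h_i, features[j])) for j in nbrs]
        top = max(scores)                              # for numerical stability
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        alpha = [e / total for e in exps]              # softmax over neighbors
        new_h = [sum(alpha[k] * features[j][d] for k, j in enumerate(nbrs))
                 for d in range(len(h_i))]
        updated.append(new_h)
        weights.append(dict(zip(nbrs, alpha)))
    return updated, weights


# Three atoms in a chain: 0 - 1 - 2.
feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
nbrs = [[1], [0, 2], [1]]
new_feats, attn = attention_update(feats, nbrs)
print(attn[1])
```

In this toy chain, atom 1 attends more strongly to atom 2 than to atom 0 because their features overlap more, which is the kind of environment-dependent weighting the abstract refers to.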