Heterogenous Ensemble of Models for Molecular Property Prediction

Sajad Darabi, Shayan Fazeli, Jiwei Liu, Alexandre Milesi, Pawel Morkisz, Jean-François Puget, Gilberto Titericz

arXiv.org Artificial Intelligence 

The OGB Large-Scale Challenge (LSC) [Hu et al., 2021] is a machine learning (ML) challenge to predict a quantum chemical property, the HOMO-LUMO gap of small molecules. The ground truth is obtained via a density-functional theory (DFT) computation, which is known to be time-consuming and can take several hours even for small molecules. With the rapid advancement of machine learning technology, fast, accurate, GPU-accelerated ML models are a promising replacement for this expensive DFT optimization process. The PCQM4Mv2 dataset, based on the PubChemQC project [Nakata and Shimazaki, 2017], provides a well-defined ML task: predicting the HOMO-LUMO gap of molecules given their 2D molecular graphs. Each molecule has two natural views. The 2D graph captures the topological structure defined by bonds, and the 3D view provides spatial information that better reflects the geometry and spatial relations of the different bonds in the molecule.
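
To make the two views concrete, the following is a minimal sketch in plain Python, not the authors' code or the PCQM4Mv2 data format. The toy molecule (water), its approximate coordinates, and the helper names `adjacency` and `pairwise_distances` are all assumptions introduced for illustration.

```python
import math

# Hypothetical toy molecule (water). Coordinates are approximate values in
# angstroms chosen for illustration; they are not taken from PCQM4Mv2.
atoms = ["O", "H", "H"]
bonds = [(0, 1), (0, 2)]  # 2D view: which atoms are bonded
coords = [(0.0, 0.0, 0.0), (0.96, 0.0, 0.0), (-0.24, 0.93, 0.0)]  # 3D view


def adjacency(n_atoms, bonds):
    """Build an undirected adjacency list from a list of bonds (2D topology)."""
    adj = {i: [] for i in range(n_atoms)}
    for i, j in bonds:
        adj[i].append(j)
        adj[j].append(i)
    return adj


def pairwise_distances(coords):
    """Euclidean distance between every pair of atoms (3D geometry)."""
    n = len(coords)
    return [[math.dist(coords[i], coords[j]) for j in range(n)] for i in range(n)]


graph_2d = adjacency(len(atoms), bonds)
dist_3d = pairwise_distances(coords)

print(graph_2d)               # bonded neighbours of each atom
print(round(dist_3d[1][2], 2))  # H-H distance, which the bond list alone does not encode
```

The bond list says only that both hydrogens attach to the oxygen; the distance between the two hydrogens is available only from the 3D coordinates, which is the kind of spatial information the 3D view adds.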
