atom mapping
We thank all reviewers for their insightful comments and suggestions, which will be incorporated into the revised
The concurrent work G2Gs presents a similar two-step framework, but our method is more general and scalable. We have refrained from commenting on G2Gs until its conference version is available, since we have some concerns about it. This discussion will be included in our revised version. Atom mapping is optional for our method. The synthon approach also works for reactions without provided atom mapping (L208-212).
Alignment is Key for Applying Diffusion Models to Retrosynthesis
Laabid, Najwa, Rissanen, Severi, Heinonen, Markus, Solin, Arno, Garg, Vikas
Retrosynthesis, the task of identifying precursors for a given molecule, can be naturally framed as a conditional graph generation task. Diffusion models are a particularly promising modelling approach, enabling post-hoc conditioning and trading off quality for speed during generation. We show mathematically that permutation equivariant denoisers severely limit the expressiveness of graph diffusion models and thus their adaptation to retrosynthesis. To address this limitation, we relax the equivariance requirement such that it only applies to aligned permutations of the conditioning and the generated graphs obtained through atom mapping. Our new denoiser achieves the highest top-$1$ accuracy ($54.7$\%) across template-free and template-based methods on USPTO-50k. We also demonstrate the ability for flexible post-training conditioning and good sample quality with small diffusion step counts, highlighting the potential for interactive applications and additional controls for multi-step planning.
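The aligned-equivariance idea in this abstract can be illustrated with a toy example. The sketch below is not the paper's denoiser: `permute`, `toy_denoiser`, and the two small adjacency matrices are hypothetical names and data introduced here, and the sketch assumes atom mapping has already placed the product and the noisy reactant graph in the same node order. Under that assumption, relabelling both graphs with the same permutation relabels the output accordingly, while relabelling only one of them does not.

```python
from itertools import permutations


def permute(adj, perm):
    """Relabel nodes: new node i is old node perm[i]."""
    n = len(perm)
    return [[adj[perm[i]][perm[j]] for j in range(n)] for i in range(n)]


def toy_denoiser(noisy, product):
    """Hypothetical aligned denoiser: it reads the product edge at the same
    index pair (i, j) as the noisy edge, so it relies on a shared node order
    (assumed here to come from atom mapping)."""
    n = len(noisy)
    return [[0.5 * noisy[i][j] + 0.5 * product[i][j] for j in range(n)]
            for i in range(n)]


# Illustrative 3-atom graphs (made-up data, not from USPTO-50k).
product = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]  # path 0-1-2
noisy = [[0, 0, 1], [0, 0, 0], [1, 0, 0]]    # single edge 0-2
out = toy_denoiser(noisy, product)

# Aligned case: the same permutation is applied to both inputs.
aligned_ok = all(
    toy_denoiser(permute(noisy, p), permute(product, p)) == permute(out, p)
    for p in permutations(range(3))
)

# Unaligned case: only the noisy graph is relabelled.
shift = (1, 2, 0)
unaligned_ok = toy_denoiser(permute(noisy, shift), product) == permute(out, shift)

print(aligned_ok, unaligned_ok)
```

The toy function is equivariant only to joint relabelling of the two graphs, which is the relaxed requirement the abstract describes; a fully permutation-equivariant denoiser could not use the index-wise correspondence at all.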
DRACON: Disconnected Graph Neural Network for Atom Mapping in Chemical Reactions
Machine learning has solved many challenging problems in computer-assisted synthesis prediction (CASP). We formulate reaction prediction as node classification in a disconnected graph of source molecules and generalize a graph convolutional neural network to disconnected graphs. We demonstrate that our approach can successfully predict the reaction outcome and the atom mapping during a chemical transformation. A set of experiments on the USPTO dataset demonstrates excellent performance and interpretability of the proposed model. The implicitly learned latent vector representation of chemical reactions correlates strongly with the class of the chemical reaction.
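The "disconnected graph of source molecules" can be pictured as a block-diagonal adjacency matrix. The following is a minimal sketch under stated assumptions, not DRACON's architecture: `disjoint_union`, `mean_aggregate`, and the toy molecules and features are hypothetical. It shows that a plain neighbour-averaging step never passes information between separate molecules, which is one plausible reason a graph convolution would need to be generalized for this setting.

```python
def disjoint_union(adjs):
    """Block-diagonal adjacency of several molecules, no edges between blocks."""
    n = sum(len(a) for a in adjs)
    big = [[0] * n for _ in range(n)]
    offset = 0
    for a in adjs:
        for i, row in enumerate(a):
            for j, v in enumerate(row):
                big[offset + i][offset + j] = v
        offset += len(a)
    return big


def mean_aggregate(adj, feats):
    """One message-passing round: each node averages itself and its neighbours."""
    out = []
    for i, row in enumerate(adj):
        nbrs = [i] + [j for j, v in enumerate(row) if v and j != i]
        out.append(sum(feats[j] for j in nbrs) / len(nbrs))
    return out


# Illustrative inputs (made-up): a two-atom molecule and a single atom.
mol_a = [[0, 1], [1, 0]]
mol_b = [[0]]
graph = disjoint_union([mol_a, mol_b])

feats = [1.0, 3.0, 10.0]  # one scalar feature per atom
hidden = mean_aggregate(graph, feats)
print(hidden)  # [2.0, 2.0, 10.0]
```

The third atom keeps its own feature because it has no neighbours in the combined graph; a per-node classifier on `hidden` would therefore see nothing about the other molecule unless some cross-component mechanism is added.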