 protac


SE(3)-Equivariant Ternary Complex Prediction Towards Target Protein Degradation

Xue, Fanglei, Zhang, Meihan, Li, Shuqi, Gao, Xinyu, Wohlschlegel, James A., Huang, Wenbing, Yang, Yi, Deng, Weixian

arXiv.org Artificial Intelligence

Targeted protein degradation (TPD) induced by small molecules has emerged as a rapidly evolving modality in drug discovery, targeting proteins traditionally considered "undruggable". Proteolysis-targeting chimeras (PROTACs) and molecular glue degraders (MGDs) are the primary small molecules that induce TPD. Both types of molecules form a ternary complex linking an E3 ligase with a target protein, a crucial step for drug discovery. While significant advances have been made in binary structure prediction for proteins and small molecules, ternary structure prediction remains challenging due to obscure interaction mechanisms and insufficient training data. Traditional methods relying on manually assigned rules perform poorly and are computationally demanding due to extensive random sampling. In this work, we introduce DeepTernary, a novel deep learning-based approach that directly predicts ternary structures in an end-to-end manner using an encoder-decoder architecture. DeepTernary leverages an SE(3)-equivariant graph neural network (GNN) with both intra-graph and ternary inter-graph attention mechanisms to capture intricate ternary interactions from our collected high-quality training dataset, TernaryDB. The proposed query-based Pocket Points Decoder extracts the 3D structure of the final binding ternary complex from learned ternary embeddings, demonstrating state-of-the-art accuracy and speed in existing PROTAC benchmarks without prior knowledge from known PROTACs. It also achieves notable accuracy on the more challenging MGD benchmark under the blind docking protocol. Remarkably, our experiments reveal that the buried surface area calculated from predicted structures correlates with experimentally obtained degradation potency-related metrics. Consequently, DeepTernary shows potential in effectively assisting and accelerating the development of TPDs for previously undruggable targets.
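The SE(3) symmetry named in the abstract above refers to 3D rotations and translations. As a minimal illustration, not DeepTernary's code, the sketch below checks that pairwise distances between points are unchanged by a rigid motion; such invariant quantities are what equivariant networks typically build on. The coordinates and the helper names (`rotate_z`, `transform`, `pairwise_distances`) are hypothetical and chosen only for this example.

```python
import math

def rotate_z(point, angle):
    """Rotate a 3D point about the z-axis by `angle` radians."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y, z)

def transform(points, angle, shift):
    """Apply a rigid motion (rotation about z, then translation) to all points."""
    moved = []
    for p in points:
        x, y, z = rotate_z(p, angle)
        moved.append((x + shift[0], y + shift[1], z + shift[2]))
    return moved

def pairwise_distances(points):
    """Return distances for every unordered pair of points."""
    return [math.dist(a, b) for i, a in enumerate(points) for b in points[i + 1:]]

# Hypothetical toy coordinates standing in for atom positions.
coords = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 2.0, 0.5), (-1.0, 0.7, 3.0)]
moved = transform(coords, angle=0.8, shift=(4.0, -2.0, 1.0))

before = pairwise_distances(coords)
after = pairwise_distances(moved)

# Largest change in any pairwise distance; should be at floating-point noise level.
max_diff = max(abs(a - b) for a, b in zip(before, after))
print(f"max distance change after rigid motion: {max_diff:.2e}")
```

A model whose predictions depend only on such invariant features, or that transforms its output coordinates together with its input, gives the same complex regardless of how the input proteins are posed.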


A Comprehensive Review of Emerging Approaches in Machine Learning for De Novo PROTAC Design

Gharbi, Yossra, Mercado, Rocío

arXiv.org Artificial Intelligence

Targeted protein degradation (TPD) is a rapidly growing field in modern drug discovery that aims to regulate the intracellular levels of proteins by harnessing the cell's innate degradation pathways to selectively target and degrade disease-related proteins. This strategy creates new opportunities for therapeutic intervention in cases where occupancy-based inhibitors have not been successful. Proteolysis-targeting chimeras (PROTACs) are at the heart of TPD strategies, which leverage the ubiquitin-proteasome system for the selective targeting and proteasomal degradation of pathogenic proteins. As the field evolves, it becomes increasingly apparent that the traditional methodologies for designing such complex molecules have limitations. This has led to the use of machine learning (ML) and generative modeling to improve and accelerate the development process. In this review, we explore the impact of ML on de novo PROTAC design, an aspect of molecular design that has not been comprehensively reviewed despite its significance. We delve into the distinct characteristics of PROTAC linker design, underscoring the complexities required to create effective bifunctional molecules capable of TPD. We then examine how ML in the context of fragment-based drug design (FBDD), honed in the realm of small-molecule drug discovery, is paving the way for PROTAC linker design. Our review provides a critical evaluation of the limitations inherent in applying this method to the complex field of PROTAC development. Moreover, we review existing ML works applied to PROTAC design, highlighting pioneering efforts and, importantly, the limitations these studies face. By offering insights into the current state of PROTAC development and the integral role of ML in PROTAC design, we aim to provide valuable perspectives for researchers in their pursuit of better design strategies for this new modality.


An iterative refinement model for PROTAC-induced structure prediction

AIHub

This work was accepted as an oral presentation at the Generative and Experimental Perspectives for Biomolecular Design workshop at ICLR 2024. For more information, please check out our paper on arXiv. Proteins are molecular machines that carry out many of the functions required for the human body to thrive. When proteins malfunction or over-accumulate, diseases may arise. Traditional small molecule drugs are designed to inhibit these disease-causing proteins by binding to them like keys (drug) fitting into locks (protein).


PROflow: An iterative refinement model for PROTAC-induced structure prediction

Qiang, Bo, Shi, Wenxian, Song, Yuxuan, Wu, Menghua

arXiv.org Artificial Intelligence

Proteolysis targeting chimeras (PROTACs) are small molecules that trigger the breakdown of traditionally "undruggable" proteins by binding simultaneously to their targets and degradation-associated proteins. A key challenge in their rational design is understanding their structural basis of activity. Due to the lack of crystal structures (18 in the PDB), existing PROTAC docking methods have been forced to simplify the problem into a distance-constrained protein-protein docking task. To address the data issue, we develop a novel pseudo-data generation scheme that requires only binary protein-protein complexes. Its inference speed enables the large-scale screening of PROTAC designs, and computed properties of predicted structures achieve statistically significant correlations with published degradation activities. Targeted protein degradation is an emerging paradigm in rational drug design that induces the breakdown of "undruggable" proteins (Zhao et al., 2022). Proteolysis targeting chimeras (PROTACs) are small molecules that achieve this by simultaneously binding a protein of interest (POI) and a degradation-associated protein (e.g. In contrast to small molecule drugs, which attach to predefined sites on their protein targets, PROTACs operate by inducing a stable, ternary complex between themselves and two proteins which don't typically interact.


Reinforcement Learning-Driven Linker Design via Fast Attention-based Point Cloud Alignment

Neeser, Rebecca M., Akdel, Mehmet, Kovtun, Daniel, Naef, Luca

arXiv.org Artificial Intelligence

Proteolysis-Targeting Chimeras (PROTACs) represent a novel class of small molecules which are designed to act as a bridge between an E3 ligase and a disease-relevant protein, thereby promoting its subsequent degradation. PROTACs are composed of two protein binding "active" domains, linked by a "linker" domain. The design of the linker domain is challenging due to geometric and chemical constraints given by its interactions, and the need to maximize drug-likeness. To tackle these challenges, we introduce ShapeLinker, a method for de novo design of linkers. It performs fragment-linking using reinforcement learning on an autoregressive SMILES generator. The method optimizes for a composite score combining relevant physicochemical properties and a novel, attention-based point cloud alignment score. This new method successfully generates linkers that satisfy both relevant 2D and 3D requirements, and achieves state-of-the-art results in producing novel linkers assuming a target linker conformation. This allows for more rational and efficient PROTAC design and optimization. Code and data are available at https://github.com/aivant/ShapeLinker.
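To make the idea of a composite score concrete, here is a minimal sketch that combines several per-molecule scores in [0, 1] into one reward. The weighted geometric mean, the component names, and the weights are all assumptions made for illustration; the abstract does not specify how ShapeLinker aggregates its components, and the attention-based alignment score itself is replaced here by a plain number.

```python
import math

def composite_score(components, weights):
    """Weighted geometric mean of component scores in [0, 1].

    A component score of zero drives the whole composite to zero, so a
    molecule cannot make up for a failed criterion with the others.
    """
    total_weight = sum(weights.values())
    log_sum = 0.0
    for name, weight in weights.items():
        score = components[name]
        if score <= 0.0:
            return 0.0
        log_sum += weight * math.log(score)
    return math.exp(log_sum / total_weight)

# Hypothetical weights: shape alignment counts three times as much as the others.
weights = {"shape_alignment": 3.0, "drug_likeness": 1.0, "linker_length": 1.0}

# Hypothetical component scores for two candidate linkers.
good = {"shape_alignment": 0.9, "drug_likeness": 0.8, "linker_length": 0.7}
poor_shape = {"shape_alignment": 0.1, "drug_likeness": 0.8, "linker_length": 0.7}

print(f"good candidate:       {composite_score(good, weights):.3f}")
print(f"poor shape alignment: {composite_score(poor_shape, weights):.3f}")
```

In a reinforcement learning loop, a scalar of this kind would serve as the reward for each generated SMILES string, so the generator is pushed toward linkers that meet the 2D property criteria and the 3D shape criterion at the same time.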


De novo PROTAC design using graph-based deep generative models

Nori, Divya, Coley, Connor W., Mercado, Rocío

arXiv.org Artificial Intelligence

PROteolysis TArgeting Chimeras (PROTACs) are an emerging therapeutic modality for degrading a protein of interest (POI) by marking it for degradation by the proteasome. Recent developments in artificial intelligence (AI) suggest that deep generative models can assist with the de novo design of molecules with desired properties, but their application to PROTAC design remains largely unexplored. We show that a graph-based generative model can be used to propose novel PROTAC-like structures from empty graphs. Our model can be guided towards the generation of large molecules (30 to 140 heavy atoms) predicted to degrade a POI through policy-gradient reinforcement learning (RL). Rewards during RL are applied using a boosted tree surrogate model that predicts a molecule's degradation potential for each POI. Using this approach, we steer the generative model towards compounds with higher likelihoods of predicted degradation activity. Despite being trained on sparse public data, the generative model proposes molecules with substructures found in known degraders. After fine-tuning, predicted activity against a challenging POI increases from 50% to >80% with near-perfect chemical validity for sampled compounds, suggesting this is a promising approach for the optimization of large, PROTAC-like molecules for targeted protein degradation.
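The policy-gradient steering described above can be sketched on a toy problem. This is not the authors' model: the policy here is a softmax over three hypothetical candidate molecules rather than a graph generator, and the fixed `rewards` list stands in for the output of a surrogate activity predictor. The update uses the exact gradient of the expected reward with the expected reward as a baseline, which keeps the example deterministic.

```python
import math

def softmax(logits):
    """Convert logits to probabilities in a numerically stable way."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def policy_gradient_step(logits, rewards, lr):
    """One exact policy-gradient ascent step on the expected reward.

    For a softmax policy, d E[r] / d logit_k = p_k * (r_k - E[r]).
    """
    probs = softmax(logits)
    baseline = sum(p * r for p, r in zip(probs, rewards))
    return [t + lr * p * (r - baseline) for t, p, r in zip(logits, probs, rewards)]

# Hypothetical surrogate-predicted degradation scores for three candidates.
rewards = [0.2, 0.5, 0.9]

logits = [0.0, 0.0, 0.0]  # start from a uniform policy
initial = softmax(logits)
for _ in range(200):
    logits = policy_gradient_step(logits, rewards, lr=1.0)
final = softmax(logits)

print("initial probabilities:", [round(p, 3) for p in initial])
print("final probabilities:  ", [round(p, 3) for p in final])
```

The probability mass moves toward the candidate with the highest predicted score, which mirrors at small scale how a reward from a surrogate model can shift a generator toward molecules predicted to be active.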