Generative Modeling of Full-Atom Protein Conformations using Latent Diffusion on Graph Embeddings
Aditya Sengar, Ali Hariri, Daniel Probst, Patrick Barth, Pierre Vandergheynst
arXiv.org Artificial Intelligence
Generating diverse, all-atom conformational ensembles of dynamic proteins such as G-protein-coupled receptors (GPCRs) is critical for understanding their function, yet most generative models simplify atomic detail or ignore conformational diversity altogether. We present latent diffusion for full protein generation (LD-FPG), a framework that constructs complete all-atom protein structures, including every side-chain heavy atom, directly from molecular dynamics (MD) trajectories. LD-FPG employs a Chebyshev graph neural network (ChebNet) to obtain low-dimensional latent embeddings of protein conformations, which are processed using one of three pooling strategies: blind, sequential, or residue-based. A diffusion model trained on these latent representations generates new samples, which a decoder, optionally regularized by dihedral-angle losses, maps back to Cartesian coordinates. On D2R-MD, a 2-microsecond MD trajectory (12 000 frames) of the human dopamine D2 receptor in a membrane environment, the sequential and residue-based pooling strategies reproduce the reference ensemble with high structural fidelity (all-atom lDDT of approximately 0.7; C-alpha lDDT of approximately 0.8) and recover backbone and side-chain dihedral-angle distributions with a Jensen-Shannon divergence of less than 0.03 relative to the MD data. LD-FPG thereby offers a practical route to system-specific, all-atom ensemble generation for large proteins, providing a promising tool for structure-based therapeutic design on complex, dynamic targets. The D2R-MD dataset and our implementation are freely available to facilitate further research.
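To make the dihedral-angle comparison concrete, the following is a minimal sketch of how a Jensen-Shannon divergence between two sets of dihedral angles could be computed. It is not the authors' evaluation code: the function names, the 36-bin histogram, and the base-2 logarithm (which bounds the divergence to [0, 1]) are all assumptions made for illustration, since the abstract does not state the binning or log base used.

```python
import math
import random


def angle_histogram(angles, n_bins=36):
    """Normalized histogram of angles in radians, wrapped into [-pi, pi).

    n_bins=36 (10-degree bins) is an illustrative choice, not from the paper.
    """
    counts = [0] * n_bins
    for a in angles:
        wrapped = (a + math.pi) % (2 * math.pi)  # shift into [0, 2*pi)
        idx = min(int(wrapped / (2 * math.pi) * n_bins), n_bins - 1)
        counts[idx] += 1
    total = sum(counts)
    return [c / total for c in counts]


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i = 0 contribute 0."""
    return sum(a * math.log2(a / b) for a, b in zip(p, q) if a > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: mean KL of p and q to their mixture m."""
    m = [0.5 * (a + b) for a, b in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


if __name__ == "__main__":
    # Synthetic stand-ins for reference (MD) and generated dihedral angles.
    rng = random.Random(0)
    reference = [rng.vonmisesvariate(1.0, 4.0) for _ in range(5000)]
    generated = [rng.vonmisesvariate(1.0, 4.0) for _ in range(5000)]
    jsd = js_divergence(angle_histogram(reference), angle_histogram(generated))
    print(f"JS divergence between the two angle sets: {jsd:.4f}")
```

Because the mixture `m` is positive wherever `p` or `q` is, the divergence stays finite even when one histogram has empty bins, which is one reason it is convenient for comparing sampled angle distributions.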
Aug-19-2025
- Country:
- Asia > Middle East
- Yemen > Amanat Al Asimah > Sanaa (0.04)
- Europe
- Netherlands (0.04)
- Switzerland > Vaud
- Lausanne (0.04)
- North America > United States (0.28)
- Genre:
- Research Report > New Finding (0.46)
- Industry:
- Energy (0.92)
- Government > Regional Government
- North America Government (0.46)
- Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
- Information Technology (0.93)
- Technology: