Non-Canonical Crosslinks Confound Evolutionary Protein Structure Models

Lacombe, Romain

arXiv.org Artificial Intelligence

Sactipeptides (short for sulfur-to-α-carbon thioether-containing peptides) are a small but growing subclass of RiPP natural products characterized by one or more intramolecular thioether linkages known as sactionine bonds. These unique cross-links are formed when a radical S-adenosylmethionine (rSAM) enzyme facilitates the covalent bonding of the sulfur atom of a cysteine residue to the α-carbon of another amino acid in the peptide backbone. The result is a tightly cross-linked polycyclic peptide in which the thioether bridges running from residue to backbone impart rigidity and extreme stability against heat, pH, and proteases (Flühe & Marahiel, 2013). Although the first sactipeptide, subtilosin A, an antibiotic produced by Bacillus subtilis 168, was discovered in 1985 (Babasaki et al., 1985), this class of RiPPs remains very rare, and the pace of discovery has only ramped up in recent years thanks to advances in genome mining (Chen et al., 2021; Zhong et al., 2023; Wambui et al., 2022). A literature search reveals that, to date, only 10 sactipeptides have a known sequence and a fully elucidated cross-link structure. Of these, only 5 have an experimentally resolved 3D structure available in the PDB: ruminococcin C1 (Roblin et al., 2020), subtilosin A (Kawulka et al., 2004), thurincin H (Sit et al., 2011b), thuricin CD α, and thuricin CD β (Sit et al., 2011a). To the best of our knowledge, the remaining 5 do not have a known structure: huazacin (Hudson et al., 2019), hyicin 4244 (Duarte et al., 2018), sporulation killing factor A (Cao et al., 2021), streptosactin (Chen et al., 2021), and QmpA (Ali et al., 2022). We list these 10 sactipeptides and their post-translational cross-links in Table 1.
Because half of these peptides are present in the PDB, and the other half have identified cross-links but not yet an experimentally resolved 3D structure, they form an ideal held-out dataset for an out-of-domain evaluation of the robustness of protein structure prediction models.
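
As an illustration only, the split described above could be encoded as a small Python structure. This is a minimal sketch, not the author's evaluation code: the `Sactipeptide` class, its field names, and the `split_by_structure` helper are hypothetical, and the entries carry only the names and citations given in the text (sequences and cross-link positions from Table 1 are deliberately left out).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sactipeptide:
    """Hypothetical record for one peptide named in the text."""
    name: str
    reference: str
    has_pdb_structure: bool  # True if the text reports a resolved 3D structure in the PDB


# The ten sactipeptides listed in the text, in the order they are cited.
SACTIPEPTIDES = [
    Sactipeptide("ruminococcin C1", "Roblin et al., 2020", True),
    Sactipeptide("subtilosin A", "Kawulka et al., 2004", True),
    Sactipeptide("thurincin H", "Sit et al., 2011b", True),
    Sactipeptide("thuricin CD α", "Sit et al., 2011a", True),
    Sactipeptide("thuricin CD β", "Sit et al., 2011a", True),
    Sactipeptide("huazacin", "Hudson et al., 2019", False),
    Sactipeptide("hyicin 4244", "Duarte et al., 2018", False),
    Sactipeptide("sporulation killing factor A", "Cao et al., 2021", False),
    Sactipeptide("streptosactin", "Chen et al., 2021", False),
    Sactipeptide("QmpA", "Ali et al., 2022", False),
]


def split_by_structure(peptides):
    """Partition peptides into (structure in PDB, no known structure)."""
    resolved = [p for p in peptides if p.has_pdb_structure]
    unresolved = [p for p in peptides if not p.has_pdb_structure]
    return resolved, unresolved


resolved, unresolved = split_by_structure(SACTIPEPTIDES)
```

Under this sketch, `resolved` holds the five peptides whose predictions can be compared against an experimental structure, while `unresolved` holds the five for which only the reported cross-links are available as a check.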