Proteoforms--the different forms of proteins produced from the genome with a variety of sequence variations, splice isoforms, and myriad posttranslational modifications (1)--are critical elements in all biological systems (see the figure, left). Yang et al. (2) recently showed that the functions of proteins produced from splice variants from a given gene--different proteoforms--can be as different as those for proteins encoded by entirely different genes. Li et al. (3) showed that splice variants play a central role in modulating complex traits. However, the standard paradigm of proteomic analysis, the "bottom-up" strategy pioneered by Eng and Yates some 20 years ago (4), does not directly identify proteoforms. We argue that proteomic analysis needs to provide the identities and abundances of the proteoforms themselves, rather than just their peptide surrogates.
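The limitation of peptide surrogates can be illustrated with a small sketch. It assumes two invented toy sequences (not real proteins) standing in for splice isoforms of one gene, and a hypothetical helper, `tryptic_peptides`, that applies a simplified trypsin rule (cut after K or R unless the next residue is P, with no missed cleavages). The toy is deliberately constructed so that the skipped segment begins and ends at cleavage sites.

```python
def tryptic_peptides(sequence):
    """Simplified in-silico trypsin digest: cut after K/R unless followed by P."""
    peptides, start = [], 0
    for i, aa in enumerate(sequence):
        is_last = i == len(sequence) - 1
        if is_last or (aa in "KR" and sequence[i + 1] != "P"):
            peptides.append(sequence[start:i + 1])
            start = i + 1
    return peptides


# Invented toy sequences: isoform B skips the middle segment of isoform A.
ISOFORM_A = "MAGSLKTEERDVLPWNKGHFTYR"
ISOFORM_B = "MAGSLKTEERGHFTYR"

peptides_a = set(tryptic_peptides(ISOFORM_A))
peptides_b = set(tryptic_peptides(ISOFORM_B))

shared = peptides_a & peptides_b   # peptides that cannot tell A from B
only_a = peptides_a - peptides_b   # evidence specific to isoform A
only_b = peptides_b - peptides_a   # evidence specific to isoform B

print("shared:", sorted(shared))
print("unique to A:", sorted(only_a))
print("unique to B:", sorted(only_b))
```

In this constructed case every peptide from isoform B is also produced by isoform A, so detecting those peptides shows only that some product of the gene is present, not which proteoform it came from.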
Proteomics is the study of the protein component of a cell or tissue under a defined set of conditions. It is used to detect protein expression patterns in response to a particular stimulus and to determine functional protein networks at the cell or tissue level. Proteomics has major applications in medicine and drug development. Over time, proteomics has grown into a leading method for identifying and characterising proteins, thanks to the large amount of genomic sequence data now available. Developments in mass spectrometry, protein fractionation techniques and bioinformatics have advanced the field further.
Researchers are discovering a plethora of potential new biomarkers every year, each touted as the 'next big thing' that will help herald a new era of precision medicine. But so far, very few have made it into clinical practice. We find out how some proteomics laboratories are now tackling this bottleneck using a factory-type setup – to get more biomarkers into the clinic, faster. "In many diseases, the medication given to the patient is often not effective – so we need to be able to stratify patients to give them the right drug, at the right dosage, at the right time," says Professor Tony Whetton, Director of the Stoller Biomarker Discovery Centre at the University of Manchester. But there is a giant hurdle in the way of this revolutionary new approach – known as precision medicine – becoming commonplace.
Proteomics is the comprehensive, integrative study of proteins and their biological functions. The goal of proteomics is often to produce a complete and quantitative map of the proteome of a species, including defining protein cellular localization, reconstructing their interaction networks and complexes, and delineating signaling pathways and regulatory post-translational protein modifications (1). Proteomic data is generally obtained using a combination of liquid chromatography (LC) and tandem mass spectrometry (MS/MS) (2), also referred to as shotgun proteomics. A key step in proteomics is how peptides are identified from acquired MS/MS spectra (Figure 1). Unlike genomics technologies, in which the DNA or RNA fragments are actually sequenced, in proteomics, peptides are most commonly identified by matching MS/MS spectra against theoretical spectra of all candidate peptides represented in a reference protein sequence database (3). The underlying assumption is that all protein-coding sequences in the genome are known and accurately annotated as a collection of gene models, and that all protein products of these gene models are present in a reference protein sequence database such as Ensembl, RefSeq, or UniProtKB used for peptide identification (Box 1). Much of the subsequent data analysis and interpretation, including inference of the protein identity (4) and protein quantification using the sequences and abundances of the identified peptides, are based on this assumption.
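The matching step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the scoring used by any particular search engine: candidates are unmodified peptides, only singly charged b- and y-ions are predicted, and the score is simply the number of theoretical fragments found in the observed peak list. The function names (`theoretical_fragments`, `precursor_mass`, `count_matches`, `best_match`) and the tolerance values are hypothetical choices made for the example.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.00728   # mass of a proton (Da)
WATER = 18.01056   # mass of H2O (Da)


def precursor_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def theoretical_fragments(peptide):
    """Singly charged b- and y-ion m/z values for an unmodified peptide."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    ions = []
    for i in range(1, len(peptide)):
        ions.append(sum(masses[:i]) + PROTON)           # b-ion: N-terminal prefix
        ions.append(sum(masses[i:]) + WATER + PROTON)   # y-ion: C-terminal suffix
    return ions


def count_matches(observed, theoretical, tol=0.02):
    """Number of theoretical fragments with an observed peak within tol (Da)."""
    return sum(any(abs(o - t) <= tol for o in observed) for t in theoretical)


def best_match(observed_peaks, observed_mass, candidates, precursor_tol=0.05):
    """Return (score, peptide) for the best candidate, or (0, None) if none fit.

    Candidates are first filtered by precursor mass, then ranked by the
    number of matched fragment ions (a deliberately naive score).
    """
    scored = [
        (count_matches(observed_peaks, theoretical_fragments(p)), p)
        for p in candidates
        if abs(precursor_mass(p) - observed_mass) <= precursor_tol
    ]
    return max(scored, key=lambda pair: pair[0], default=(0, None))
```

A usage example with a simulated, noise-free spectrum: `best_match(theoretical_fragments("PEPTIDEK"), precursor_mass("PEPTIDEK"), ["PEPTIEDK", "GASPVTLK", "PEPTIDEK"])` selects `"PEPTIDEK"`. The design also makes the underlying assumption visible: a peptide that is absent from the candidate list can never be returned, however good its spectrum.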