BigBio Notes: Some of the most cited manuscripts in Proteomics and Computational Proteomics (2013)

Some of the most cited manuscripts in 2013 in the field of Proteomics and Computational Proteomics (in no particular order):

The Proteomics Identifications (PRIDE) database and associated tools: status in 2013

     The PRoteomics IDEntifications (PRIDE, http://www.ebi.ac.uk/pride) database
     at the European Bioinformatics Institute is one of the most prominent data
     repositories of mass spectrometry (MS)-based proteomics data. Here, we
     summarize recent developments in the PRIDE database and related tools.
     First, we provide up-to-date statistics in data content, splitting the figures by
     groups of organisms and species, including peptide and protein
     identifications, and post-translational modifications. We then describe the
     tools that are part of the PRIDE submission pipeline, especially the recently
     developed PRIDE Converter 2 (new submission tool) and PRIDE Inspector
     (visualization and analysis tool). We also give an update about the integration
     of PRIDE with other MS proteomics resources in the context of the
     ProteomeXchange consortium. Finally, we briefly review the quality control
     efforts that are ongoing at present and outline our future plans.

Next-generation proteomics: towards an integrative view of proteome dynamics

     Next-generation sequencing allows the analysis of genomes, including those
     representing disease states. However, the causes of most disorders are
     multifactorial, and systems-level approaches, including the analysis of
     proteomes, are required for a more comprehensive understanding. The
     proteome is extremely multifaceted owing to splicing and protein
     modifications, and this is further amplified by the interconnectivity of proteins
     into complexes and signalling networks that are highly divergent in time and
     space. Proteome analysis heavily relies on mass spectrometry (MS).
     MS-based proteomics is starting to mature and to deliver through a
     combination of developments in instrumentation, sample preparation and
     computational analysis. Here we describe this emerging next generation of
     proteomics and highlight recent applications.

Quantitative measurements of N-linked glycoproteins in human plasma by SWATH-MS

     SWATH-MS is a data-independent acquisition method that generates, in a
     single measurement, a complete recording of the fragment ion spectra of all
     the analytes in a biological sample for which the precursor ions are within a
     predetermined m/z versus retention time window. To assess the performance
     and suitability of SWATH-MS-based protein quantification for clinical use, we
     compared SWATH-MS and SRM-MS-based quantification of N-linked
     glycoproteins in human plasma, a commonly used sample for biomarker
     discovery. Using dilution series of isotopically labeled heavy peptides
     representing biomarker candidates, the LOQ of SWATH-MS was determined
     to reach 0.0456 fmol at peptide level by targeted data analysis, which
     corresponds to a concentration of 5–10 ng protein/mL in plasma, while SRM
     reached a peptide LOQ of 0.0152 fmol. Moreover, the quantification of
     endogenous glycoproteins using SWATH-MS showed a high degree of
     reproducibility, with the mean CV of 14.90%, correlating well with SRM results
     (R² = 0.9784). Overall, SWATH-MS measurements showed a slightly lower
     sensitivity and a comparable reproducibility to state-of-the-art SRM
     measurements for targeted quantification of the N-glycosites in human
     blood. However, a significantly larger number of peptides can be quantified
     per analysis. We suggest that SWATH-MS analysis combined with
     N-glycoproteome enrichment in plasma samples is a promising integrative
     proteomic approach for biomarker discovery and verification.
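
The abstract above summarizes reproducibility as a mean CV and agreement between methods as R². As a minimal sketch of how such figures are commonly computed (not the authors' actual pipeline), the following assumes the CV is the sample standard deviation divided by the mean, and that R² is the squared Pearson correlation between paired measurements. The function names and all input values are hypothetical and for illustration only.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return the CV in percent: sample standard deviation / mean * 100."""
    return stdev(values) / mean(values) * 100.0


def r_squared(x, y):
    """Return the squared Pearson correlation between paired measurements."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


# Hypothetical replicate intensities for one peptide (arbitrary units).
replicates = [90.0, 100.0, 110.0]
print(f"CV = {coefficient_of_variation(replicates):.2f}%")

# Hypothetical paired quantities of the same peptides from two methods.
method_a = [1.0, 2.1, 2.9, 4.2]
method_b = [1.1, 2.0, 3.1, 4.0]
print(f"R^2 = {r_squared(method_a, method_b):.4f}")
```

A mean CV like the one reported would then be the average of per-peptide CVs across all quantified peptides.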

Technologies and challenges in large-scale phosphoproteomics

       Phosphorylation, the reversible addition of a phosphate group to amino acid
       side chains of proteins, is a fundamental regulator of protein activity,
       stability, and molecular interactions. Most cellular processes, such as inter-
       and intracellular signaling, protein synthesis, degradation, and apoptosis,
       rely on phosphorylation. This PTM is thus involved in many diseases,
       rendering localization and assessment of extent of phosphorylation of major
       scientific interest. MS-based phosphoproteomics, which aims at describing
       all phosphorylation sites in a specific type of cell, tissue, or organism, has
       become the main technique for discovery and characterization of
       phosphoproteins in a nonhypothesis driven fashion. In this review, we
       describe methods for state-of-the-art MS-based analysis of protein
       phosphorylation as well as the strategies employed in large-scale
       phosphoproteomic experiments with focus on the various challenges and
       limitations this field currently faces.

A complete mass-spectrometric map of the yeast proteome applied to quantitative trait analysis

      Experience from different fields of life sciences suggests that accessible,
      complete reference maps of the components of the system under study are
      highly beneficial research tools. Examples of such maps include libraries of
      the spectroscopic properties of molecules, or databases of drug structures in
      analytical or forensic chemistry. Such maps, and methods to navigate them,
      constitute reliable assays to probe any sample for the presence and amount
      of molecules contained in the map. So far, attempts to generate such maps
      for any proteome have failed to reach complete proteome coverage [1, 2, 3].
      Here we use a strategy based on high-throughput peptide synthesis and
      mass spectrometry to generate an almost complete reference map (97% of
      the genome-predicted proteins) of the Saccharomyces cerevisiae proteome.
      We generated two versions of this mass-spectrometric map, one supporting
      discovery-driven (shotgun) [3, 4] and the other supporting hypothesis-driven
      (targeted) [5, 6] proteomic measurements. Together, the two versions of the
      map constitute a complete set of proteomic assays to support most studies
      performed with contemporary proteomic technologies. To show the utility of
      the maps, we applied them to a protein quantitative trait locus (QTL)
      analysis [7], which requires precise measurement of the same set of peptides
      over a large number of samples. Protein measurements over 78 S.
      cerevisiae strains revealed a complex relationship between independent
      genetic loci, influencing the levels of related proteins. Our results suggest
      that selective pressure favours the acquisition of sets of polymorphisms that
      adapt protein levels but also maintain the stoichiometry of functionally
      related pathway members.

The Coming Age of Complete, Accurate, and Ubiquitous Proteomes

     High-resolution mass spectrometry (MS)-based proteomics has progressed
     tremendously over the years. For model organisms like yeast, we can now
     quantify complete proteomes in just a few hours. Developments discussed in
     this Perspective will soon enable complete proteome analysis of mammalian
     cells, as well, with profound impact on biology and biomedicine.

Peptidomic discovery of short open reading frame–encoded peptides in human cells

       The complete extent to which the human genome is translated into
       polypeptides is of fundamental importance. We report a peptidomic
       strategy to detect short open reading frame (sORF)-encoded polypeptides
       (SEPs) in human cells. We identify 90 SEPs, 86 of which are previously
       uncharacterized, which is the largest number of human SEPs ever reported.
       SEP abundances range from 10–1,000 molecules per cell, identical to
       abundances of known proteins. SEPs arise from sORFs in noncoding RNAs as
       well as multicistronic mRNAs, and many SEPs initiate with non-AUG start
       codons, indicating that noncanonical translation may be more widespread in
       mammals than previously thought. In addition, coding sORFs are present in
       a small fraction (8 out of 1,866) of long intergenic noncoding RNAs.
       Together, these results provide strong evidence that the human proteome is
       more complex than previously appreciated.

Construction of human activity‐based phosphorylation networks

      The landscape of human phosphorylation networks has not been
      systematically explored, representing vast, uncharted territories within
      cellular signaling networks. Although a large number of in vivo
      phosphorylated residues have been identified by mass spectrometry
      (MS)‐based approaches, assigning the upstream kinases to these residues
      requires biochemical analysis of kinase‐substrate relationships (KSRs). Here,
      we developed a new strategy, called CEASAR, based on functional protein
      microarrays and bioinformatics to experimentally identify substrates for 289
      unique kinases, resulting in 3656 high‐quality KSRs. We then generated
      consensus phosphorylation motifs for each of the kinases and integrated this
      information, along with information about in vivo phosphorylation sites
      determined by MS, to construct a high‐resolution map of phosphorylation
      networks that connects 230 kinases to 2591 in vivo phosphorylation sites in
      652 substrates. The value of this data set is demonstrated through the
      discovery of a new role for PKA downstream of Btk (Bruton's tyrosine kinase)
      during B‐cell receptor signaling. Overall, these studies provide global insights
      into kinase‐mediated signaling pathways and promise to advance our
      understanding of cellular signaling processes in humans.

The CRAPome: a contaminant repository for affinity purification–mass spectrometry data

      Affinity purification coupled with mass spectrometry (AP-MS) is a widely used
      approach for the identification of protein-protein interactions. However, for
      any given protein of interest, determining which of the identified
      polypeptides represent bona fide interactors versus those that are
      background contaminants (for example, proteins that interact with the
      solid-phase support, affinity reagent or epitope tag) is a challenging task.
      The standard approach is to identify nonspecific interactions using one or
      more negative-control purifications, but many small-scale AP-MS studies do
      not capture a complete, accurate background protein set when available
      controls are limited. Fortunately, negative controls are largely bait
      independent. Hence, aggregating negative controls from multiple AP-MS
      studies can increase coverage and improve the characterization of
      background associated with a given experimental protocol. Here we present
      the contaminant repository for affinity purification (the CRAPome) and
      describe its use for scoring protein-protein interactions. The repository
      (currently available for Homo sapiens and Saccharomyces cerevisiae)
      and computational tools are freely accessible at http://www.crapome.org/.
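
The idea of aggregating bait-independent negative controls can be illustrated with a small sketch. This is not the CRAPome's actual scoring method. It only assumes that each negative-control run is a list of identified protein names and that a prey seen in a large fraction of controls is likely background. The function names, the `max_freq` threshold and the protein lists are all hypothetical.

```python
from collections import Counter


def background_frequency(controls):
    """Return, per protein, the fraction of control runs in which it was seen."""
    counts = Counter()
    for run in controls:
        counts.update(set(run))  # count each protein at most once per run
    n_runs = len(controls)
    return {protein: c / n_runs for protein, c in counts.items()}


def flag_candidates(preys, freq, max_freq=0.5):
    """Keep preys whose background frequency does not exceed max_freq."""
    return [p for p in preys if freq.get(p, 0.0) <= max_freq]


# Hypothetical negative-control runs pooled from several studies.
controls = [
    ["HSP70", "TUBB"],
    ["HSP70"],
    ["HSP70", "ACTB"],
    ["TUBB"],
]
freq = background_frequency(controls)

# Hypothetical preys identified in a bait purification.
preys = ["HSP70", "BAIT_PARTNER", "ACTB"]
print(flag_candidates(preys, freq, max_freq=0.3))
```

Pooling more control runs makes the frequency estimate for each background protein less dependent on any single experiment, which is the benefit the abstract describes.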

Benchmarking stable isotope labeling based quantitative proteomics

       Several quantitative mass spectrometry based technologies have recently
       evolved to interrogate the complexity, interconnectivity and dynamic nature
       of proteomes. Currently, the most popular methods use either metabolic or
       chemical isotope labeling with MS based quantification or chemical labeling
       using isobaric tags with MS/MS based quantification. Here, we assess the
       performance of three of the most popular approaches through systematic
       independent large scale quantitative proteomics experiments, comparing
       SILAC, dimethyl and TMT labeling strategies. Although all three methods
       have their strengths and weaknesses, our data indicate that all three can
       reach a similar depth in number of identified proteins using a classical (MS2
       based) shotgun approach. TMT quantification using only MS2 is heavily
       affected by co-isolation leading to compromised precision and accuracy.
       This issue may be partly resolved by using an MS3 based acquisition;
       however, at the cost of a significant reduction in number of proteins
       quantified. Interestingly, SILAC and chemical labeling with MS based
       quantification produce almost indistinguishable results, independent of
       which database search algorithm is used.

In Vivo Protein Interaction Network Identified with a Novel Real-Time Cross-Linked Peptide Identification Strategy

       Protein interaction topologies are critical determinants of biological function.
       Large-scale or proteome-wide measurements of protein interaction
       topologies in cells currently pose an unmet challenge that could
       dramatically improve understanding of complex biological systems. A
       primary impediment includes direct protein topology and interaction
       measurements from living systems since interactions that lack biological
       significance may be introduced during cell lysis. Furthermore, many
       biologically relevant protein interactions will likely not survive the
       lysis/sample preparation and may only be measured with in vivo methods.
       As a step toward meeting this challenge, a new mass spectrometry method
       called Real-time Analysis for Cross-linked peptide Technology (ReACT) has
       been developed that enables assignment of cross-linked peptides
       “on-the-fly”. Using ReACT, 708 unique cross-linked (<5% FDR) peptide pairs
       were identified from cross-linked E. coli cells. These data allow assembly of
       the first protein interaction network that also contains topological features
       of every interaction, as it existed in cells during cross-linker application. Of
       the identified interprotein cross-linked peptide pairs, 40% are derived from
       known interactions and provide new topological data that can help visualize
       how these interactions exist in cells. Other identified cross-linked peptide
       pairs are from proteins known to be involved within the same complex, but
       yield newly discovered direct physical interactors. ReACT enables the first
       view of these interactions inside cells, and the results acquired with this
       method suggest cross-linking can play a major role in future efforts to map
       the interactome in cells.

Metabolomics coupled with proteomics advancing drug discovery towards more agile development of targeted combination therapies

       To enhance therapeutic efficacy and reduce adverse effects of traditional
       Chinese medicine (TCM), practitioners often prescribe a combination of
       plant species and/or minerals called formulae. Unfortunately, the working
       mechanisms of most of these compounds are difficult to determine and
       thus remain unknown. In an attempt to address the benefits of formulae
       based on current biomedical approaches, we analyzed the components of
       Yinchenhao Tang (YCHT), a classical formula that has been shown to be
       clinically effective for treating hepatic injury (HI) syndrome. The three
       principal components of YCHT are Artemisia annua L., Gardenia jasminoides
       Ellis, and Rheum palmatum L., whose major active ingredients are
       6,7-dimethylesculetin (D), geniposide (G) and rhein (R), respectively. To
       determine the mechanisms that underlie this formula, we conducted a
       systematic analysis of the therapeutic effects of the DGR compound using
       immunohistochemistry, biochemistry, metabolomics and proteomics. Here,
       we report that the DGR combination exerts a more robust therapeutic effect
       than any one or two of the three individual compounds by hitting multiple
       targets in a rat model of HI. Thus, DGR synergistically causes intensified
       dynamic changes in metabolic biomarkers, regulates molecular networks
       through target proteins, has a synergistic/additive effect and activates both
       intrinsic and extrinsic pathways.

Monday, 20 January 2014