iPRG 2015

ABRF Proteome Informatics Research Group (iPRG) 2015 Study: Detection of differentially abundant proteins in label-free quantitative LC-MS/MS experiments
  • Organism: Saccharomyces cerevisiae
  • Instrument: Q Exactive
  • SpikeIn: Yes
Abstract
Detection of differentially abundant proteins in label-free quantitative shotgun liquid chromatography-tandem mass spectrometry (LC-MS/MS) experiments requires a series of computational steps that identify and quantify LC-MS features. It also requires statistical analyses that distinguish systematic changes in abundance between conditions from artifacts of biological and technical variation. The 2015 study of the Proteome Informatics Research Group (iPRG) of the Association of Biomolecular Resource Facilities (ABRF) aimed to evaluate the effects of the statistical analysis on the accuracy of the results. The study used LC-tandem mass spectra acquired from a controlled mixture, and made the data available to anonymous volunteer participants. The participants used methods of their choice to detect differentially abundant proteins, estimate the associated fold changes and characterize the uncertainty of the results. This manuscript summarizes the outcome of the study, and provides representative examples of good computational and statistical practice. The dataset generated as part of this study is publicly available.
Experiment Description
To investigate the ability of various computational and statistical pipelines to detect differentially abundant proteins in LC-MS/MS experiments, the 2015 Proteome Informatics Research Group (iPRG) of the Association of Biomolecular Resource Facilities (ABRF) launched a study that solicited community participation. The 2015 iPRG study provided a well-designed, controlled dataset to be analyzed by the anonymous volunteer participants using their methods of choice. It should be noted that the study did not aim to compare peptide spectrum matching or peptide peak area extraction tools. Therefore, identified LC-MS/MS spectra and their associated integrated peak intensities were also provided as optional starting points of data analysis. The participants made their own decision of whether to start from the raw data or from the provided intermediate results. The organizers of the study evaluated the submissions in terms of their ability to correctly detect the differentially abundant proteins and to accurately estimate the fold changes in protein abundance among conditions. This manuscript summarizes the outcomes of the study, highlights the importance of the correct use of computational and statistical methods, and provides representative examples of good statistical practice.
Sample Description
The study was based on four artificially made samples of known composition, each containing a constant background of 200 ng of tryptic digests of S. cerevisiae (ATCC strain 204508/S288c). Each sample was separately spiked with different quantities of six individual protein digests. All of the proteins were reduced and alkylated with iodoacetamide prior to digestion with trypsin. The concentrations of the spiked-in proteins are summarized in Table 1.
Created on 10/3/16, 8:48 PM

After the conclusion of the iPRG 2015 study, the raw data were reprocessed with Skyline 3.5.0.9319. Spectral libraries were first created with varying iProphet probability cut-offs: 0.95, 0.50, 0.15 and 0.05. A template document (included below) was created with settings similar to the original (Precursor charges 2, 3, 4; Max missed cleavages 2; All modifications from the search results), with the following changes on the Transition Settings – Full-Scan tab: Precursor mass analyzer “Centroided”; Mass accuracy ±10 ppm; Use only scans within 2 minutes of MS/MS IDs. One copy of this template was created for each of the 4 spectral libraries, with that library added.

The iPRG2015.TargDecoy.fasta was imported into all 4 documents, followed by these 3 operations on the Edit > Refine menu: 1) Sort Proteins > By Name, 2) Remove Duplicates, and 3) Remove Empty Proteins. FDR was then assessed in the 4 files using decoy counting (see Supplementary Table 3 – FDR: No Duplicates). Each of the 4 files was saved under a new name, and Edit > Refine > Advanced – Min peptides per protein 2 was applied. FDR was reassessed in the resulting files (see Supplementary Table 3 – FDR: Min 2 Pep), and the two estimates were used to assess FDR in single-peptide proteins (FDR: Single Pep).

The original raw data files were then imported, and the “iPRG 2015” report associated with the documents was exported. The report contains the same columns as the study SkylineIntensities.tsv file, minus the Probability and QValue columns. Despite the wider 4+ minute extraction windows, some truncated peaks remained, and their intensities were replaced with NA (missing at random). As before, the ions could be quantified by three isotopic peaks. The intensities of the isotopic peaks were summed (on the original scale) before the statistical analysis. Finally, we tested the impact of using only the monoisotopic peak for quantification, compared with this summing (see Supplementary Table 3).
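The summing of isotopic peaks on the original scale can be illustrated with a minimal sketch. It is not the code used in the study: the precursor names, the intensities and the helper functions are hypothetical, and the handling of NA values (summing the peaks that were observed, and returning NA only when all three are missing) is an assumption, since the text does not say how partially truncated precursors were treated.

```python
import math

def sum_isotope_peaks(intensities):
    """Sum isotopic peak intensities on the original (non-log) scale.

    None stands for NA (a truncated peak). ASSUMPTION: NA peaks are skipped,
    and the result is NA only if every peak is NA.
    """
    observed = [x for x in intensities if x is not None]
    if not observed:
        return None
    return sum(observed)

def log2_or_none(x):
    """Log2-transform a summed intensity, propagating NA."""
    return None if x is None or x <= 0 else math.log2(x)

# Hypothetical precursors, each with three isotopic peaks (M, M+1, M+2).
precursor_peaks = {
    "PEPTIDEA_2": [4000.0, 3000.0, 1000.0],
    "PEPTIDEB_3": [2048.0, None, 2048.0],   # one truncated peak
    "PEPTIDEC_2": [None, None, None],       # all peaks truncated
}

# Sum first, then transform: this is what "on the original scale" means here.
summed = {name: sum_isotope_peaks(p) for name, p in precursor_peaks.items()}
log2_summed = {name: log2_or_none(s) for name, s in summed.items()}

# Monoisotopic-only alternative: keep just the first peak of each precursor.
monoisotopic = {name: p[0] for name, p in precursor_peaks.items()}
```

Summing before the log transform keeps the combined value dominated by the most intense isotopic peaks, whereas adding log intensities would amount to multiplying them. The `monoisotopic` dictionary corresponds to the alternative of quantifying with the monoisotopic peak alone.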

Supplementary Table 3: FDR by iProphet cut-off

iProphet   No Duplicates          Min 2 Pep              Single-Pep
cut-off    Protein   Unique Pep   Protein   Unique Pep   Protein
0.05       27.4%     4.0%         5.9%      1.3%         111%
0.15       17.4%     2.4%         2.3%      0.5%         77%
0.5        8.0%      1.0%         0.5%      0.1%         39%
0.95       2.1%      0.3%         0.1%      0.01%        11%
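The way the Single-Pep column can be derived from the other two protein-level estimates is sketched below. Two things are assumptions rather than details confirmed by the text: that decoy counting means the ratio of decoy to target proteins, and that the single-peptide proteins are exactly those removed by the Min 2 Pep filter. The counts are hypothetical and are not taken from the study.

```python
def decoy_fdr(n_decoy, n_target):
    """Estimate FDR as decoys / targets (ASSUMED decoy-counting convention)."""
    if n_target == 0:
        return float("nan")
    return n_decoy / n_target

def single_peptide_fdr(all_counts, min2_counts):
    """FDR among proteins removed by the 'Min peptides per protein 2' filter.

    Single-peptide counts are obtained by subtracting the filtered counts
    from the unfiltered ones, separately for targets and decoys.
    """
    decoy_single = all_counts["decoy"] - min2_counts["decoy"]
    target_single = all_counts["target"] - min2_counts["target"]
    return decoy_fdr(decoy_single, target_single)

# Hypothetical protein counts for one cut-off (NOT from the study).
no_duplicates = {"target": 1000, "decoy": 80}   # after Remove Duplicates
min_2_pep = {"target": 900, "decoy": 5}         # after Min 2 Pep filter

fdr_all = decoy_fdr(no_duplicates["decoy"], no_duplicates["target"])
fdr_min2 = decoy_fdr(min_2_pep["decoy"], min_2_pep["target"])
fdr_single = single_peptide_fdr(no_duplicates, min_2_pep)

print(f"No Duplicates: {fdr_all:.1%}")
print(f"Min 2 Pep:     {fdr_min2:.1%}")
print(f"Single-Pep:    {fdr_single:.1%}")
```

Under this convention the ratio is not bounded by 100%, because decoys can outnumber targets among single-peptide proteins. That would be consistent with the 111% entry at the 0.05 cut-off.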

Exported MSstats Reports:

iPRG 2015 Re-Processed Reports.zip (113.2 MB)

Download
File                                                       Created              Proteins  Peptides  Precursors  Transitions  Replicates
iPRG_10ppm_2rt_50cut_nosingle_2016-08-02_23-25-24.sky.zip  2016-10-03 15:20:08  2,935     29,189    33,337      100,011      12
iPRG_10ppm_2rt_50cut_nodup_2016-08-02_00-09-12.sky.zip     2016-10-03 15:20:08  3,901     30,155    34,328      102,984      12
iPRG_10ppm_2rt_15cut_nodup_2016-08-05_05-22-38.sky.zip     2016-10-03 15:20:08  4,451     32,006    36,320      108,960      12
iPRG_10ppm_2rt_05cut_nosingle_2016-08-01_11-45-26.sky.zip  2016-10-03 15:20:08  3,320     32,108    36,494      109,482      12
iPRG_10ppm_2rt_15cut_nosingle_2016-08-01_13-54-08.sky.zip  2016-10-03 15:20:08  3,097     30,652    34,937      104,811      12
iPRG_10ppm_2rt_05cut_nodup_2016-08-01_05-46-24.sky.zip     2016-10-03 15:20:08  5,017     33,805    38,224      114,672      12
iPRG_10ppm_2rt_95cut_nosingle_2016-08-03_04-44-06.sky.zip  2016-10-03 15:20:08  2,828     27,576    31,479      94,437       12
iPRG_10ppm_2rt_template_2016-08-01_05-37-39.sky.zip        2016-10-03 15:20:08  0         0         0           0            0
iPRG_10ppm_2rt_95cut_nodup_2016-08-16_16-37-53.sky.zip     2016-10-03 15:20:07  3,547     28,295    32,218      96,654       12