Gas phase fractionation DIA

To start, we will acquire gas phase fractionated (GPF) DIA data. Here, we simulate a pilot experiment in which we acquire GPF data on several pooled samples to identify differentially expressed proteins. This tutorial gives only a brief overview of building a chromatogram library. More detailed information can be found in the Label Free - E. coli tutorial.

Building methods

We begin by making a template method with an adaptive RT experiment and a DIA experiment. The template method should have:

  1. LC parameters to be used in the final experiment
  2. Source settings to be used in the final experiment
  3. Adaptive RT DIA experiment spanning the full LC gradient
  4. DIA experiment with appropriate settings (DIA window parameters don’t matter, as they will be updated shortly)

The template method, GPFDIA_28min_1Th.meth, is in the Raw Data folder.


Open Skyline on a computer with the instrument control software installed (you should be able to open .meth files in Method Editor on this computer), and make sure that the most recent version of PRM Conductor is installed from the Tool Store. Then open the GPF Creator tool.

Update the settings, select the template method file, and export the GPF DIA methods.

GPF Creator saves the GPF DIA methods to the folder that contains the template method. Create an Xcalibur queue and run the GPF DIA methods on the desired samples.
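As a rough illustration of what gas phase fractionation means for the isolation scheme, the sketch below splits a precursor m/z range into equal fractions, one per GPF injection, and tiles each fraction with fixed-width isolation windows. This is not GPF Creator's code. The function name, the 400-1000 m/z range, and the six fractions are assumptions chosen for illustration; only the 1 Th window width is suggested by the template method's file name.

```python
def gpf_windows(mz_start, mz_end, n_fractions, window_width):
    """Split [mz_start, mz_end] into equal m/z fractions and tile each
    fraction with contiguous isolation windows of window_width."""
    span = (mz_end - mz_start) / n_fractions
    n_windows = round(span / window_width)
    if abs(n_windows * window_width - span) > 1e-9:
        raise ValueError("fraction width must be a multiple of window_width")
    fractions = []
    for i in range(n_fractions):
        lo = mz_start + i * span
        windows = [
            (lo + j * window_width, lo + (j + 1) * window_width)
            for j in range(n_windows)
        ]
        fractions.append(windows)
    return fractions


# Hypothetical values: 400-1000 m/z covered by 6 injections with 1 Th windows.
fractions = gpf_windows(400.0, 1000.0, n_fractions=6, window_width=1.0)
for i, windows in enumerate(fractions, start=1):
    first, last = windows[0][0], windows[-1][1]
    print(f"GPF injection {i}: {first:.1f}-{last:.1f} m/z, {len(windows)} windows")
```

Each injection then covers only a narrow slice of the precursor range, which is what lets narrow windows be used without making the cycle time impractical.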

Searching Stellar GPF data

Search the data in Proteome Discoverer with Chimerys, using Stellar_GPF_Chimerys.pdProcessingWF and the standard consensus workflow.

Note that we changed the CE for the DIA experiment to 27% in our template method, so we use a maximum collision energy filter to keep the alignment spectra out of the search. This filter will need to be removed if the DIA experiment used the default CE of 30%. Assuming that the adaptive RT experiment used the 200 kDa/s scan rate, the scan headers for its acquisitions will contain a 'Z' character, corresponding to the legacy Zoom scan rate. The Scan Event Filter setting “Scan Type Is Not z” can therefore filter out these adaptive RT acquisitions, in which case the CE filter is not strictly needed.
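The logic of the two filters can be sketched in a few lines of Python. This is only an illustration of the reasoning, not how Proteome Discoverer implements its filters. The scan header strings are simplified and made up, the alignment scans are assumed to use a CE of 30%, and the 27% cutoff is an assumed value for the maximum CE filter.

```python
from dataclasses import dataclass


@dataclass
class Scan:
    header: str        # simplified, made-up scan header text
    ce_percent: float  # collision energy of the scan


def passes_ce_filter(scan, max_ce):
    """Keep scans whose collision energy does not exceed max_ce."""
    return scan.ce_percent <= max_ce


def passes_scan_type_filter(scan):
    """Keep scans whose header has no standalone zoom flag (case-insensitive)."""
    return "z" not in scan.header.lower().split()


# Tutorial-like case: DIA at 27% CE, alignment scans assumed to be at 30% CE.
scans = [Scan("ITMS Full ms2", 27.0), Scan("ITMS z Full ms2", 30.0)]
ce_kept = [s for s in scans if passes_ce_filter(s, max_ce=27.0)]
type_kept = [s for s in scans if passes_scan_type_filter(s)]

# Default-CE case: DIA at 30% CE, so the same CE cutoff would drop the DIA scans too.
default_scans = [Scan("ITMS Full ms2", 30.0), Scan("ITMS z Full ms2", 30.0)]
ce_kept_default = [s for s in default_scans if passes_ce_filter(s, max_ce=27.0)]
type_kept_default = [s for s in default_scans if passes_scan_type_filter(s)]

print(len(ce_kept), len(type_kept), len(ce_kept_default), len(type_kept_default))
```

In the first case both filters keep only the DIA scan. In the second case the CE filter keeps nothing, while the scan type filter still separates DIA from alignment scans, which is why the CE filter has to be removed when the default CE is used.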

Here, I searched only the GPF data for sample C to build the library. Even for a pilot experiment, I recommend searching a single pooled sample to build the library, because this simplifies assay creation. If the samples are not similar enough, more samples can be searched here.


Filtering search results in Skyline

Importing results

Once the search is done, open Skyline and then open the GPF Importer tool. Select the PD results file, an appropriate FASTA file, and the raw files that were searched in PD, then click “Import to Skyline”. GPF Importer will create a Skyline document with all the search results imported. You will end up with 3Proteome_28min_GPFDIA_GPFImporter.sky.zip.

Filtering for quality peptides 

Next, we filter for well-behaved peptides. First, the data need a little cleanup.

Go to Settings > Transition Settings and add a scan filter as shown to ensure that no alignment spectra are imported.

Then go to Edit > Manage Results and reimport Chrom_Lib_Replicate to ensure that no alignment spectra remain in the file.

Now we can use PRM Conductor to filter for peptides with good peaks. Open the PRM Conductor external tool. You can experiment with the parameters under Refine targets; they are explained in more detail in other tutorials. I used the following settings:

Ignore the Define Method section and go down to Create method. Uncheck the box labeled “Balance load”. If you were to export a method from here, PRM Conductor would create enough individual PRM methods to cover all the peptides in the library. Here we want to process the data further, so instead of clicking Export Files, click “Send to Skyline”.

This will give you a Skyline document refined to peptides identified at 1% FDR with “good” chromatographic peaks. In our case, this reduced the document from 8637 proteins / 43341 peptides to 4591 proteins / 14480 peptides.

Filtering data based on pilot experiment

After refining the GPF data, you may want to refine the list of peptides further based on a pilot experiment. There are many ways to do this, such as generating multiple PRM assays with PRM Conductor and running them on samples from different conditions, or acquiring wider-window DIA data. In this case, I ran a pilot experiment on two different 3-proteome mixes so that I could filter for quantitative peptides.

Import each GPF replicate by choosing “Add new replicate” and selecting all of the GPF files.

After this step, you will end up with 3Proteome_28min_GPFDIA_PRMConductorRefined.sky.

Peak picking in pilot experiment

Because these GPF runs were not searched when building the library, we don’t know whether their peak picking is good. An optional next step is therefore to perform an Expert Review analysis, using the GPF run that we did search as the reference. To do this, save the document and then open the Expert Review external tool.

Select the “Current” reference file, then check the GPF run(s) that you searched in PD. We will assume that these runs have correct retention times. Click Start to begin peak picking.

Once peak picking is finished, you will see a live view of the peak picking metrics. More information on this can be found in the Expert Review tutorial. Click Send to sync the peak boundaries to Skyline.

At the end of this, you should end up with something like 3Proteome_28min_GPFDIA_PRMConductorRefined_ER.sky.

Exporting quantitative information from pilot experiment

Now that the pilot data are imported with corrected peaks, we can process the data and apply filters. Here, we will export a quantitation report and filter for peptides whose measured fold change between samples A and E is within a factor of 2 of the expected fold change. To get quantitative data from Skyline, open the Document Grid (View > Live Reports > Document Grid). You can learn more about generating reports in the custom reports tutorial.

I used the report template 3Proteome_replicateInfo.skyr, which can be added to the report list by going to Reports > Manage Reports > Import and importing the .skyr file. I exported the report as a .csv file and analyzed it in R. In R, I generated a list of quantitative peptides, Filtered3ProteomeQuant.csv, and also a priority file with all yeast and E. coli peptide modified sequences. More on this later.
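The analysis itself was done in R and is not shown here. As a hedged illustration of the kind of filter involved, the Python sketch below keeps peptides whose observed A/E ratio lies within a factor of 2 of the expected ratio. Everything specific in it is an assumption: the expected fold changes per organism, the field names, the peptide sequences, and the peak areas are invented and do not come from the tutorial data or from 3Proteome_replicateInfo.skyr.

```python
import math

# Hypothetical expected A/E fold changes per organism (not the tutorial's real values).
EXPECTED_FOLD_CHANGE = {"human": 1.0, "yeast": 2.0, "ecoli": 0.25}


def within_factor(observed, expected, factor=2.0):
    """True if observed is within `factor` of expected, judged on a log2 scale."""
    if observed <= 0 or expected <= 0:
        return False
    return abs(math.log2(observed / expected)) <= math.log2(factor)


# Invented example rows standing in for an exported quantitation report.
rows = [
    {"peptide": "PEPTIDEA", "organism": "human", "area_A": 1000.0, "area_E": 900.0},
    {"peptide": "PEPTIDEB", "organism": "yeast", "area_A": 5000.0, "area_E": 800.0},
    {"peptide": "PEPTIDEC", "organism": "ecoli", "area_A": 300.0, "area_E": 1000.0},
    {"peptide": "PEPTIDED", "organism": "yeast", "area_A": 0.0, "area_E": 700.0},
]

kept = []
for row in rows:
    if row["area_E"] <= 0:
        continue  # ratio undefined without a signal in sample E
    observed = row["area_A"] / row["area_E"]
    if within_factor(observed, EXPECTED_FOLD_CHANGE[row["organism"]]):
        kept.append(row["peptide"])

print(kept)
```

Comparing on a log2 scale treats over- and under-estimation symmetrically, so a ratio that is twice the expected value and one that is half of it sit equally close to the cutoff.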

The final step is to filter the Skyline document based on this preliminary analysis. Go to Refine > Accept Peptides and paste in the peptide modified sequences from Filtered3ProteomeQuant.csv.
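If the list is long, it can help to prepare the text to paste programmatically. The sketch below reads a CSV, removes duplicate sequences while preserving order, and returns one sequence per line. The column name "Peptide Modified Sequence" and the example rows are assumptions, since the layout of Filtered3ProteomeQuant.csv is not described here.

```python
import csv
import io


def sequences_for_accept_peptides(csv_text, column="Peptide Modified Sequence"):
    """Return the unique, non-empty values of `column`, one per line, in file order."""
    reader = csv.DictReader(io.StringIO(csv_text))
    seen = set()
    ordered = []
    for row in reader:
        seq = row[column].strip()
        if seq and seq not in seen:
            seen.add(seq)
            ordered.append(seq)
    return "\n".join(ordered)


# Invented example content; a real file would be read with open(path, newline="").
example_csv = """Peptide Modified Sequence,Protein
PEPTIDEA,P1
PEPTIDEC[+57.021464]K,P2
PEPTIDEA,P1
"""

paste_text = sequences_for_accept_peptides(example_csv)
print(paste_text)
```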

You should end up with 3Proteome_28min_GPFDIA_PilotExptRefined.sky.

Summary

All of this filtering should leave you with a Skyline document that contains as many peptides as possible that you might want in your assay. The process took us from 8.6k proteins and 43k peptides identified in GPF DIA, to 4.6k proteins and 14.5k peptides with good peaks, to 4.6k proteins and 11.8k peptides that are semi-quantitative. In the next step, we will discuss how to refine and build the assay.
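To put these numbers in perspective, the short sketch below computes the fraction retained at each filtering stage. It uses the rounded counts quoted in the summary, so the results are approximate, and the stage labels are informal names chosen for this example.

```python
# Rounded counts from the summary: (stage, proteins, peptides).
stages = [
    ("identified in GPF DIA", 8600, 43000),
    ("good peaks", 4600, 14500),
    ("semi-quantitative", 4600, 11800),
]


def retention(stages):
    """Fraction of proteins and peptides kept at each stage, relative to the
    previous stage and to the first stage."""
    first_proteins, first_peptides = stages[0][1], stages[0][2]
    results = []
    for (_, prev_prot, prev_pep), (name, prot, pep) in zip(stages, stages[1:]):
        results.append({
            "stage": name,
            "proteins_vs_previous": prot / prev_prot,
            "peptides_vs_previous": pep / prev_pep,
            "proteins_vs_start": prot / first_proteins,
            "peptides_vs_start": pep / first_peptides,
        })
    return results


summary = retention(stages)
for r in summary:
    print(
        f"{r['stage']}: {r['peptides_vs_previous']:.1%} of peptides kept from the "
        f"previous stage, {r['peptides_vs_start']:.1%} of the starting peptides"
    )
```

With these rounded counts, about a third of the identified peptides pass the peak quality filter, and a little over a quarter of the original peptides remain after the pilot experiment filter.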