Data Validation for ProteomeXchange

2024-04-18


Proteomic datasets submitted to Panorama Public can be assigned a ProteomeXchange ID if they fulfill the ProteomeXchange data guidelines. The minimum requirement for getting a ProteomeXchange ID is that all the raw data files imported into the Skyline documents must be uploaded. See Upload Raw Data for details on how to upload raw data. In addition to the raw data, the following are also required for a "complete" ProteomeXchange submission:
  • All modifications used in the Skyline documents must have Unimod Ids
  • If the Skyline documents include spectral libraries, then all the source files (raw + search results) used to build the libraries must be uploaded

When a dataset is submitted, it is validated for a ProteomeXchange submission. The validation process runs as a pipeline job, and the results are displayed after the job is complete. The summary panel at the top of the page displays the validation status and any problems that were found during data validation.


You will see this status message if the data is valid for a "complete" ProteomeXchange submission. This means:
  • All the raw data files used with the Skyline documents were uploaded
  • All the modifications used with the Skyline documents had Unimod Ids OR there were no modifications
  • The source files (spectrum files + search results files) used to build the spectral libraries were uploaded OR there were no spectral libraries



You will see this status message if:
  • All the raw data files used with the Skyline documents were uploaded
  • But one or more modifications in the Skyline documents did not have a Unimod Id OR the source files used to build one or more spectral libraries were not uploaded
In this case the data can be assigned a ProteomeXchange ID. However, it will be marked as "supported by repository but incomplete data and/or metadata" when it is announced on ProteomeXchange.


You will see this status message if one or more raw files used with the Skyline documents were not uploaded. The data cannot be assigned a ProteomeXchange ID, but it can still be submitted to Panorama Public by clicking the Submit Without a ProteomeXchange ID button.
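
The three validation outcomes described above can be summarized as a small decision function. This is only a minimal sketch of the rules as stated on this page, not Panorama's actual implementation; the names PxStatus and px_status and the status labels are hypothetical.

```python
from enum import Enum


class PxStatus(Enum):
    # Hypothetical labels; the real status messages are shown in the summary panel.
    COMPLETE = "complete"      # valid for a "complete" ProteomeXchange submission
    INCOMPLETE = "incomplete"  # ID can be assigned, but announced as incomplete
    NOT_VALID = "not valid"    # no ProteomeXchange ID can be assigned


def px_status(raw_files_uploaded, modifications_have_unimod, library_sources_uploaded):
    """Apply the rules from this page to three yes/no checks.

    Pass True for modifications_have_unimod when there are no modifications,
    and True for library_sources_uploaded when there are no spectral libraries.
    """
    # Missing raw files rule out a ProteomeXchange ID entirely.
    if not raw_files_uploaded:
        return PxStatus.NOT_VALID
    # Raw files are present; the remaining checks decide complete vs. incomplete.
    if modifications_have_unimod and library_sources_uploaded:
        return PxStatus.COMPLETE
    return PxStatus.INCOMPLETE
```

For example, a dataset with all raw files uploaded but one modification lacking a Unimod Id would come out as PxStatus.INCOMPLETE under this sketch.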


Validation details for Skyline documents (sample files), modifications and spectral libraries are displayed in the tables below the validation summary panel.

Sample file validation table

The Skyline Document Sample Files table displays the status for each Skyline document in the dataset. If all the raw data files imported into the document were uploaded, the status displayed is Complete. Otherwise, the status displayed is Incomplete.
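
The per-document check can be pictured as a comparison between the files a document imported and the files that were uploaded. The sketch below is an illustration under that assumption only; the function name document_status and the file names in the usage note are hypothetical, and the real validation may match files differently (for example by path or by instrument-specific naming).

```python
def document_status(imported_files, uploaded_files):
    """Return the status for one Skyline document and the files still missing.

    imported_files: names of raw files imported into the document
    uploaded_files: names of raw files uploaded with the dataset
    """
    # Any imported file that was not uploaded makes the document Incomplete.
    missing = sorted(set(imported_files) - set(uploaded_files))
    status = "Complete" if not missing else "Incomplete"
    return status, missing
```

Calling document_status(["run1.raw", "run2.raw"], ["run1.raw"]) would report Incomplete with run2.raw listed as missing.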


The nodes in the table can be expanded to view the list of replicates and sample files in each document.



Modifications validation table

The Modifications table displays a list of modifications found in the Skyline documents. For a "complete" ProteomeXchange submission, all the modifications must have a Unimod Id. Skyline supports an extensive set of Unimod modifications, and it is recommended that you use modifications from the built-in list of Unimod modifications rather than define your own custom modifications. If you are unable to find your desired modification in Skyline, please post to the Skyline or Panorama support boards.
The Modifications table displays Missing in the Unimod Match column if the modification in the Skyline document does not have a Unimod Id. Otherwise, the Unimod Id is displayed. Clicking on a Unimod Id will take you to the modification page on the Unimod website.


Expand the nodes by clicking the '+' icon in the first column to view the Skyline documents in the dataset that use the modification along with a link to the peptides in the document with the modification.



If any of the modifications did not have a Unimod Id, you can click the Find Match link. This will attempt to find a Unimod match for the modification based on the modification formula and the modified amino acids and/or terminus information found in the modification definition in the Skyline document. Read more about finding Unimod matches on this page: Finding Unimod Matches.
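
To illustrate what matching on formula and modified amino acids can look like, here is a minimal sketch. It is not the algorithm used by Find Match: the function name find_unimod_matches is hypothetical, the lookup table is a tiny hand-picked subset of Unimod with simplified site lists, and terminus information is ignored.

```python
# Illustrative subset of Unimod entries: (Unimod Id, name, composition, sites).
# Site sets are simplified; the real Unimod entries list more sites.
UNIMOD_SUBSET = [
    (4, "Carbamidomethyl", {"C": 2, "H": 3, "N": 1, "O": 1}, {"C"}),
    (35, "Oxidation", {"O": 1}, {"M"}),
    (21, "Phospho", {"H": 1, "O": 3, "P": 1}, {"S", "T", "Y"}),
]


def find_unimod_matches(composition, amino_acids):
    """Return (Unimod Id, name) pairs whose formula and sites fit the modification.

    composition: element counts of the modification, e.g. {"O": 1}
    amino_acids: modified residues from the Skyline definition, e.g. "M" or "STY"
    """
    wanted_sites = set(amino_acids)
    matches = []
    for unimod_id, name, unimod_composition, sites in UNIMOD_SUBSET:
        # The formula must be identical, and every modified residue
        # must be an allowed site for the candidate entry.
        if unimod_composition == composition and wanted_sites <= sites:
            matches.append((unimod_id, name))
    return matches
```

With this table, a modification defined as O on methionine would match Oxidation (Unimod Id 35), while a formula that is not in the table would return an empty list.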

Spectral library validation table

The Spectral Libraries table has a row for each spectral library used with the Skyline documents in the dataset. The library status, shown in the Status column, is Complete if the library is supported and all the source files used to build the library were uploaded. Otherwise, the status is Incomplete. Each row can be expanded by clicking the '+' icon in the first column to view a list of source files for the library along with links to Skyline documents that use the library.