MacCoss - Carafe

Carafe enables high quality in silico spectral library generation for data-independent acquisition proteomics
Data License: CC BY 4.0 | ProteomeXchange: PXD056793 | doi: https://doi.org/10.6069/dtvg-5w47
  • Organism: Homo sapiens, Saccharomyces cerevisiae
  • Instrument: Orbitrap Fusion Lumos, Orbitrap Exploris 480, Orbitrap Astral
  • SpikeIn: No
  • Keywords: spectral library generation, Carafe, deep learning, DIA
  • Lab head: Michael MacCoss
  • Submitter: Chris Hsu
Abstract
Data-independent acquisition (DIA)-based mass spectrometry is becoming an increasingly popular mass spectrometry acquisition strategy for carrying out quantitative proteomics experiments. Most of the popular DIA search engines make use of in silico generated spectral libraries. However, the generation of high-quality spectral libraries for DIA data analysis remains a challenge, particularly because most such libraries are generated directly from data-dependent acquisition (DDA) data or are from in silico prediction using models trained on DDA data. In this study, we developed Carafe, a tool that generates high-quality experiment-specific in silico spectral libraries by training deep learning models directly on DIA data. We demonstrate the performance of Carafe on a wide range of DIA datasets, where we observe improved fragment ion intensity prediction and peptide detection relative to existing pretrained DDA models.
Sample Description
Human and yeast cell culturing and sample preparation: HeLa S3 cells were cultured at 37 °C and 5% CO2 in Dulbecco’s modified Eagle’s medium (DMEM) supplemented with 4.5 g/L glucose, L-glutamine, 10% fetal bovine serum (FBS), and 0.5% streptomycin/penicillin. Cells were grown to 80% confluency. At the time of harvest, cells were left attached to plates, rinsed quickly three times with ice-cold PBS, then flash-frozen in liquid nitrogen prior to storage at −80 °C. The S288C S. cerevisiae strain was selected for all downstream applications. Yeast were grown on a plate overnight, and single colonies were inoculated into yeast extract peptone dextrose (YEPD) medium. The cultures were grown to OD 0.6, then harvested, pelleted, and frozen at −80 °C until use. Protein from HeLa and yeast samples was prepared for LC-MS using a protocol modified from magnetic-bead-based tryptic digestion methods that use protein aggregate capture. Briefly, the cells were lysed for 5 minutes in a buffer containing 2% SDS, 100 mM Tris buffer pH 8.5, and ThermoFisher protease inhibitors. Each sample was sonicated for 5 seconds with a Branson probe sonicator, and protein concentration was determined using the bicinchoninic acid (BCA) method calibrated with bovine serum albumin (Pierce, ThermoFisher Scientific). Samples were then diluted to a final concentration of 1 μg/μL. The diluted sample lysates were reduced in 20 mM dithiothreitol and alkylated in 40 mM iodoacetamide. ReSyn Hydroxyl beads were added to the reduced and alkylated samples at a ratio of 4 μL per 25 μg of protein. Protein aggregation onto the hydroxyl beads was induced by adding acetonitrile to a final concentration of 70%. The bead-bound proteins were then washed three times with 95% acetonitrile and twice with 70% ethanol.
After the final wash, the samples were briefly centrifuged to remove any residual ethanol, and trypsin in 50 mM ammonium bicarbonate was added at a ratio of 33:1 (protein to trypsin) for digestion at 47 °C for 3 hours. The resulting peptides were eluted off the beads, dried in a vacuum centrifuge (SpeedVac), and stored frozen at −80 °C until further use. Frozen peptide samples were resuspended to a final concentration of 500 ng/μL in 0.1% formic acid prior to mass spectrometry analysis.
Phosphoproteome sample preparation: HeLa S3 cells were cultured at 37 °C and 5% CO2 in Dulbecco’s modified Eagle’s medium (DMEM) supplemented with 4.5 g/L glucose, L-glutamine, 10% fetal bovine serum (FBS), and 0.5% streptomycin/penicillin. To generate bulk phosphopeptides for method comparisons, cells were grown to 80% confluency, incubated in serum-free medium for 6 h prior to treatment with or without 1 mM pervanadate for 15 min, followed by the addition of 10% FBS for 15 min. At the time of harvest, cells were left attached to plates, rinsed quickly three times with ice-cold PBS, then flash-frozen in liquid nitrogen prior to storage at −80 °C. Cells were harvested by scraping frozen cells from plates in 8 M urea, 50 mM HEPES, 75 mM NaCl, pH 8.0. Cells were sonicated with six 20 s pulses at 12 W, with equal rests on ice. The lysate was clarified by centrifugation at 7197 × g for 25 min at 20 °C. The protein concentration was estimated using the bicinchoninic acid method (Pierce, ThermoFisher Scientific). Proteins were reduced with 5 mM dithiothreitol (DTT) for 30 min at 55 °C, alkylated with 15 mM iodoacetamide for 15 min at room temperature in the dark, and then quenched with 5 mM DTT for 15 min at room temperature. Protein lysates were diluted five-fold in 50 mM ammonium bicarbonate and digested with trypsin at a final trypsin-to-protein ratio of 1:100 by mass. Proteins were digested at 37 °C for 15 hours with mixing. Digests were quenched with 0.5% TFA (pH < 2).
Quenched digests were centrifuged at 7000 × g for 5 min at room temperature to remove precipitates. Peptides were desalted on Waters SEP-PAK C18 cartridges. Briefly, columns were activated by the sequential addition of 1 column volume (CV) of methanol; 3 CV of 100% acetonitrile (ACN); 1 CV of 70% ACN, 0.25% acetic acid (AA); 1 CV of 40% ACN, 0.5% AA; and 3 CV of 0.1% TFA. Acidified digests were then loaded, and the flow-through was reloaded. The column was washed with 3 CV of 0.1% TFA and 1 CV of 0.5% AA. Peptides were eluted with 0.75 CV of 40% ACN, 0.5% AA followed by 0.5 CV of 70% ACN, 0.25% AA. Peptides were dried by vacuum centrifugation and stored at −20 °C until enrichment. Phosphotyrosine-containing peptides were initially depleted by phosphotyrosine-specific enrichment. Remaining peptides from the flow-throughs were desalted, dried, and stored at −20 °C for global phosphopeptide enrichment. Desalted peptides were resuspended in 80% ACN, 0.1% TFA at 900 μL per 250 μg peptides. Precipitates were removed by centrifugation at 21000 × g for 5 min at 4 °C, and peptides were added to the 96-well plate for R2-P2 as described previously, with the following modifications: peptide binding was performed in a deep-well plate in 900 μL, and phosphopeptides were eluted in 100 μL instead of 50 μL of 2.5% ammonium hydroxide, 50% ACN, followed by acidification with 60 μL of 10% formic acid, 75% ACN. The optional filtering step was performed, in which eluates were passed through two layers of C8 filter material in a 200 μL pipette tip. Peptides were dried by vacuum centrifugation, then resuspended in 4% formic acid, 3% acetonitrile for mass spectrometry measurement.
Metaproteomics sample preparation: The marine microbiome sample for metaproteomic analysis was collected on June 4, 2021 at 1:00 pm PDT in East Sound, WA.
The sample filter (0.22 μm, 47 mm polyethersulfone) containing the bacterial fraction of the water column (0.22 - 1.0 μm) was processed using mechanical lysis in 100 μL of 5% SDS solution, followed by three rinses of the filter with 100 μL of nanopure water each. The resulting 400 μL whole-cell solution was collected in microfuge tubes and sonicated (Branson 250 Sonifier; 20 kHz, 30 × 10 s on ice). Samples were then evaporated using a SpeedVac to a final concentration of 5% SDS in 100 μL. Enolase (0.16 μL of 100 ng/μL enolase per 1 μg protein) was added to the sample at the start of the S-trap protocol to ensure proper sample digestion. The sample (20 μg protein) was treated with benzonase (0.5 μL of 250 units/μL for 10 minutes at 95 °C), reduced with 20 mM dithiothreitol for 10 minutes at 60 °C followed by a 5-minute cool-down to room temperature, alkylated with 40 mM iodoacetamide for 30 minutes in the dark, acidified to pH < 2 (1.2% aqueous phosphoric acid), and then processed on an S-trap column according to the manufacturer’s recommendations. Proteins were digested with Promega modified trypsin (2 μg, for a 1:10 ratio; 4 hours at 37 °C). Purified peptides were evaporated to dryness and resuspended in 2% acetonitrile (ACN), 0.1% formic acid at a final concentration of 0.5 μg protein/μL.
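The amounts quoted for the metaproteomics preparation can be cross-checked by scaling the stated ratios to the 20 μg protein input. The following is a minimal sketch rather than part of the submitted protocol: the variable names are illustrative, and the resuspension volume assumes no peptide loss before resuspension, which the description does not state.

```python
# Scale the per-microgram ratios quoted in the metaproteomics protocol
# to the stated 20 ug protein input. All names here are illustrative.

protein_ug = 20.0  # protein input stated in the description

# Enolase spike: 0.16 uL of a 100 ng/uL stock per 1 ug of protein.
enolase_ul = protein_ug * 0.16
enolase_ng = enolase_ul * 100.0

# Trypsin at a 1:10 trypsin-to-protein mass ratio.
trypsin_ug = protein_ug / 10.0

# Volume needed to reach 0.5 ug protein/uL.
# Assumption: no material is lost before resuspension.
resuspend_ul = protein_ug / 0.5

print(f"enolase: {enolase_ul:.1f} uL ({enolase_ng:.0f} ng)")
print(f"trypsin: {trypsin_ug:.1f} ug")
print(f"resuspension volume: {resuspend_ul:.0f} uL")
```

The computed trypsin mass agrees with the 2 μg quoted in the description for a 1:10 ratio.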
Created on 10/14/24, 4:29 PM
Skyline document | Created | Proteins | Peptides | Precursors | Transitions | Replicates
Exploris480_16mz_unstaggered_phosphoDIA_carafe_2024-10-11_16-28-12.sky.zip | 2024-10-14 16:04:39 | 2,134 | 8,688 | 9,783 | 145,030 | 1
Lumos_8mz_staggered_yeast_EncyclopeDIA_carafe_2024-10-11_15-06-52.sky.zip | 2024-10-14 16:04:39 | 4,358 | 44,140 | 44,140 | 353,069 | 3
Astral_2mz_yeast_EncyclopeDIA_carafe_2024-10-11_14-35-53.sky.zip | 2024-10-14 16:04:39 | 4,968 | 79,900 | 79,900 | 639,155 | 4
Exploris480_8mz_staggered_metaproteome_carafe_2024-10-11_00-29-21.sky.zip | 2024-10-14 16:04:39 | 6,058 | 11,278 | 12,034 | 147,726 | 3
Lumos_8mz_staggered_reCID_yeast_carafe_2024-10-10_23-53-17.sky.zip | 2024-10-14 16:04:39 | 3,922 | 36,258 | 42,354 | 510,975 | 1
Lumos_8mz_staggered_reCID_human_2024-10-10_23-35-17.sky.zip | 2024-10-14 16:04:39 | 4,988 | 31,450 | 35,438 | 525,741 | 1
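The Skyline document names listed above appear to follow a common pattern: instrument, isolation window width in m/z, a free-form description, and a date-time stamp. Assuming that pattern holds (it is inferred from the names, not documented here), the fields can be pulled out with a regular expression. The function name `parse_document_name` and the field names are hypothetical, and what the timestamp represents is not stated in this record.

```python
import re
from datetime import datetime

# Assumed pattern, inferred from the listed names:
#   <instrument>_<window>mz_<description>_<YYYY-MM-DD>_<HH-MM-SS>.sky.zip
NAME_RE = re.compile(
    r"^(?P<instrument>[^_]+)_(?P<window_mz>\d+)mz_(?P<description>.+)_"
    r"(?P<date>\d{4}-\d{2}-\d{2})_(?P<time>\d{2}-\d{2}-\d{2})\.sky\.zip$"
)

def parse_document_name(name):
    """Return the fields encoded in a document name, or None if it does not match."""
    m = NAME_RE.match(name)
    if m is None:
        return None
    stamp = datetime.strptime(f"{m['date']} {m['time']}", "%Y-%m-%d %H-%M-%S")
    return {
        "instrument": m["instrument"],
        "window_mz": int(m["window_mz"]),
        "description": m["description"],
        "timestamp": stamp,
    }

names = [
    "Astral_2mz_yeast_EncyclopeDIA_carafe_2024-10-11_14-35-53.sky.zip",
    "Lumos_8mz_staggered_reCID_human_2024-10-10_23-35-17.sky.zip",
]
parsed = [parse_document_name(n) for n in names]
for fields in parsed:
    print(fields["instrument"], fields["window_mz"], fields["description"])
```

Returning None for non-matching names keeps the sketch safe to run over files that do not follow the assumed convention.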