Barcoding Manuscript

Massively parallel assessment of designed protein solution properties using mass spectrometry and peptide barcoding



David Feldman*1,2, Jeremiah N. Sims*1,3,4, Xinting Li1,2, Richard Johnson5, Stacey Gerben1,2, David E. Kim1,2, Christian Richardson1,6, Brian Koepnick1,2, Helen Eisenach1,2, Derrick R. Hicks1,2, Erin C. Yang1,2, Basile I. M. Wicky1,2, Lukas F. Milles1,2, Asim K. Bera1,2, Alex Kang1,2, Evans Brackenbrough1,2, Emily Joyce1,2, Banumathi Sankaran7, Joshua M. Lubner1,2, Inna Goreshnik1,2, Dionne Vafeados1,2, Aza Allen1,2, Lance Stewart1,2, Michael J. MacCoss5, David Baker1,2,8


  1. Institute for Protein Design, University of Washington, Seattle, WA 98105, USA

  2. Department of Biochemistry, University of Washington, Seattle, WA 98105, USA

  3. Department of Molecular & Cellular Biology, University of Washington, Seattle, WA 98105, USA

  4. Medical Scientist Training Program, University of Washington, Seattle, WA 98105, USA

  5. Department of Genome Sciences, University of Washington, Seattle, WA 98105, USA

  6. Department of Bioengineering, University of Washington, Seattle, WA 98105, USA

  7. Molecular Biophysics and Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA

  8. Howard Hughes Medical Institute, University of Washington, Seattle, WA 98105, USA


* These authors contributed equally to this work


Abstract 

Library screening and selection methods can determine the binding activities of individual members of large protein libraries given a physical link between protein and nucleotide sequence, which enables identification of functional molecules by DNA sequencing. However, the solution properties of individual protein molecules cannot be probed using such approaches because these properties are completely altered by DNA attachment. Mass spectrometry enables parallel evaluation of protein properties amenable to physical fractionation, such as solubility and oligomeric state, but current approaches are limited to libraries of 1,000 or fewer proteins. Here, we improve mass spectrometry barcoding by co-synthesizing proteins with barcodes optimized to be highly multiplexable and minimally perturbative, scaling to libraries of >5,000 proteins. We use these barcodes together with mass spectrometry to assay the solution behavior of libraries of de novo-designed monomeric scaffolds, oligomers, binding proteins and nanocages, rapidly identifying design failure modes and successes.