JBEI - Beller DBTL-ML manuscript

File    Created    Proteins    Peptides    Precursors    Transitions    Replicates
20190110 - Ajinomoto Complete_2019-01-10_13-31-48.sky.zip (23 MB)    2019-01-11    63434161238
Lessons from Two Design-Build-Test-Learn Cycles of Dodecanol Production in Escherichia coli Aided by Machine Learning

  • Organism: Escherichia coli
  • Instrument: 6460 Triple Quadrupole LC/MS
  • SpikeIn: No
  • Keywords: Synthetic biology, Metabolic engineering, Proteomics, Metabolomics, Machine learning
  • Lab head: Chris Petzold
Abstract
The Design–Build–Test–Learn (DBTL) cycle, facilitated by exponentially improving capabilities in synthetic biology, is an increasingly adopted metabolic engineering framework that represents a more systematic and efficient approach to strain development than historical efforts in biofuels and bio-based products. Here, we report on implementation of two DBTL cycles to optimize 1-dodecanol production from glucose using 60 engineered E. coli MG1655 strains. The first DBTL cycle employed a simple strategy to learn efficiently from a relatively small number of strains (36), wherein only the choice of ribosome-binding sites and an acyl-ACP/acyl-CoA reductase were modulated in a single pathway operon including genes encoding a thioesterase (UcFatB1), an acyl-ACP/acyl-CoA reductase (Maqu_2507, Maqu_2220, or Acr1), and an acyl-CoA synthetase (FadD). Measured variables included concentrations of dodecanol and all proteins in the engineered pathway. We used the data produced in the first DBTL cycle to train several machine-learning algorithms and to suggest protein profiles for the second DBTL cycle that would increase production. These strategies resulted in a 21% increase in dodecanol titer in Cycle 2 (up to 0.83 g/L, which is more than 6-fold greater than previously reported batch values for minimal medium). Beyond specific lessons learned about optimizing dodecanol titer in E. coli, this study had findings of broader relevance across synthetic biology applications, such as the importance of sequencing checks on plasmids in production strains as well as in cloning strains, and the critical need for more accurate protein expression predictive tools.
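The headline numbers in the abstract can be cross-checked with a few lines of arithmetic. The sketch below assumes that the 21% increase is measured against the best Cycle 1 titer, which the abstract does not state explicitly; the variable names are illustrative only.

```python
# Figures quoted in the abstract.
cycle2_best_titer = 0.83   # g/L, best dodecanol titer in Cycle 2
relative_increase = 0.21   # 21% increase reported for Cycle 2
total_strains = 60
cycle1_strains = 36

# Assumption: the 21% gain is relative to the best Cycle 1 titer.
implied_cycle1_best = cycle2_best_titer / (1 + relative_increase)

# "More than 6-fold greater" bounds the previously reported batch titer from above.
prior_titer_upper_bound = cycle2_best_titer / 6

# Strains left for Cycle 2 once the 36 Cycle 1 strains are accounted for.
cycle2_strains = total_strains - cycle1_strains

print(f"Implied Cycle 1 best titer: {implied_cycle1_best:.3f} g/L")
print(f"Previously reported titer is below {prior_titer_upper_bound:.3f} g/L")
print(f"Cycle 2 strains: {cycle2_strains}")
```

Under that assumption, the best Cycle 1 titer would be roughly 0.69 g/L, and the strain count left for Cycle 2 matches the 24 strains mentioned in the sample description.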
Experiment Description
In this study, we aimed to leverage the DBTL cycle to assess various enzyme combinations and expression strengths more systematically and thereby optimize E. coli for dodecanol production. We report on implementation of two DBTL cycles to optimize dodecanol production from glucose using 60 engineered E. coli MG1655 strains. The first DBTL cycle employed a simple strategy to learn efficiently from a relatively small number of strains (36), wherein only the choice of ribosome-binding sites (RBSs) and an acyl-ACP/acyl-CoA reductase were modulated in a single pathway operon including genes encoding a thioesterase (UcFatB1), an acyl-ACP/acyl-CoA reductase, and an acyl-CoA synthetase (FadD). Measured variables included the concentrations of dodecanol and of all proteins in the engineered pathway, which allowed us to assess the accuracy of the RBS strength calculation and the relationship between dodecanol titer and the ensemble composition of pathway proteins. We used the data produced in the first DBTL cycle to train several machine-learning algorithms and to suggest protein profiles for the second DBTL cycle that should increase production.
Sample Description
The design strategy for the 36 Cycle 1 strains was combinatorial: it varied the choice among three acyl-CoA/acyl-ACP reductases (Maqu_2507, Maqu_2220, or Acr1) and the RBS strengths for the pathway proteins, which were determined with RBS calculation software [19-21] (Figure 1B). The aim of the design was to keep the number of variables small while exerting sufficient control over the key enzymes catalyzing the conversion of acyl-ACPs to dodecanol to effectively inform the machine-learning algorithms. The models trained on Cycle 1 dodecanol and proteomic data suggested different optimization strategies for strains using the Maqu_2507 and Maqu_2220 reductases. These design strategies attempted to address the key protein expression targets specified by the models while accounting for resource constraints that allowed only 24 strains in total in Cycle 2.
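A combinatorial design of this kind can be sketched as an enumeration over reductase choices and RBS strengths. The sketch below is only an illustration: the reductase names come from the description above, but the three RBS strength labels, the operon slot names, and the random down-selection to 36 strains are assumptions, since the actual RBS values and the strain selection scheme are not given here.

```python
import random
from itertools import product

# Reductase options named in the sample description.
REDUCTASES = ["Maqu_2507", "Maqu_2220", "Acr1"]

# Hypothetical placeholders: the real designs used calculated RBS strengths.
RBS_LEVELS = ["low", "medium", "high"]

# Hypothetical slot names for the three genes in the pathway operon.
OPERON_SLOTS = ["thioesterase", "reductase", "acyl_coa_synthetase"]


def enumerate_designs(reductases, slots, rbs_levels):
    """Return every combination of reductase choice and per-gene RBS level."""
    designs = []
    for reductase in reductases:
        for strengths in product(rbs_levels, repeat=len(slots)):
            designs.append({
                "reductase": reductase,
                "rbs": dict(zip(slots, strengths)),
            })
    return designs


def pick_subset(designs, n_strains, seed=0):
    """Down-select to a buildable number of strains (assumed: random sampling)."""
    rng = random.Random(seed)
    return rng.sample(designs, n_strains)


all_designs = enumerate_designs(REDUCTASES, OPERON_SLOTS, RBS_LEVELS)
cycle1_like = pick_subset(all_designs, 36)
print(len(all_designs), len(cycle1_like))  # prints: 81 36
```

With these assumed levels the full factorial space (81 designs) already exceeds what was built, which illustrates why a resource limit such as 36 or 24 strains forces a choice of which combinations to construct.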
Created on 1/11/19, 3:39 PM