New approach to accelerate bioengineering

Scientists use a machine learning technique to automatically predict the amount of biofuel produced by microbes.

A new approach developed by Zak Costello (left) and Hector Garcia Martin brings the speed and analytic power of machine learning to bioengineering. (Credit: Marilyn Chung, Berkeley Lab)

Scientists from the Department of Energy’s Lawrence Berkeley National Laboratory have devised a new machine learning approach to accelerate the design of microbes that produce biofuel. The approach is much faster than the current way of predicting how pathways behave, and it promises to speed up the development of biomolecules for many applications.

The approach relies on an algorithm that is fed abundant data about the proteins and metabolites in a biofuel-producing microbial pathway. The algorithm then uses data from previous experiments to learn how the pathway will behave.

Scientists used this technique to automatically predict the amount of biofuel produced by pathways that have been added to E. coli bacterial cells.

In biology, a pathway is a sequence of chemical reactions in a cell that produces a particular compound. Scientists are exploring ways to re-engineer pathways and import them from one microbe into another, harnessing nature’s toolkit to improve medicine, energy, manufacturing, and agriculture.

“But there’s a significant bottleneck in the development process,” said Hector Garcia Martin, group lead at the DOE Agile BioFoundry and director of Quantitative Metabolic Modeling at the Joint BioEnergy Institute (JBEI), a DOE Bioenergy Research Center funded by DOE’s Office of Science and led by Berkeley Lab.

“It’s very difficult to predict how a pathway will behave when it’s re-engineered. Troubleshooting takes up 99% of our time. Our approach could significantly shorten this step and become a new way to guide bioengineering efforts.”

Predicting a pathway’s dynamics currently requires a maze of differential equations that describe how the components in the system change over time. Subject-area experts develop these “kinetic models” over several months.
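To give a sense of what such a kinetic model looks like, here is a minimal sketch in Python. It is not one of the JBEI models: it assumes a toy two-step pathway (a substrate S converted to an intermediate M and then to a product P), standard Michaelis-Menten rate laws, and made-up constants, and it integrates the equations with a simple forward Euler step. The function names are hypothetical.

```python
def michaelis_menten(vmax, km, conc):
    """Rate of an enzyme-catalyzed step at substrate concentration `conc`."""
    return vmax * conc / (km + conc)


def simulate_pathway(s0, steps=2000, dt=0.01):
    """Integrate a toy pathway S -> M -> P with forward Euler.

    All kinetic constants are illustrative assumptions, not measured values.
    """
    s, m, p = s0, 0.0, 0.0
    trajectory = [(0.0, s, m, p)]
    for i in range(1, steps + 1):
        v1 = michaelis_menten(vmax=1.0, km=0.5, conc=s)  # S -> M
        v2 = michaelis_menten(vmax=0.6, km=0.3, conc=m)  # M -> P
        s += -v1 * dt          # dS/dt = -v1
        m += (v1 - v2) * dt    # dM/dt = v1 - v2
        p += v2 * dt           # dP/dt = v2
        trajectory.append((i * dt, s, m, p))
    return trajectory


trajectory = simulate_pathway(s0=2.0)
t_end, s_end, m_end, p_end = trajectory[-1]
print(f"t={t_end:.1f}  S={s_end:.3f}  M={m_end:.3f}  P={p_end:.3f}")
```

Even this toy version needs a rate law and two constants for every reaction. A real pathway has many more steps, and each constant has to be measured or estimated, which is part of why building these models is slow.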

Machine learning, however, uses data to train a computer algorithm to make predictions. The algorithm learns a system’s behavior by analyzing data from related systems. This allows scientists to quickly predict the function of a pathway even if its mechanisms are poorly understood, as long as there are enough data to work with.

The scientists tested their technique on pathways added to E. coli. One pathway is designed to produce a bio-based jet fuel called limonene; the other produces a gasoline replacement called isopentenol. Previous experiments at JBEI yielded a trove of data on how different versions of the pathways function in various E. coli strains.

Some of the strains have a pathway that produces small amounts of either limonene or isopentenol, while other strains have a version that produces large amounts of the biofuels.

The researchers fed this data into their algorithm. Then machine learning took over: The algorithm taught itself how the concentrations of metabolites in these pathways change over time, and how much biofuel the pathways produce.

The algorithm used this knowledge to predict the behavior of a third set of pathways that it had never seen. It accurately predicted the biofuel-production profiles for these “mystery” pathways, including that they produce a medium amount of fuel. In addition, the machine learning-derived predictions outperformed kinetic models.
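The train-then-predict workflow described above can be illustrated with a small Python sketch. It makes strong simplifying assumptions and is not the researchers' algorithm: the "experiments" are simulated from a made-up rate law, each strain is summarized by a single hypothetical enzyme level, and the learned model is a plain least-squares fit of the rate of change of the product rather than whatever model the study actually used. All names are hypothetical.

```python
def simulate(enzyme, k=0.8, d=0.3, steps=50, dt=0.1):
    """Toy 'experiment': product p follows dp/dt = k*enzyme - d*p (made-up law)."""
    p, series = 0.0, [0.0]
    for _ in range(steps):
        p += (k * enzyme - d * p) * dt
        series.append(p)
    return series


def fit_derivative_model(runs, dt=0.1):
    """Least-squares fit of dp/dt ~ a*enzyme + b*p from time series.

    `runs` maps an enzyme level to that strain's product time series.
    """
    sxx = sxy = syy = sxz = syz = 0.0
    for enzyme, series in runs.items():
        for p_now, p_next in zip(series, series[1:]):
            dpdt = (p_next - p_now) / dt  # finite-difference derivative
            sxx += enzyme * enzyme
            sxy += enzyme * p_now
            syy += p_now * p_now
            sxz += enzyme * dpdt
            syz += p_now * dpdt
    # Solve the 2x2 normal equations for the coefficients a and b.
    det = sxx * syy - sxy * sxy
    a = (sxz * syy - syz * sxy) / det
    b = (syz * sxx - sxz * sxy) / det
    return a, b


def predict(enzyme, a, b, steps=50, dt=0.1):
    """Integrate the learned model forward for a strain not seen in training."""
    p, series = 0.0, [0.0]
    for _ in range(steps):
        p += (a * enzyme + b * p) * dt
        series.append(p)
    return series


# Train on a low-producing and a high-producing strain.
training_runs = {0.5: simulate(0.5), 2.0: simulate(2.0)}
a, b = fit_derivative_model(training_runs)

# Predict a medium-producing strain the model has never seen.
predicted = predict(1.0, a, b)
actual = simulate(1.0)
max_error = max(abs(x - y) for x, y in zip(predicted, actual))
print(f"learned a={a:.3f}, b={b:.3f}, max prediction error={max_error:.2e}")
```

The point of the sketch is the division of labor: the code never states the underlying rate law, it only recovers a usable approximation of it from the time series. With real, noisy measurements of many proteins and metabolites, a more flexible model and far more data would be needed.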

“And the more data we added, the more accurate the predictions became,” Garcia Martin said. “This approach could shorten the time it takes to design new biomolecules. A project that today takes ten years and a team of experts could someday be handled by a summer student.”

The research was published May 29 in the journal Nature Systems Biology and Applications.