
Abstract

Biological systems are inherently noisy: two genetically identical cells in the same environment can behave in dramatically different ways. This poses a major challenge for traditional supervised machine learning models, which predict only a single deterministic phenotypic value or category for each input condition. Furthermore, biological noise has been shown to play a crucial role in gene regulation mechanisms, so predicting the average value of a phenotype is not always sufficient to fully characterize a biological system. In this study, we develop a deep learning algorithm that predicts the conditional probability distribution of a phenotype of interest from a small number of observations per input condition. We show that the deep neural network automatically generates the probability distributions from 10 or fewer noisy measurements per input condition, with no prior knowledge of, or assumptions about, the form of the distributions.
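As a rough illustration of the general idea, and not the architecture used in this study, one common distribution-free approach is to discretize the phenotype into bins and train a network with a softmax output and cross-entropy loss on the individual noisy observations, so that the output for each input condition is a histogram rather than a single mean. The sketch below uses only NumPy; the toy bimodal data, the bin count, the network size, and all variable names are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: 20 input conditions, 10 noisy observations each.
# The phenotype becomes bimodal as x grows, so the mean alone is misleading.
n_cond, n_obs, n_bins = 20, 10, 12
x = np.linspace(0.0, 1.0, n_cond)
X = np.repeat(x, n_obs)[:, None]                 # shape (200, 1)
high = rng.random(X.shape[0]) < X[:, 0]          # P(high mode) grows with x
y = np.where(high, 3.0, 1.0) + 0.3 * rng.standard_normal(X.shape[0])

# Discretize the phenotype; the network outputs one probability per bin.
edges = np.linspace(0.0, 4.0, n_bins + 1)
labels = np.clip(np.digitize(y, edges) - 1, 0, n_bins - 1)
Y = np.eye(n_bins)[labels]                       # one-hot targets

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)         # subtract max for stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# One tanh hidden layer (assumed size), softmax output over the bins.
hidden = 16
W1 = 0.5 * rng.standard_normal((1, hidden))
b1 = np.zeros(hidden)
W2 = 0.5 * rng.standard_normal((hidden, n_bins))
b2 = np.zeros(n_bins)

def forward(inputs):
    h = np.tanh(inputs @ W1 + b1)
    return h, softmax(h @ W2 + b2)

def cross_entropy(P, targets):
    return -np.mean(np.sum(targets * np.log(P + 1e-12), axis=1))

_, P0 = forward(X)
loss_before = cross_entropy(P0, Y)

# Full-batch gradient descent on the cross-entropy loss.
lr = 0.1
for _ in range(5000):
    h, P = forward(X)
    dlogits = (P - Y) / X.shape[0]               # gradient w.r.t. the logits
    dW2 = h.T @ dlogits
    db2 = dlogits.sum(axis=0)
    dh = (dlogits @ W2.T) * (1.0 - h ** 2)       # backprop through tanh
    dW1 = X.T @ dh
    db1 = dh.sum(axis=0)
    W1 -= lr * dW1
    b1 -= lr * db1
    W2 -= lr * dW2
    b2 -= lr * db2

_, P1 = forward(X)
loss_after = cross_entropy(P1, Y)

# Predicted histogram over the bins for one input condition.
_, p_new = forward(np.array([[0.5]]))
print(loss_before, loss_after)
print(np.round(p_new[0], 3))
```

Because each observation contributes its own one-hot target, the network can pool information across neighbouring input conditions, which is one reason such a model may work with only a handful of measurements per condition. The bin edges fix the resolution of the predicted distribution and would need to be chosen for the phenotype at hand.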



Inferring Conditional Probability Distributions of Noisy Gene Expression from Limited Observations by Deep Learning
