Abstract
In order to investigate the possible mechanisms for eve stripe formation of Drosophila embryo, a spatio-temporal gene/protein interaction network model is proposed to mimic dynamic behaviors of protein synthesis, protein decay, mRNA decay, protein diffusion, transcription regulations and autoregulation to analyze the interplay of genes and proteins at different compartments in early embryogenesis. In this study, we use the maximum likelihood (ML) method to identify the stochastic 3-D Embryo Space-Time (3-DEST) dynamic model for gene/protein interaction network via 3-D mRNA and protein expression data and then use the Akaike Information Criterion (AIC) to prune the gene/protein interaction network. The identified gene/protein interaction network allows us not only to analyze the dynamic interplay of genes and proteins on the border of eve stripes but also to infer that eve stripes are established and maintained by network motifs built by the cooperation between transcription regulations and diffusion mechanisms in early embryogenesis. Literature reference with the wet experiments of gene mutations provides a clue for validating the identified network. The proposed spatio-temporal dynamic model can be extended to gene/protein network construction of different biological phenotypes, which depend on compartments, e.g. postnatal stem/progenitor cell differentiation.
Keywords
Introduction
An early embryonic stage in Drosophila embryogenesis, i.e. the syncytial blastoderm stage, is completed two hours after the onset of fertilization and periodic segments are then characterized. Before the determination of periodic segments, the embryo is not yet separated by membranes, and macromolecules such as transcription factors (TFs) can diffuse freely and regulate downstream target genes in neighboring nucleus. Hence, at the syncytial blastoderm stage diffusion mechanism is fast enough to vary the concentrations of TFs in transcription regulations. Through a series of high/low affinity bindings of transcription regulations, downstream genes are dictated to express in their corresponding space of an embryo. Therefore, we assume that the transcription regulation and diffusion mechanism may play a cooperative role in characterizing embryonic segments.
Although some topics about protein diffusion have been well studied,1,2 gradient dynamics of concentrations of transcription factors is still hard to be analyzed without quantitative inference under dynamic modeling. For example, critical boundaries settled by protein concentration gradient in dynamic models of early embryogenesis have allowed investigators to re-examine quantitatively concentration gradient dynamics. 3 Jaeger and his colleagues have used mRNA spatial-temporal data and dynamic model to characterize the establishment of gap domains. 4 Therefore, in order to analyze the diffusion mechanisms of transcription factors at different domains of Drosophila embryo, a spatio-temporal model is needed. In recent studies, early embryogenesis in Drosophila includes at least 31 genes in subdividing the embryonic patterns into 14 segmental primordia along the anterior-posterior (A-P) axis. 5 In the past several decades, the spatio-temporal expressions of the early development-related genes (bicoid (bcd), caudal (cad), hunchback (hb), giant (gt), knirps (kni), Krüppel (Kr), tailless (tll), even-skipped (eve), fushitarazu (ftz) hairy, odd-skipped (odd), paired (prd), runt and sloppy-paired (sip)) have been provided and studied during the early developmental stages of Drosophila melanogaster. The 14 early development-related genes can be roughly divided into three classes, i.e. maternal genes, gap genes and pair-rule genes, which have been regarded as hierarchical transcription regulations with positive auto-regulations to generate and refine the constitutions of segments.6,7 At the beginning of early embryogenesis, gap genes are regulated by high-level expressions of maternal TFs to initiate an early embryo development. Gene expression boundaries are determined by thresholds of protein concentration, while gene expression borders are refined by autoregulation and repression.8,9
Three classified genes (i.e. maternal genes, gap genes and pair-rule genes) into which the 14 early development-related genes can be divided are described in detail in the following. The maternal genes, i.e. bcd, cad and hb, diffuse and regulate gap genes with different expression levels in each spatial region along the A-P axis of the Drosophila embryo. The gap genes, i.e. gt, hb, kni, Kr and tll, define roughly the differences between two neighboring stripes by protein diffusion. The pair-rule genes, i.e. eve, ftz, hairy, odd, prd, runt and slp, define periodic patterns of the embryo by transcription regulation and protein diffusion. Two of these pair-rule genes, i.e. eve and ftz, are involved in defining even and odd segments of the 14 segmental primordia along the A-P axis.10,11 The odd and even segments of concern are the seven eve stripes and seven ftz stripes, respectively. Moreover, at the blastoderm stage, along the D-V axis, three main regions, i.e. non-neural ectoderm (prospective epidermis), neurectoderm (prospective nervous system and larval ventral epidermis) and mesoderm (prospective muscle and connective tissue) are also divided. 12 The genes, which determine the three primary regions along the D-V axis, are different from these 14 early development-related genes which determine periodic segments along the A-P axis. In this study, for the convenience of analysis and system identification we will define spatial regions in the two-dimensional (2-D) space of the embryo along the A-P and D-V axes according to the above information. However, we only analyze the A-P formation of embryo after system modeling of transcriptional regulatory network, and the D-V formation can be analyzed by a similar procedure.
At the early developmental stages of Drosophila, the three-dimensional (3-D) spatio-temporal expression data of 14 early development-related proteins (http://flyex.ams.sunysb.edu/flyex/), 13–16 genome-wide mRNA time-course expression data 17 and mRNA 3-D spatio-temporal expression data (http://flyex.ams.sunysb.edu/lab/gaps.html)4,6 have been published and can be used for a system dynamic modeling of early Drosophila development. Interestingly, by comparing the normalized protein spatiotemporal expression data with mRNA spatio-temporal expression data, the trends of gene expressions along the A-P axis are found. 3 In this study, we incorporate the mRNA 3-D data with protein 3-D data to construct the gene/protein interaction network for the transcription regulations and diffusion mechanisms of early embryogenesis via our stochastic 3-D dynamic model. However, there are some expression data within the 14 early development-related genes are unavailable in mRNA 3-D spatio-temporal expression database. In recent studies, Neural-Network (NN) model, which could be trained to optimize its internal network to learn the behaviors of complex systems, has been used to not only infer gene network regulatory relationships based on genome-wide microarray data 18 but also build the relationship between input and output information by using a back-propagation algorithm to learn from the training data.19–22 Therefore, for the unavailable mRNA data, we will use the back-propagation NN training method to obtain the mimic mRNA data according to the available protein and mRNA 3-D data.
In recent years, since the development of experimental techniques has increased the quality and amount of available mRNA and protein expressions, many approaches, e.g. fuzzy logic,23, 24 recurrent neural networks,25–27 Bayesian networks,28,29 Boolean networks30,31 and differential equations,32–34 have been widely exploited to unravel regulation networks from the perspective of systems biological. For the well available protein spatio-temporal data in early Drosophila development, nonlinear 2-D dynamic models have been employed to analyze the transcription regulation properties and effect of gap genes on eve stripe formation.3,6,16,35–38 However, more efforts are needed to incorporate these pathways and gene networks with a spatio-temporal gene/protein interaction network to interpret the dynamic system behavior in early Drosophila development since not only protein but also mRNA 3-D spatio-temporal data are both available for dynamic interplay of genes and proteins at different compartments of Drosophila embryo in early embryogenesis. The mechanisms of early Drosophila development in the whole embryo can be unraveled clearly if the dynamic interactions of genes and proteins are considered at different compartments in early embryogenesis. Therefore, in this study, we propose a stochastic 3-D dynamic model for constructing the gene/protein interaction network of early Drosophila development.
In this study, we focus on the topic of investigating the possible mechanisms for the eve stripe formation of Drosophila embryo. In this biological development approach, it is assumed that transcription regulations consist of cis-effect and trans-effect. Since edges, i.e. transcription regulations, in a gene regulatory network must be constantly selected in order to survive randomization forces, trans-effects, which are the binding affinities of specific transcription factors to cis-regulatory regions in the promoter of the target gene, would be varied rapidly while cis-effects, which are regulated directly by physical attachment of TF's binding cis-regulatory regions, are relatively fixed. 39 Thus, we assume that regulation abilities, i.e. trans-effects, should vary with different spatial regions of the embryo, which results from different binding affinities of diffusible TFs. Based on the constructed stochastic 3-D Embryo Space-Time model (stochastic 3-DEST model), we analyze the transcription regulations and diffusion mechanisms for gene/protein interaction network. The stochastic 3-DEST model with 28 state variables is employed to represent the transcription/translation regulation process between 14 mRNA genes and the corresponding TFs in early embryogenesis. Moreover, because we consider both the environmental noises and the intrinsic noises in mRNA and protein data, stochastic partial differential equations (PDEs) are employed for the transcriptional and translational regulatory model of early embryogenesis. In order to understand the roles of TFs in each spatial region, according to the signs of diffusion parameters of the stochastic 3-DEST model, a TF can be considered as a donor (>0) or an acceptor (<0) in each spatial region to balance instant concentrations of the whole embryo. Hence, the TF in a spatial region that diffuses to (from) the neighboring spatial regions, is called a donor (acceptor). In addition, from previous studies we know that transcription regulations can be inferred by a dynamic model via microarray data.33,36,40 However, how to sieve out the insignificant transcription regulations from the whole gene/protein interaction network is still a problem. For this reason, according to the stochastic 3-DEST model, the Akaike Information Criterion (AIC) 41 for model order detection combined with the maximum likelihood (ML) for parameter estimation in system identification is used in this study to detect significant upstream regulators and to prune insignificant transcription regulations for refining the gene/protein interaction network of early Drosophila development. From the identified stochastic 3-DEST model, we can not only find the significant transcription regulations of the corresponding TFs, which control the anterior/posterior border formation of eve stripes, but also validate these results with wet experiments. In order to validate the identified effect of transcription regulation and diffusion on early Drosophila development, the wet experiments, i.e. gene mutations,7,9,10,42–46 regulatory module classification 47 and cis-regulatory module detection, 48 have been employed to trace back the direct or indirect transcription regulations and protein diffusions in early Drosophila development. From the perspective of the network motifs of the identified gene/protein interaction network in the embryo, we find that transcription regulations and protein diffusion mechanisms may play a cooperative role in the formation of eve stripes in early Drosophila development.
Methods
System modeling and identification for gene/protein interaction network
To identify the dynamic behavior of the early development-related genes, the procedure of system identification in early embryogenesis is divided into four steps. First, utilizing fully the well-published spatio-temporal data and the prior knowledge of early embryogenesis, we construct a stochastic 3-DEST model to identify the molecular dynamics of gene/protein interaction network in early embryogenesis. Second, for system modeling, we use Eve's spatial expression at the cleavage cycle 14A temporal class 8 (c14A8) of the nuclear cleavage to settle stripe boundaries and region boundaries of each stripe for dividing the embryo into seven eve stripes along the A-P axis and into three spatial regions (i.e. anterior part, middle part and posterior part) along the D-V axis, respectively. Third, for the early development-related genes, since a part of the mRNA spatio-temporal data are unavailable, we incorporate the available mRNA and protein spatio-temporal expression data with the back-propagating NN training method to train and simulate the mimic data for the unavailable mRNA spatio-temporal expression data (see Appendix I). Fourth, we identify the model parameters and select the significant regulatory parameters for the stochastic 3-DEST model to construct the transcriptional regulatory network in every spatial region by the ML estimation method and the AIC backward elimination method, respectively. Finally, the transcriptional regulatory networks in every spatial region are connected together to construct the entire spatiotemporal gene/protein interaction network for early Drosophila development.
Remark
If the information of cooperation bindings is richer in future, the transcriptional regulations due to cooperation binding can be easily extended to the regulation candidates of the 3-DEST model, which can improve the proposed model of gene/protein network but with increased computation burden when using the AIC method in early embryogenesis.
Stochastic PDEs model in eve stripe formation
In previous studies, dynamic models with protein synthesis, protein diffusion and protein decay have been utilized in the description of the mechanism of embryonic development.3,4,6,35–38 To analyze the dynamic interplay of genes and proteins in early embryogenesis, six stochastic molecular dynamics are incorporated in the 3-DEST model, i.e. (1) protein synthesis, (2) protein decay, (3) mRNA decay, (4) protein diffusion, (5) transcription regulations, and (6) autoregulation. In addition, in order to differentiate mRNA expressions from protein expressions, we define two state variables Xi and Yi to represent the 3-D spatio-temporal mRNA profiles of the ith target gene and its corresponding TFs, respectively. According to the transcription regulation model proposed in previous studies,6,33,36,40 the stochastic 3-DEST model for the ith target gene and their upstream regulatory TFs in the gene/protein interaction network of Drosophila development is proposed as follows:
where Xi(t, x, y) represents the mRNA expression of the ith target gene, Y(t, x, y) denotes the expression of the jth TF of the target gene, and f(Yj(t, x, y)) defined as f(Y) = Yn/(Pn + Yn) is a sigmoid function to denote the regulatory bindings of TFs on the promoters of targets.39,49,50 Here, P is defined as the means of protein expressions, which imply cis-effects of transcription regulations. The term
Remark
The dynamic model in Eq. (1) is to interpret the transcription/translation regulation processes of 14 genes in early embryogenesis. The first Equation of Eq. (1) describes the transcription regulation of the ith gene; and the mRNA productive rate is mainly due to the transcription regulations of 14 proteins (i.e. TFs), the influence of basal level and degradation of mRNA. The noise vi(t, x, y) denotes the fluctuation of basal level, measurement noise and modeling residue. Since the expression levels of TFs can be altered with different spatial regions of the whole embryo by diffusion mechanism, the relationship of transcription regulation between one TF and its target gene is also different in different spatial regions. The second equation of Eq. (1) describes protein production in the translational diffusion process at the location (x, y). The protein productive rate is mainly influenced by the translation of mRNA, diffusion from the neighboring space, and degradation rate of the protein. The noise ζj(t, x, y) is due to the fluctuation of the basal level of protein, measurement noise and modeling error. The model in Eq. (1) describes the interplay of gene/protein interactions at the location (x, y). The parameters of the stochastic spatio-temporal dynamic model in Eq. (1) can be estimated by the spatio-temporal profile of mRNA data and protein data in each spatial region. The regulatory gene/protein network can be linked gene by gene through the transcription regulations
In the proposed stochastic dynamic model in Eq. (1), the interplay of six stochastic processes, i.e. protein synthesis, protein decay, mRNA decay, protein diffusion, transcription regulations and autoregulation, mimics the dynamics in early embryogenesis. We are the first to combine mRNA dynamic equations with protein dynamic equations to mimic the dynamic interaction network of target genes and their regulatory proteins via 3-D mRNA and protein data at different compartments in early Drosophila development. Our main purpose is to infer the possible mechanisms of eve stripe formation by investigating the estimated parameters
with the stability constraints
Specification of 2-D spatial regions of Drosophila embryo
To identify the discrete 3-DEST model in Eq. (2), we have to define (l, m) as the center of the spatial regions of the embryo by specifying the boundaries of the spatial regions. Along the A-P axis, two boundaries of the ith eve stripe are denoted by Bi and Bi + 1, respectively. The boundaries of seven eve stripes along the A-P axis are denoted by {B1, B2, …, B8} (Fig. 1a). Each of the eve stripes along the A-P axis is separated into three parts, and the boundaries of the middle part of the eve stripe i are denoted as Bia and Bip (Fig. 1b). Therefore, there are totally 21 spatial regions (i.e. m = 1, 2, …, 21 in Eq. (2)) within seven eve stripes along the A-P axis, and the 22 boundaries of the 21 spatial regions are specified as follows:

Determination of eve stripe boundaries at c14A8. A) Along the A-P axis and D-V axis, seven eve stripe boundaries {B1, B2, …, B8} and three spatial region boundaries {bh1, bh2, bh3, bh4} are defined, respectively. B) The yellow square frame as shown in (a) is enlarged for the second eve stripe. Nine spatial regions with symbol Rstripe, lk are defined in each stripe.
{B1, B1a, B1p, B2, B2a, B2p, …, B7 B7a, B7p, B8} = {25%, 29%, 32.5%, 35%, 38.26%, 40.44%, 42.17%, 47%, 49%, 50%, 54.5%, 55.5%, 57.5%, 62%, 64%, 67%, 69%, 72%, 75%, 79%, 81%, 85%}.
Additionally, three spatial regions (i.e. l = 1, 2, 3 in Eq. (2)) along the D-V axis are defined with their boundaries {bh1, bh2, bh3, bh4) = {8.54%, 33.67%, 61.40%, 82.25%} (Fig. 1b). For the convenience of illustration, we define a symbol, Rstripe, lk, to be a spatial region of the location (l, k) in the stripe-th eve stripe. The transformation from (l, m) in the whole embryo to (l, k) in the stripe-th eve stripe, i.e. Rstripe, lk, is given by m = k + 3*(stripe-1). For example, (l, k) = (3, 3) in the second eve stripe, i.e. the spatial region R2,33 corresponds to (l, m) = (3, 6) in the whole embryo, with l = 3 and m = 3 + 3 *(2–1) = 6 (Fig. 1b). After the determination of the spatial regions, expression levels of protein and mRNA are interpolated to the determined spatial regions, which will be used for model identification of Eq. (1).
System identification for stochastic 3-DEST gene/protein interaction networks in different spatial regions of Drosophila embryo
When the data points {Xi(k, l, m), Yj(k, l, m)} for i, j ∈ {1, 2, …, 14}, k ∈ {1,2, …, N}, l = {1, 2, 3}, m = {1,…, 21} are ready, the parameters of stochastic 3-DEST model can be estimated using Eq. (2) for gene/protein interaction networks in each spatial region of Drosophila embryo. For the convenience of parameter estimation, Eq. (2) with N data points can be translated into the following linear regression matrix form:
where
Suppose the noise components εi(k, l, m) and δj(k, l, m) are normally distributed, and the noise matrix El, m has an unknown covariance matrix Σ
l, m
to be estimated. Then we use the ML method to solve the parameter estimation problem with the optimum solution
The log-likelihood function for the given M data points in Yl, m, i.e. M = 2·4(N – 1), can be defined as 41
We can estimate the unknown parameters Θ
l, m
and the covariance matrices of noise Σ
l, m
by maximizing the log-likelihood function Ll, m (Θ
l, m
, Σ
l, m
), i.e.
In order to satisfy the following stability constraints
According to the Akaike Information Criterion (AIC) method, we will let bij, l, m = 0 while the transcription regulation between the jth transcription factor and the ith target gene in the spatial region Rstripe, lk is insignificant. We use the AIC to prune some insignificant regulatory parameters of TFs in Eq. (7). The AIC is defined to include both the residual variance in parameter estimation and the model complexity into one statistics for model order detection as 41
where p is the number of reserved parameters in the backward elimination method of the AIC. Regulatory parameters are pruned one by one as p is decreased until the smallest AICl, m in the smaller p is larger than the AICl, m value of the previous pruning step. While the minimum AICl, m is achieved, the most adequate transcription regulations for each target gene could be obtained from the most adequate model order point of view. 53
Data and Materials
In this study, we incorporate two spatial-temporal data, protein data (http://flyex.ams.sunysb.edu/flyex/) 13–16 and mRNA data (http://flyex.ams.sunysb.edu/lab/gaps.html),4,6 into the stochastic 3-DEST model to investigate how the transcription regulations and diffusion mechanisms cooperatively pattern eve stripes in the early embryogenesis of Drosophila. The spatial regions are first defined as shown in Figure 1 by Eve at cleavage cycle 14A and temporal class 8 (c14A8)in the embryo. Subsequently, the NN model combined with the method is trained by the available protein and mRNA data to simulate and mimic the unavailable mRNA data. The training of the NN model by the available data is achieved by minimizing the training error and maximizing the output correlations. Additionally, to avoid overfitting in system identification, we must interpolate the data points to an adequate number. However, over-interpolated data will lose the low-frequency (or long-range) behavior of the development system. Moreover, using different numbers of interpolated data in system identification may also cause significant differences in parameter estimations, especially in the AIC method. Hence, the robustness of the stochastic 3-DEST model will also be tested by different numbers of interpolated data as an assessment to choose an adequate number of interpolation in the sequel, because the robustness principle has been employed to check if a model can work in the real cell and is employed to narrow down the range of models to the few in the modeling procedure of biological networks, i.e. robustness can help theorists identify the correct dynamic model. 39 From the ML parameter estimation method, the dynamic model in early Drosophila development is constructed. Then, we incorporate the AIC method into the identification process to prune the insignificant regulatory parameters and refine the model. This allows us to pick up the TFs, which are the most significant regulators for controlling the downstream genes in the early development of Drosophila.
The real biological systems are always robust. Therefore the model of a biological system should be robust and the robustness is a validation of dynamic models for biological systems.
39
To test the robustness of the 3-DEST model by different number of data points, we interpolate the time-course data from 38 data points (i.e. four times the number of parameters) to 57 data points (i.e. six times the number of parameters), i.e. there are 20 test cases. However, among the 20 test cases only six test cases, which are respectively those with 38, 39, 40, 41, 42 and 44 interpolated data points, meet the model's stability constraints
Robustness tests of parameters,
Results
After the parameters in Eq. (1) are estimated by ML and pruned by the AIC in Eqs. (6)–(8), the identified 3-DEST models for gene/protein interaction networks in the spatial regions of Drosophila embryo are given in the following.
where i, j = 1, 2, …, 14, l = 1, 2, 3, m = 1, 2, …, 21.
After system identification, the simulation results of the system model obtained using the ML estimation method and the AIC method are shown in Figure 3(b) (protein) and 3(d) (mRNA) compared with the original data in Figure 3(a) (protein) and 3(c) (mRNA), respectively. The 3-DEST gene/protein interaction networks in different spatial regions are constructed in Figure 4 through the diffusion coefficients

Normalized mRNA and protein expressions. Solid line and dashed line denote protein and mRNA expressions, respectively. The expressions of knirps (cyan line), krüppel (green line) and giant (black line) are plotted in time profiles.

Original data and estimated results by our proposed dynamic model. The original eve mRNA and protein spatial data at c14A8 are shown in A) and C), respectively. After system identification, the estimated eve mRNA and protein spatial data generated by the dynamic model are shown in B) and D), respectively.

3-DEST dynamic gene/protein interaction network for diffusion and transcriptional regulation mechanisms in different spatial regions in the whole embryo. The notations, R1.11, R1.12, R1.13, R1.21, …, R7.31, R7.32, R7.33, are the 63 spatial regions of the whole embryo which is specified by Figure 1 (a). In each spatial region Rstripe, ij, the colors of the outer ring in the color circle are specified by the 14 gene names, which are given by the color bar below the figure, respectively. Each color of the outer ring is specified by each gene. The solid lines that connect color circles stand for transcription regulation between genes in each spatial region based on regulatory abilities βij of the identified 3-DEST dynamic model in Eq. (9). Positive and negative regulations are denoted by arrows and bars at the end of solid lines, respectively. Additionally, the colors of the inner circle, i.e. the black and white circle, inside the color circle stand for the TFs’ roles, i.e. donor or acceptor of the transcriptional regulation network, respectively. The bold color lines that connect the same genes in neighboring spatial regions with different roles stand for protein diffusions from donor (black inner circle) to acceptor (white inner circle) in neighboring spatial regions based on the diffusion coefficients γj of the identified 3-DEST dynamic model in Eq. (9). The specification of the colors in bold color lines is consistent with the colors in the outer ring of the color circle, which are specified by the color bar. For example (see also Fig. 5 a), Caudal in R4,11 with green color in outer ring and black color in inner circle found regulates ftz (yellow) and runt (navy blue) and plays as a donor, which can diffuse to the neighboring regions. A clearer figure is available online at the website, http://www.ee.nthu.edu.tw/bschen/Drosophila_Fig4.pdf.
Previous research 48 on cis-regulatory module detection shows that the enhancer element of the second eve stripe contains the binding sequences of Krüppel, Giant, Bicoid and Hunchback, and the second eve stripe can be activated by Bicoid, Hunchback, and repressed by Giant and Krüppel (Table 1 and Fig. 4).
In the analysis of eve stripe formation, the boundaries of eve stripe can be affected by diffusion from the neighboring regions where Eve serves as a donor to the regions where it plays the role of an acceptor. Figure 4 shows that Hunchback in R2,22, R3,11, R3,33, R4,31, R6,11, R6,13, and R7,31 and Knirps in R1,12, R4,13, R5,23 and R7,13 positively and negatively regulate eve, respectively, and Eve in these regions simultaneously serves as a donor which diffuses through and affects the boundaries, i.e. stripe 1–2, stripe 3–4, stripe 4–5, stripe 5–6 and stripe 7-terminal (Fig. 4). Therefore, it shows that stripe boundaries are broken in the embryo with hunchback and knirps double mutant and the phenotype is similar to the embryo with a strong eve mutant. 10 In addition, eve in R2,11 and R2,23, which plays the role of donor and is repressed by Giant and Krüppel respectively, would locally affect the anterior and posterior of the second eve stripe, respectively (Fig. 4).7,45,54 Moreover, the effect of Giant and Krüppel respectively on the anterior and posterior of the second eve stripe should be diffusively reinforced by the same repressive transcription regulations in R2,22. In the boundaries of the third eve stripe, eve in R3,31 and R4,31 which is negatively regulated by Hunchback and Knirps respectively, would diffuse to and affect on anterior and posterior boundaries of the third eve stripe, respectively (Fig. 4).7,45,54 Moreover, we find that Giant and Hairy have no effect on the boundaries stripe 4–5 and stripe 5–6, respectively.
From the transcription regulations shown in Figure 4, we believe that most of them are new predicted except those discussed here because most of genetic studies in Drosophila are not easy to find direct transcription regulations without chromatin immunoprecipitation microarray (ChIP-chip) experiments. In this study, we provide a direction for other biologists at the wet experiments of transcription regulations especially in ChIP-chip experiments. For example, according to the robustness tests in Table 1, we show that eve in R2,22 is positively regulated by Ftz and Knirps and is negatively self-regulated. The robust regulations are the most probable suggestion in transcription regulations of eve stripe formation.
Moreover, in the large network, there exist a huge number of interaction patterns. Only a few types of interaction patterns called network motifs, which are embedded in the network and connected to each other, allow them to carry out their functions even in the presence of additional interactions. Mangan and Alon 55 have analyzed two feedforward network motifs, i.e. coherent feedforward loops (C-FFL) and incoherent feedforward loops (I-FFL), and found that C-FFL acted as sign-sensitive delays, and I-FFL acted as sign-sensitive accelerators. 55 Moreover, Han et al 56 propose that a signaling module composed of a C-FFL and an I-FFL causes an early transient response and a delayed prolonged response after a short stimulus. 56 The early transient responses and delayed prolonged responses plausibly depend on post-translation modification of existing proteins and new protein synthesis, respectively. The combinative signaling module is suggested and found in drug therapy. Therefore, we obtain C-FFL and I-FFL from the constructed network (Fig. 5) according to the following rules. One relationship of the transcription regulations in Figure 5 serves as an edge of FFL, when the regulation relationship exists in at least four neighboring regions among its nine neighboring regions. For example (Fig. 5a), a C-FFL C15 found in Figure 5 is composed of three transcription regulations (Caudal->Ftz in R3,13, Runt->Ftz in R3,13 and Caudal->Runt in R4,11) and two diffusions (Caudal and Runt are both diffused from R4,11 to R3,13). In addition, these three regulatory relationships exist respectively in at least four neighboring regions, i.e. Caudal->Ftz in R3,13, R3,12, R4,11 and R4,31, Runt->Ftz in R3,13, R3,12, R3,33 and R4,31 and Caudal->Runt in R4,11, R3,13, R4,12 and R4,31. Therefore, C15 is one of the FFLs (Fig. 5b) found in our network. By the same procedure, not only can we find 25 C-FFLs and 18 I-FFLs (Fig. 5b) but also 13 possible combinative signaling modules among 25 C-FFLs and 18 I-FFLs, i.e. Odd in R1,12, R1,11 (C3 and II in Fig. 5b), R1,13 and R1,23 (C1 and I1), Slp in R2,12, R2,22 (C7 and I5), R3,31 (I12 and C18) and R3,12 (C19 and I4), Eve in R3,12, R3,13 (C14 and I9) and R4,11 (C13 and I9), and Ftz in R3,13 (C21 and I15) and R6,13 (C25 and I17). Among these modules, we find that Hunchback acts as a source of FFLs to activate Ftz as an output expressed in eve stripes 3, 4, 6 and 7. From the embryo with hb– mutants, eve stripes 2, 3, 4 and 7 are partially or completely deleted. 10 Although Ftz in R6,12 and R6,13 is activated respectively by I-FFL and combinative signaling module with Hunchback as an input source, Ftz in R6,12 and R6,13 is respectively negatively regulated and does not regulate eve. Therefore, we suggest that C-FFL, I-FFL and combinative signaling module are respectively important in activating speedy responses in R4,11, R4,12 and R7,11, activating a delayed response in R7,13 with the ability of noise filtering and activating a delayed prolonged response in R3,13.

Coherence and incoherence feedforward loops of 3-DEST dynamic gene/protein interaction network. (A) According to the rule that each of the regulation relationships of FFLs must exist in at least four neighboring spatial regions, parts of gene/protein interaction network (left) in R32, R33, R4.1 and R4.2 are examples of feedforward loops, and can be redrawn as C15 (right). (B) From the above rule, we find the network motifs, i.e. 25 C-FFLs (C1~C25) and 18 I-FFLs (I1-I18), for the cooperation of transcription regulations with diffusions in early embryogenesis. The color bars denote diffusions, which are the same as those in Figure 4 A clearer figure is available online at the website, http://www.ee.nthu.edu.tw/bschen/Drosophila_Fig5.pdf.
Discussion and Conclusion
In this study, we are the first to combine mRNA dynamic equation with protein dynamic equation using spatio-temporal model to construct the gene/protein interaction network to investigate the gene/protein regulatory mechanisms of eve stripe formation in the early development of Drosophila. However, there are still three mechanisms of concern in Drosophila embryogenesis, i.e. protein-protein interactions, translation regulations and epigenetic regulations. In a recent study, protein-protein direct interactions are not found between the 14 early development-related TFs of Drosophila embryo,
57
although there may exist some interactions which require a co-factor(s). For example, Bicoid has self-inhibitory property which requires a co-factor(s), and the binding site at the N-terminal region of Bicoid is evolutionarily conserved.
58
However, the understanding of protein–protein interactions via a co-factor(s) is limited. Moreover, cooperative bindings through sigmoid function have been implicitly concerned in previous models.
38
However, since the prior information of cooperative bindings in early embryogenesis is also limited, cooperative binding is not considered in our model. If the information of cooperative binding is most available, cooperative bindings can be considered easily as regulation candidates in the 3-DEST model, i.e. the cooperation regulation
In early embryogenesis, diffusion mechanism is needed not only for maternal genes but also for gap genes and pair-rule genes to regulate their target genes in the neighboring spatial regions, which can determine the roles of TFs in each region, i.e. donor/acceptor. Without the dynamic space-time model, the dynamics of TFs’ diffusions may not be easily observed from a system point of view, especially in 2-D space. The contributions of this study include the following. (1) Construction of a stochastic 3-DEST dynamic model for gene/protein interaction network which not only contains the concentration-dependent transcriptional abilities but also includes six stochastic processes to mimic the spatio-temporal dynamic interplay among the target genes and their regulatory TFs at the early embryonic stage (i.e. the following six processes (i) protein synthesis, (ii) protein decay, (iii) mRNA decay, (iv) protein diffusion, (v) transcription regulations, and (vi) autoregulation are involved in our dynamic model). (2) Utilization of the AIC to refine the stochastic 3 -DEST dynamic model for gene/protein interaction network via pruning the insignificant transcription regulations in each spatial region. (3) Findings of transcription regulations in the seven eve stripes in the stochastic 3-DEST gene/protein interaction network. (4) Validating of the identified gene/protein interaction network by literature reference with the wet experiments of gene mutations. (5) Inference of transcription regulations and diffusion mechanisms for playing a cooperative role in the creation of FFLs to build eve stripes by speedy responses, delayed responses with the ability of noise filtering and delayed and prolonged responses. For the possible experimental validation of the feedforward loops (FFLs) in 3-DEST dynamic gene/protein interaction network, biologists can follow the similar experimental design in
69
and.9,10,43,45,46 For example, if the two FFLs,
and
are considered, biologists can examine gene
However, one of the weaknesses in system identification is the increase in computation burden due to the use of the AIC method. Because one of our main purposes is to extract the significant transcription/translation regulations via pruning the insignificant transcription/translation regulations by using the AIC method, we use an explicit scheme with some stability constraints on the parameters to construct and then refine the gene/protein interaction network. Additionally, computation complexity will be increased, when the spatial regions are precisely specified. Moreover, a plenty of spatio-temporal data are needed in parameter estimation of the 3-DEST model. Although we know that eve stripes of the Drosophila embryo is probably not just built by the 14 early development-related genes, it is not a problem to estimate a more complicated dynamic regulatory network by the proposed method if much more mRNA and protein data are available in the future.
Disclosures
The authors report no conflicts of interest.
Footnotes
Acknowledgements
We thank Professor Jui-Chou Hsu, Department of Life Science, National Tsing Hua University, for providing helpful comments. The work is supported by the National Science Council of Taiwan under contract No. 98-2221-E-007-113-MY3.
Appendix I: Reconstruction of Unavailable Data by Neural-Network Learning
Because there are some deficiencies in the mRNA data, recovery of these missing data is needed when the data are used for system identification. For the deficiencies of mRNA data, a back-propagation NN training method is employed to reconstruct them. Three classes of genes, i.e. maternal genes, gap genes, and pair-rule genes, into which the 14 early development-related genes are divided, are utilized for the reconstruction of these unmeasured data in their classes of genes, individually. For each data reconstruction process, we individually train and reconstruct these data in each class of genes along the A-P axis, since the protein and mRNA data of each class of genes along the A-P axis are roughly similar (Fig. 2).
3
Note that the downstream class of genes with missing mRNA data is reconstructed by the upstream class of genes via the back-propagation NN training method. The training methods, i.e. Broyden, Fletcher, Goldfard and Shanno (BFGS), Levenberg-Marquardt, Powell-Beale Restarte, Polak-Ribiere, Fletcher-Reeves, and Rprop, have been employed to test the performance of data reconstruction. In this study, NN combined with the BFGS method is used for training and simulating the unmeasured mRNA data,
70
because the NN plus BFGS method has the best performance in our tests (data not shown). In order to obtain an optimal NN training results, we maximize the output correlations and minimize the training errors in the training processes. A few of the unmeasured protein data points are also reconstructed by the same learning and simulating processes. For example, if the mRNA data of gene
Appendix II: Stability of Discrete 3-DEST Model
Appendix III: Procedure of Stability Constrained Estimation
The stability constrained estimation of parameters can be performed by a Matlab function, Isqlin.
A procedure for the parameter estimation is given as follows:
