- Short Report
- Open Access
Inferring Drosophila gap gene regulatory network: pattern analysis of simulated gene expression profiles and stability analysis
BMC Research Notes volume 2, Article number: 256 (2009)
Inference of gene regulatory networks (GRNs) requires accurate data, a method to simulate the expression patterns and an efficient optimization algorithm to estimate the unknown parameters. Using this approach it is possible to obtain alternative circuits without making any a priori assumptions about the interactions, which all simulate the observed patterns. It is important to analyze the properties of the circuits.
We have analyzed the simulated gene expression patterns of previously obtained circuits that describe gap gene dynamics during early Drosophila melanogaster embryogenesis. Using hierarchical clustering we show that amplitude variation and defects observed in the simulated gene expression patterns are linked to similar circuits, which can be grouped. Furthermore, analysis of the long-term dynamics revealed four main dynamical attractors comprising stable patterns and oscillatory patterns. In addition, we also performed a correlation analysis on the parameters showing an intricate correlation pattern.
The analysis demonstrates that the obtained gap gene circuits are not unique showing variable long-term dynamics and highly correlating scattered parameters. Furthermore, although the model can simulate the pattern up to gastrulation and confirms several of the known regulatory interactions, it does not reproduce the transient expression of all gap genes as observed experimentally. We suggest that the shortcomings of the model may be caused by overfitting, incomplete model description and/or missing data.
A biological system that has been extensively studied is the segmentation mechanism of early development in Drosophila melanogaster (see  for review). At early stage, a cascade of maternal and zygotic genes is activated in the syncytial embryo that subdivides the ectoderm into smaller domains. First, maternal morphogenes such as bicoid (bcd), caudal (cad) and hunchback (hb) activate zygotic gap genes such as hb, giant (gt), Krüppel (Kr), knirps (kni), or tailles (tll), which in turn will activate the pair rule genes. The pair rule genes will regulate segment polarity genes and Hox genes, which both control the differentiation of each segment of the future embryo .
The gap gene circuit has been extensively investigated using mathematical models [2, 3]. In all cases, the goal was to derive the regulatory interactions that control gene expression. The gene circuit approach  combined with a parameter optimization method allowed to infer gene regulatory interactions directly from experimental spatio-temporal gene expression data [5, 6]. In all cases the optimization involved minimization of the difference between observed data and simulated data. Previous studies [4, 7, 8] have analyzed the obtained gene circuits essentially by visual inspection of the simulated patterns, mainly because of an insufficient number of circuits. Fomekong et al.  proposed a faster optimization method that yielded a higher number of circuits, allowing for a more detailed analysis. Finding a set of parameters that reproduces the observed data does not necessary imply that the network structure has been identified correctly, or that the underlying pattern formation mechanism of the system has been revealed [9, 10]. For some systems, the network structure itself inherently leads to robust pattern formation and is weakly depended on the specific parameter values [11, 12]. Inference may lead to a unique network, however for many cases many circuits with different topologies and scattered parameter values are found. It is necessary to further analyze these circuits and discriminate between realistic and non-realistic circuits based on other criteria [13, 14].
We have analyzed the simulated patterns and parameters of the circuits that were obtained previously using descriptive statistics and stability analysis [8, 15]. The incompleteness of the available experimental data, the complexity and the non-linearity of the model and the large number of unknown parameters potentially leading to over-fitting makes the reverse engineering problem challenging. It might lead to circuits with different regulatory interactions or variability in the simulated patterns and dynamical behavior.
Although all circuits show relatively good fits with respect to the data (see Figure 1), small features, like bumps, dips and other variations in the expression profiles at gastrulation time are observed (see Additional file 1). These features are not observed in the data, and may represent circuits that are not biologically realistic. We performed a hierarchical cluster analysis on the profiles to identify groups that share deviant features. By statistical comparison of the parameters among the different groups using a T-test we find parameters that may explain the observed features.
Figure 2 shows the clustered simulated profiles where we observe four main pattern groups described as follows: group 1: no defection. group 2: hb showing a dip in the anterior domain. group 3: tll showing a shoulder. group 4: Kr showing an extra bump.
We noticed that some of the clusters share the same circuits as shown in Figure 3. We observe that the group with the hb-anterior dip largely overlaps with the tll bump cluster and also with one of the gt clusters. This means that the features in hb, tll and gt share a common circuit topology (see Venn-diagram Figure S1 in Additional file 1).
The differences between the profiles may be explained by variability in circuit topology or by differences in parameter magnitude. By comparing the parameters of the different groups using a T-test, we find that parameters of group 2 and group 3 do not show any significant differences, and therefore are combined into one group 2-3. Comparison of group 1 and group 2-3 yields three parameters: (Table 1). In Group 2-3 Gt represses hb and causes the anterior hb dip. Also, tll activation by Gt is considerably decreased, leading to higher production of tll, which causes the tll-bump. Comparison of group 1 and 4 shows that Kr autoactivation is strong and repression by Gt is weak within group 4 (Table 2). This combination causes a local increased production of Kr on the domain where Gt is expressed. A consequence of the strong autoactivation would be a higher level of Kr all along the A-P axis. This is prevented by increased repression through Hb and Kni. Also, the weaker production rate of Kr compensates for the strong autoactivation.
Pattern stability at later times
During gastrulation most of the gap domains, maternal bcd and cad disappear within 30 min. The anterior hb domain disappears rapidly during gastrulation , while posterior hb domain can still be detected for a few more hours until the end of germ band extension . Central Kr domain decays rapidly after the onset of gastrulation [18, 19]. Posterior gt domain disappears rapidly during gastrulation while the anterior domains persist for a few hours but change quite drastically and become involved in organ formation [20–22]. The entire kni domain and the posterior domain of tll disappear rapidly after gastrulation [23, 24].
The long-term dynamics of the circuits should show if the model is able to predict the disappearance of the gap gene domains, and provide information about the asymptotic stability of the model and potentially gives its attractors. The parameters were obtained by fitting the model to real data until gastrulation time. To study long term dynamics we simulated all circuits for an extended period (see Methods, Additional file 1 and Additional files 2, 3, 4, 5 where long term dynamic movies are shown). We classified the behavior into the following groups:
1. stable patterns: 64 circuits, where tll and cad domains disappear completely in most cases. This group is composed of three sub-groups
(a) 9 circuits show a rudimentary gap gene pattern with all gene domains more or less well defined. (Figures 4-A, B).
(b) 27 circuits develop an uniform hb domain that covers the whole embryo. (Figures 4-C, D).
(c) 28 circuits show variable stable patterns with expanding or disappearing domains.
2. Oscillatory patterns: (37 circuits) with the two sub-groups:
(a) 18 circuits with cad, hb, gt and tll showing posterior oscillations, while Kr and kni domains disappear. (Figures 4-E, F).
(b) 19 circuits where all genes oscillate but tll, which disappears (Figures 4-G, H).
We have compared parameters using T-tests and the average circuit topology (see Table 3, 4 and 5). From Table 3, we see that the main difference between the two stable patterns is the strong hb autoactivation in the group with expanded hb. In some of the oscillatory circuits (Figure 4-H), we observed a basic motif composed of autoactivation and negative feedback loops. It can be shown theoretically that the minimal requirement for oscillations to occur in a two-gene network is that an activator activates its repressor and also itself. Nevertheless, the positive and negative feedback loops may be indirect and also the actual parameter values may prevent the formation of oscillations even if the minimal requirement for oscillations is present. In the first oscillatory group we observe the basic motif for oscillation between Hb and Gt (Figure 5-H). In the data we observe that the anterior hb peak slightly collapses, however it collapses more at the position of the anterior gt peak. In a number of circuits the fit to Hb is improved by repression of hb by Gt (group 2 cluster analysis). Almost all members of this group show oscillations. Hb in this group has an intermediate autoactivation (the group with strong autoactivation does not show oscillations) and Cad activates hb leading to constitutive activation of posterior hb. Next to the hb-gt oscillatory motif we see similar motifs with Tll. When negative feedback interactions are removed we observe that the corresponding gene does not oscillate any longer. Although the connections in these motifs are weak the behavior at later times is strongly affected.
From the two previous analyses and T-tests, we see that parameters differ from circuit to circuit leading to different behaviour. This might be a sign of overfitting, which can be determined by looking at the correlation matrix of all the parameters (Figure 5). Because of compensation mechanisms parameters may not be identifiable. Examples of these are promoter and decay rates, which both scale the expression profile.
Furthermore, the input weights on a single gene can also compensate each other. If a positive input on a gene becomes stronger, increasing negative weights or decreasing positive weights can adjust for the increased total input, such that the total input on that gene is not altered much. However, these correlation patterns may be more intricate [see Additional file 1 for an extended correlation analysis].
It seems difficult to determine which of the circuits have the "correct" topology. From the clustering of the gastrulation profiles, we could have considered that only circuits without defects should be taken into account, but we see that it is not that trivial since the difference from circuit to circuit is not only based on the regulatory interaction type, but also their strength. None of the circuits predicted the disappearance of the gap genes during gastrulation, this may be related to missing mechanisms like degradation of maternal genes. The long-term dynamics of the circuits show that the patterns converge to four main attractors. This difference in convergence may be explained by differences in a few, but also the presence of certain motifs. More intriguing, one of the attractors resembles the gastrulation pattern and circuits falling into this group have interactions more consistent with experimental evidence. Combined with the well defined parameters obtained from the correlation analysis , the following gap gene interactions consistent with literature were derived:
1. All the gap gene are activated by Cad.
2. All the gap gene but kni are activated by Bcd.
3. Hb, Kr gene have an auto-activation.
4. kni does not have a auto-repression, but certitude on auto-activation can not be deduced (strong correlation coefficient with most parameters)
5. Mutual repression between Hb and Kni
6. Mutual repression between Gt and Kr
These interactions are consistent with the regulatory mechanism proposed in  as well as those obtained in early literature [19-21, 25-29] and previous analysis . Out of all the circuits, only 4 have a very good patterns and the mentioned regulatory interactions [circuits 20, 31, 82 and 101, see Additional file 6]. The alternative interactions proposed by the other circuits are a consequence of overfitting, incomplete data and incomplete model structure.
For example biological evidence suggests that the anterior hb dip is caused by different early and late regulation mechanisms, which is not included in the current model, consequently the optimization predicts for many circuits suppression of hb by Gt to mimic this data feature. Furthermore the model tries to reproduce the experimentally observed decrease of cad by introducing negative feedback through the gap genes.
Jaeger et al.  suggested that the anterior shift of posterior domains after cycle 14A is caused by asymmetric repression of the gap genes. All the current circuits reproduce the shift, but from the current analysis, it seems that the shift is not necessary a consequence of the asymmetric repression triggered by Hb. In many circuits we see that the shift of these domains continues to progress and leads to domain expansion or disappearance of other domains. The shift seems to correlate strongly with the posterior hb domain. The posterior hb domain develops later than the other domains, and represses gt, kni and Kr. In the circuits where the posterior hb domain continues to expand and in the end forms an almost uniform domain at steady state that covers the whole embryo; the anterior gt domain remains and Kr, kni, tll and cad all disappear. This phenomenon is caused by strong hb autoactivation, the other gap genes are not able to balance hb expression. The anterior gt domain remains because of maternal activation by Bcd and weak repression by Hb.
Schroder et al.  suggested that autoactivation is involved in maintenance of gap gene expression and sharpening of gap domain boundaries . Although this might be true, strong autoactivation also affects pattern stability later on during gastrulation, making it more difficult for domains to fade. The inability of the circuits to predict transient expression suggests that either an additional mechanism is missing in the model or that the optimization failed to capture the dynamics.
Levine M: A systems view of Drosophila segmentation. Genome Biol. 2008, 9 (2): 207-10.1186/gb-2008-9-2-207.
Sánchez L, Thieffry D: A logical analysis of the gap gene system. Theor Biol. 2001, 211: 114-141. 10.1006/jtbi.2001.2335.
Reinitz J, Sharp DH: Mechanism of eve stripe formation. Mech Dev. 1995, 49 (1-2): 133-158. 10.1016/0925-4773(94)00310-J.
Jaeger J, Surkova S, Blagov M, Janssens H, Kosman D, Kozlov KN, Myasnikova E, Vanario-Alonso CE, Samsonova M, Sharp DH, Reinitz J: Dynamic control of positional information in the early Drosophila embryo. Nature. 2004, 430 (6997): 368-371. 10.1038/nature02678.
Myasnikova E, Samsonova A, Kozlov K, Samsonova M, Reinitz J: Registration of the expression patterns of Drosophila segmentation genes by two independent methods. Bioinformatics. 2001, 17: 3-12. 10.1093/bioinformatics/17.1.3.
Poustelnikova E, Pisarev A, Blagov M, Samsonova M, Reinitz J: A database for management of gene expression data in situ. Bioinformatics. 2004, 20 (14): 2212-2221. 10.1093/bioinformatics/bth222.
Perkins TJ, Jaeger J, Reinitz J, Glass L: Reverse engineering the gap gene network of Drosophila melanogaster. PLoS Comput Biol. 2006, 2 (5): e51-10.1371/journal.pcbi.0020051.
Fomekong-Nanfack Y, Kaandorp JA, Blom J: Efficient parameter estimation for spatio-temporal models of pattern formation: case study of Drosophila melanogaster. Bioinformatics. 2007, 23 (24): 3356-3363. 10.1093/bioinformatics/btm433.
Gutenkunst RN, Waterfall JJ, Casey FP, Brown KS, Myers CR, Sethna JP: Universally sloppy parameter sensitivities in systems biology models. PLoS Comput Biol. 2007, 3 (10): 1871-1878. 10.1371/journal.pcbi.0030189.
Ashyraliyev M, Jaeger J, Blom JG: Parameter estimation and determinability analysis applied to Drosophila gap gene circuits. BMC Systems Biology. 2008, 2 (83):
George von Dassow EMM Eli Meir, Odell GM: The segment polarity network is a robust developmental module. Nature. 2000, 406: 188-192. 10.1038/35018085.
Jaqaman K, Danuser G: Linking data to models: data regression. Nat Rev Mol Cell Biol. 2006, 7 (11): 813-819. 10.1038/nrm2030.
Dougherty ER: Validation of Inference Procedures for Gene Regulatory Networks. Current Genomics. 2007, 8: 351-359. 10.2174/138920207783406505.
Fomekong-Nanfack Y, Postma M, Kaandorp JA: Inferring Drosophila gap gene regulatory network: a parameter sensitivity and perturbation analysis. BMC Systems Biology. 2009, 3: 94-10.1186/1752-0509-3-94.
Jaeger J, Blagov M, Kosman D, Kozlov KN, Myasnikova E, Surkova S, Vanario-Alonso CE, Samsonova M, Sharp DH, Reinitz J: Dynamical analysis of regulatory interactions in the gap gene system of Drosophila melanogaster. Genetics. 2004, 167 (4): 1721-1737. 10.1534/genetics.104.027334.
Tautz D, Lehmann R, Schnurch H, Schuh R, Seifert E, Kienlin A, Jones K, Jaeckle H: Finger protein of novel structure encoded by hunchback, a second member of the gap class of Drosophila segmentation genes. Nature. 1987, 327: 383-389. 10.1038/327383a0.
Tautz D: Regulation of the Drosophila segmentation gene hunchback by two maternal morphogenetic centres. Nature. 1988, 332 (6161): 281-284. 10.1038/332281a0.
Knipple DC, Seifert E, Rosenberg UB, Preiss A, Jaeckle H: Spatial and temporal patterns of Kruppel gene expression in early Drosophila embryos. Nature. 1985, 317: 40-44. 10.1038/317040a0.
Gaul U, Jackle H: Pole region-dependent repression of the Drosophila gap gene Kruppel by maternal gene products. Cell. 1987, 51 (4): 549-555. 10.1016/0092-8674(87)90124-3.
Mohler J, Eldon ED, Pirrotta V: A novel spatial transcription pattern associated with the segmentation gene, giant, of Drosophila. The EMBO journal. 1989, 8 (5): 1539-1548.
Eldon E, Pirrotta V: Interactions of the Drosophila gap gene giant with maternal and zygotic pattern-forming genes. Development. 1991, 111 (2): 367-378.
Kraut R, Levine M: Mutually repressive interactions between the gap genes giant and Kruppel define middle body regions of the Drosophila embryo. Development. 1991, 111 (2): 611-621.
Rothe M, Nauber U, Jäckle H: Three hormone receptor-like Drosophila genes encode an identical DNA-binding finger. EMBO J. 1989, 8 (10): 3087-94.
Pignoni F, Baldarelli RM, Steingrimsson E, Diaz RJ, Patapoutian A, Merriam JR, Lengyel JA: The Drosophila gene tailless is expressed at the embryonic termini and is a member of the steroid receptor superfamily. Cell. 1990, 62: 151-163. 10.1016/0092-8674(90)90249-E.
Jackle H, Tautz D, Schuh R, Seifert E, Lehmann R: Cross-regulatory interactions among the gap genes of Drosophila. Nature. 1986, 324 (6098): 668-670. 10.1038/324668a0.
Harding K, Levine M: Gap genes define the limits of antennapedia and bithorax gene expression during early development in Drosophila. The EMBO journal. 1988, 7: 205-214.
Struhl G, Johnston P, Lawrence PA: Control of Drosophila body pattern by the hunchback morphogen gradient. Cell. 1992, 69 (2): 237-249. 10.1016/0092-8674(92)90405-2.
Reinitz J, Levine M: Control of the initiation of homeotic gene expression by the gap genes giant and tailless in Drosophila. Dev Biol. 1990, 140: 57-72. 10.1016/0012-1606(90)90053-L.
Capovilla M, Eldon ED, Pirrotta V: The giant gene of Drosophila encodes a b-ZIP DNA-binding protein that regulates the expression of other segmentation gap genes. Development. 1992, 114: 99-112.
Schröder C, Tautz D, Seifert E, Jäckle H: Differential regulation of the two transcripts from the Drosophila gap segmentation gene hunchback. EMBO J. 1988, 7 (9): 2881-2887.
This work was supported by the Netherlands Organization for Scientific Research, project NWO-CLS 635.100.010 ("3d-RegNet: simulation of developmental regulatory networks", http://www.science.uva.nl/research/scs/3D-RegNet) and by the EC (MORPHEX, NEST Contract No 043322). We used data from the FlyEx database http://flyex.ams.sunysb.edu/flyex/. We thank Dr. Johannes Jaeger for his wise suggestions.
The authors declare that they have no competing interests.
All authors participated in the design of the study, and wrote the manuscript together. YF-N and MP conducted the analysis of data and simulations, and made the figures. All authors read and approved the final manuscript.
Yves Fomekong-Nanfack, Marten Postma contributed equally to this work.