- Technical Note
- Open Access
Spatial gene expression quantification: a tool for analysis of in situ hybridizations in sea anemone Nematostella vectensis
BMC Research Notes volume 5, Article number: 555 (2012)
Spatial gene expression quantification is required for modeling gene regulation in developing organisms. The fruit fly Drosophila melanogaster is the model system most widely applied for spatial gene expression analysis due to its unique embryonic properties: the shape does not change significantly during its early cleavage cycles and most genes are differentially expressed along a straight axis. This system of development is quite exceptional in the animal kingdom.
In the sea anemone Nematostella vectensis the embryo changes its shape during early development; there are cell divisions and cell movement, like in most other metazoans. Nematostella is an attractive case study for spatial gene expression since its transparent body wall makes it accessible to various imaging techniques.
Our new quantification method produces standardized gene expression profiles from raw or annotated Nematostella in situ hybridizations by measuring the expression intensity along its cell layer. The procedure is based on digital morphologies derived from high-resolution fluorescence pictures. Additionally, complete descriptions of nonsymmetric expression patterns have been constructed by transforming the gene expression images into a three-dimensional representation.
We created a standard format for gene expression data, which enables quantitative analysis of in situ hybridizations from embryos with various shapes in different developmental stages. The obtained expression profiles are suitable as input for optimization of gene regulatory network models, and for correlation analysis of genes from dissimilar Nematostella morphologies. This approach is potentially applicable to many other metazoan model organisms and may also be suitable for processing data from three-dimensional imaging techniques.
Spatial gene expression assays are a substantial tool for verifying predicted regulatory interactions and for predicting properties of missing components in a regulation network [1, 2]. Still, their largest potential is in inferring numerical models of regulatory interaction networks, which is demonstrated for the embryonic development of fruit fly Drosophila melanogaster. To perform accurate simulations, the spatial gene expression patterns are digitally quantified and formatted to consistent profiles.
In systems biology, computational tools have become indispensable for deriving and validating gene regulation networks . In data-driven modeling, parameter estimation can determine which set of rules represents the best network model to match a set of observations. Figure 1 shows an overview of the modeling cycle. First, a general mathematical framework is selected. The gene circuit model [4, 5] (which is derived from the connectionist model ) is a convenient formalism that does not require knowledge about interactions or their mechanisms. The actual modeling starts with choosing parameter values for the general equations; the initial parameters are defined by the user or generated by an algorithm. The resulting equations are applied to the concentration profiles at the first timepoint, derived from experimental data. This simulation produces concentration profiles at various timepoints. The simulated profiles are compared with observed expression profiles and the similarity is calculated with a predefined fitness measure. An optimization algorithm then picks new parameter values and starts a new simulation cycle. The optimization process ends when a stopping criterion is reached, such as a fixed time duration or number of cycles, a satisfactory fitness or a series of cycles without significant fitness improvement. The output is the set of parameters that define the equations which represent the observed gene expression profiles best.
In any modeling approach, meaningful results require accurate reference data to initiate the simulation and to evaluate the simulation output. For gene regulation networks, this means that gene expression patterns should be quantified in a consistent manner. A digital quantification procedure has been created and validated for the fruit fly .
Gene expression in Drosophila is quantified along a straight line during superficial cleavage. In this stage, nuclei are dividing within a single cell membrane and the embryo’s outline does not change significantly. In most other animals however, nuclear division is coupled to cell division during cleavage and the early embryo displays rapid cell movement and morphological changes. This is why we developed a method for gene expression quantification that accounts for a complex and changing embryo morphology.
Over the past decade, Nematostella has become an important model organism in the field of evolutionary developmental biology . As a research object, the animal is easy to culture and its small size and transparent body wall make it suitable for all kinds of microscopy. Subsequent gene expression studies and the sequencing of the genome have shown that Nematostella, curiously, shares more genes with humans than either Drosophila melanogaster or C. elegans. Much work on Nematostella has been dedicated to the genetic regulation of development .
The early developmental stages of Nematostella vectensis are displayed in Figure 2 (adapted from ). The Nematostella body wall consists of an outer cell layer (ectoderm) and an inner layer (endoderm). We are primarily concerned with the invagination process called gastrulation, when the presumptive endoderm moves inwards and covers the ectoderm. The side of invagination is the location of the future mouth and is therefore referred to as the oral pole; the opposite side is the aboral pole. Nematostella displays bilateral symmetry: the line between the oral and aboral ends is the primary axis, while the line through the primary septa defines the secondary axis (the green and blue lines, respectively, drawn on Figure 2L).
In this paper we discuss a geometric method for extracting quantitative spatio-temporal gene expression data from in situ hybridizations in the sea anemone Nematostella vectensis. We measure gene expression during gastrulation using a gene expression quantification tool developed by de Jong . We show in some preliminary results how this information can be analysed in a cluster analysis.
Gene expression quantification
Microscopy images accentuate cell layer outlines of Nematostella embryos stained for nuclei and filamentous actin (Figure 3A-D, copied from ). The latter is bound to the membrane as part of the cell cortex, so it highlights cellular shapes. From these images, average embryo morphologies have been derived for various stages of gastrulation (Figure 3E-H). An embryo geometry is derived from a confocal microscopy image (such as Figure 3A) by placing nodes on the cell layer boundaries. Node locations from multiple (2 to 5) geometries are averaged to obtain an average embryo geometry (such as Figure 3E). This averaging reduces the effect of local irregularities, as the average geometry is supposed representative for embryos of a particular age. Strategic points are picked to define a shape suitable for interpolation with the geometry in the subsequent stage. Interpolation of these average morphologies results in a continuous range of embryo morphologies.
In the following example, these graphical shapes are applied to quantify gene expression patterns with the GENEXP program (Additional file 1). Published Nematostella gene expression images are collected in the CnidBase  and Kahi Kai  databases, and Kahi Kai also contains expression images outside journal publications. Still, many Nematostella expression pictures are found in publications outside these databases.
Example 1: a 1D expression profile from a symmetrical pattern
Figure 4A (adapted from ) displays an Nvnos2 in situ hybridization in a late gastrula. The transcripts were hybridized with digoxigenin-labeled RNA probes. The published image is overlaid with the most similar morphology. The graphical geometry is then adjusted to the outline of this particular embryo by dragging the nodes to their final location (Figure 4B). The nodes are connected by a curve (about 105 points) calculated with cubic spline interpolation . The geometry is automatically decomposed into parallel segments along the cell layer and color intensities are measured for all segments. The decomposition is performed by dividing the outer curve into sections with a user-defined length and calculating the nearest point on the inner spline for each boundary. This is repeated for the inner curve at sections with large gaps. The decomposition is finished by smoothening the boundary points on both curves to obtain segments that are more uniformly spaced. For the expression profile the red, green and blue color intensities of all pixels enclosed within a segment are averaged. These average intensities are plotted against the position on the line through the segments’ centers. To make an increasing intensity indicate an increase in concentration, the colors are inverted in Figure 4C and the corresponding intensities along the cell layer are displayed in Figure 4D.
The graph in Figure 4D contains features that do not represent the actual transcript concentration. The “en” and “ec” annotations cause a trough (orange arrow) and a ridge (green arrow), respectively, while an imperfect decomposition has shifted the peaks with regard to the center and introduced some noise (blue arrows). Moreover, the nonuniform background and nonsymmetric lighting cause an asymmetric baseline. The profile is exported to an editor for additional processing to correct for these features. To remove artefacts caused by annotations, the user selects this section of the graph and can choose among linear interpolation, cubic spline interpolation, piecewise cubic Hermite interpolation  and replacement with a specified constant. Noise from erroneous decomposition is smoothened with a lowpass filter (with filter coefficients equal to the reciprocal of the span), known as ‘moving average’ . The graph is lowered with a constant value to subtract the average background and regions without observed expression are put to zero. Both halves are averaged to cancel nonsymmetric influences. The final expression profile is plotted in Figure 4E.
Three-dimensional expression pattern reconstruction
For expression patterns that are radially symmetrical around the primary axis, a one-dimensional profile is a complete description. However, most signaling pathways involve genes that are asymmetrically expressed along the secondary axis. A three-dimensional representation is required to fully define the expression pattern of these genes. Depending on the data available for a gene expression pattern, a suitable method is determined for the approximation of this 3D pattern.
If a set of perpendicular expression images shows that an expression pattern is not radially symmetrical, then a 3D representation is constructed by mixing both images into a three-dimensional array (Figure 5C) with the TWOVIEWS program (Additional file 1). With points S1 on a lateral view in the x,z plane and points S2 on an oral view in the y,z plane, the points P in the 3D array are defined by the algorithm in Figure 6A. Otherwise a radial expression pattern symmetry is assumed, and a 3D representation is created by smoothly averaging pairs of points on the image along circular arcs about an approximate axis of symmetry (Figure 7B) with the TWOVIEWS program (Additional file 1). The pseudocode for this operation is displayed in Figure 6B.
Example 2: a 3D representation from perpendicular embryo views
Figure 5A,B (adapted from ) shows the expression pattern of gene NvFoxB, which is concentrated on two spots on opposite sides of the oral end. Two sets of reference points are picked to scale and align both pictures. The height of the oral image is adjusted to match these points in the lateral image.
Elements for the 3D array that represents the volumetric expression pattern are calculated as the minimum expression from the associated pair of pixels (Figure 5C). A greyscale visualization of this array is displayed in Figure 5D. Two domains appear in the oral region as expected.
The 3D array is an intermediate step in the quantification process that is completed for the next example.
Example 3: a 2D expression landscape from a single embryo view
Figure 7A (based on ) shows the expression of Nvvas2 in the early gastrula stage. The expression domain covers the embryo’s lower half and its future mouth. A line is drawn on the image that divides the expression domain. Each element in the 3D array is the weighted average of the image pixels at both ends of a circular arc around this line (Figure 7B,C).
This discrete volumetric expression array is not yet suited to be compared to other patterns, because their shapes do not match or their cell layers are located at different Cartesian positions. To arrive at consistent profiles, slices are cut through the primary axis and decomposed. The primary axis is drawn on the original image and slices of the 3D array through this axis are constructed (Figure 7D,E). These slices are overlaid with a geometry that fits the native image (Figure 7F) and through the decomposition procedure described in the one-dimensional example, a set of profiles is produced. These profiles are stacked in a 2D array and displayed as a landscape in Figure 7G.
Visualization and clustering
The profiles are easily interpolated and stored in arrays of equal length. Such a collection of standardized arrays can serve as input for conventional comparison and analysis software. For example, Figure 8 shows 41 1D expression profiles from 20 genes, ordered with hierarchical clustering. In this fashion, a database is conveniently displayed and correlating patterns can be identified at a glimpse.
From the cluster tree, the patterns are divided in three main groups, and five individuals with little similarity. In green, all four asymmetric expression patterns are included. The purple patterns are restricted to the endoderm. The yellow group is the largest, containing genes that are expressed in the presumptive pharynx and mouth. The remaining genes are expressed in ectoderm away from the mouth.
The clustering displayed in Figure 8 might be somewhat artificial as the expression domains clearly overlap. Moreover, some regions are elongating faster during gastrulation than others and even after pinning the estimated endoderm-ectoderm boundary, stationary patterns such as NvFoxA seem to migrate. Still, the comparison is very helpful in observing correlations and proposing hypotheses. For example, the three main groups may indicate regulatory modules. More specific, the patterns with broad boundaries belong to embryos in a relatively early stage, indicating a regulatory cascade in which fuzzy domain boundaries are sharpened, comparable to the Drosophila gap genes. An extended and systematic set of profiles would enable an inference of the developmental gene regulation network in Nematostella, based on the modeling techniques and analyses that established many properties of the Drosophila gap gene network [20–22].
Our quantification procedure provides a standardized format for the most diverse spatial visualization techniques. In this paper, hybridized mRNA has been quantified, but the method can be applied to any specific molecular entity. Potential examples include native proteins visualized with antibody staining and overexpressed proteins fused to a fluorescent agent [11, 23].
Current limitations arise from the strong assumptions imposed on the images that are used to construct 3D representations. For a rotated pattern, it is formally assumed that only expression in the plane of dissection is visible, while in fact observed expression is not restricted to this plane.
More serious is the requirement that the embryos used for the reverse projection are completely transparent, while the endoderm and aboral ectoderm are hidden on most oral images. Sometimes, only the cumulative signal on the periphery of an expression region is detected, as in Figure 5A (the speckled region at the arrow and on the opposite side).
The embryo is also assumed to be viewed from exactly perpendicular angles, but the sample is often rotated imperfectly. Additionally, slight deformations can occur during rotation, causing small domains with granular expression to overlap improperly and thus to be misrepresented. These issues are observed in Figure 5 as well.
With the advance of direct three-dimensional imaging the volumetric array construction may become superfluous, and these limitations will be removed. Confocal laser scanning microscopy has already been applied to zebrafish , Drosophila and sea urchin  embryos. This method may provide quantitative, spatial expression data for Nematostella as well. Conversely, general methods for mapping these data to the embryo’s morphology should be useful for comparison and analysis in these other organisms.
An integrated method has been presented that combines geometry extraction and gene expression quantification. The basic concept is that gene expression is conveniently measured along the cell layer in a morphology that can be viewed as a continuous sheet of cells. This straightforward approach can be applied generally to embryos across the animal kingdom. As confocal laser scans with high three-dimensional spatial resolution are widely applied, application of this method is not limited to symmetrical body shapes.
We have shown how to extract quantified gene expression profiles from Nematostella in situ hybridizations, and how a preliminary comparison and cluster analysis lead to new insights. The next step is to estimate parameters that describe interactions among genes in the Nematostella regulatory network. The powerful methods designed for parameter inference [4, 5, 20] and network analysis [21, 22] in Drosophila can now be applied to the standardized gene expression profiles of Nematostella vectensis and other model species in genetics.
Currently, a database of published images is processed into 1D arrays, annotated with roughly estimated development times based on comparison with high resolution micrographs. This comparison is very subjective, as our designation often differs from the developmental stage originally claimed. Moreover, the embryos change very subtly in the hollow sphere stage (Figure 2G) and between invagination and septa formation (Figure 2I), so timestamps are highly ambiguous. If a gene or a combination of genes is found with continuously changing expression patterns, we can derive labelling protocols to determine the exact developmental time. (Registration techniques like this have already been described and proved useful for Drosophila.)
Spatial gene expression quantification can be combined with modern quantitative polymerase chain reaction (qPCR) techniques . The absolute total amount of transcripts measured with qPCR coupled to the spatial distribution from quantified in situ images should enable the calculation of absolute local mRNA concentrations.
Availability and requirements
BioPreDyn (new bioinformatics methods and tools for data-driven predictive dynamic modelling in biotechnological applications)
Project home page
Linux, Windows, MacOS.
Any restrictions to use by non-academics
Li E, Davidson EH: Building developmental gene regulatory networks. Birth Defects Res C Embryo Today. 2009, 87: 123-130. 10.1002/bdrc.20152.
Chan TM, Longabaugh W, Bolouri H, Chen HL, Tseng WF, Chao CH, Jang TH, Lin YI, Hung SC, Wang HD, Yuh CH: Developmental gene regulatory networks in the zebrafish embryo. Biochim Biophys Acta. 2009, 1789: 279-298. 10.1016/j.bbagrm.2008.09.005.
de Jong H: Modeling and simulation of genetic regulatory systems: a literature review. J Comput Biol. 2002, 9: 67-103. 10.1089/10665270252833208.
Reinitz J, Sharp DH: Mechanism of eve stripe formation. Mech Dev. 1995, 49: 133-158. 10.1016/0925-4773(94)00310-J.
Jaeger J, Surkova S, Blagov M, Janssens H, Kosman D, Koslov KN, Myasnikova E, Vanario-Alonso CE, Samsonova M, Sharp DH, Reinitz J, Manu: Dynamic control of positional information in the early Drosophila embryo. Nature. 2004, 430: 368-371. 10.1038/nature02678.
Mjolsness E, Sharp DH, Reinitz J: A connectionist model of development. J Theor Biol. 1991, 152: 429-453. 10.1016/S0022-5193(05)80391-1.
Janssens H, Kosman D, Vanario-Alonso CE, Jaeger J, Samsonova M, Reinitz J: A high-throughput method for quantifying gene expression data from early Drosophila embryos. Dev Genes Evol. 2005, 215: 374-381. 10.1007/s00427-005-0484-y.
Darling JA, Reitzel AR, Burton PM, Mazza ME, Ryan JF, Sullivan JC, Finnerty JR: Rising starlet: the starlet sea anemone, Nematostella vectensis. BioEssays. 2005, 27: 211-221. 10.1002/bies.20181.
Putnam NH, Srivastava M, Hellsten U, Dirks B, Chapman J, Salamov A, Terry A, Shapiro H, Lindquist E, Kapitonov VV, Jurka J, Genikhovich G, Grigoriev IV, Lucas SM, Steele RE, Finnerty JR, Technau U, Martindale MQ, Rokhsar DS: Sea anemone genome reveals ancestral eumetazoan gene repertoire and genomic organization. Science. 2007, 317: 86-94. 10.1126/science.1139158.
Byrum CA, Martindale MQ: Gastrulation in the Cnidaria and the Ctenophora. Gastrulation: From Cells to Embryo. Edited by: Stern CA. 2004, Cold Spring Harbor: Cold Spring Harbor Laboratory Press, 33-50.
Lee PN, Kumburegama S, Marlow HQ, Martindale MQ, Wikramanayake AH: Asymmetric developmental potential along the animal-vegetal axis in the antozoan cnidarian, Nematostella vectensis, is mediated by Dishevelled. Dev Biol. 2007, 310: 169-186. 10.1016/j.ydbio.2007.05.040.
de Jong J: Quantitative analysis of gene expression in Nematostella vectensis. 2009, Section Computational Science: MSc thesis. University of Amsterdam
Magie CR, Daly M, Martindale MQ: Gastrulation in the cnidarian Nematostella vectensis occurs via invagination not ingression. Dev Biol. 2007, 305: 483-497. 10.1016/j.ydbio.2007.02.044.
Cnidarian Evolutionary Genomics Database.http://www.cnidbase.org/index.cgi,
Comparative Marine Invertebrate Gene Expression Database.http://www.kahikai.org/index.php?content=genes,
Extavour CG, Pang K, Matus DQ: Martindale MQ: vasa and nanos expression patterns in a sea anemone and the evolution of bilaterian germ cell specification mechanisms. Evol Dev. 2005, 7: 201-215. 10.1111/j.1525-142X.2005.05023.x.
MathWorks Documentation Center.http://www.mathworks.com/help/matlab/ref/interp1.htm,
MathWorks Documentation Center.http://www.mathworks.com/help/curvefit/smooth.htm,
Magie CR, Pang K, Martindale MQ: Genomic inventory and expression of Sox and Fox genes in the cnidarian Nematostella vectensis. Dev Genes Evol. 2005, 215: 618-630. 10.1007/s00427-005-0022-y.
Fomekong-Nanfack Y, Kaandorp JA, Blom J: Efficient parameter estimation for spatio-temporal models of pattern formation: case study of Drosophila melanogaster. Bioinformatics. 2007, 23: 3356-3363. 10.1093/bioinformatics/btm433.
Fomekong-Nanfack Y, Postma M, Kaandorp JA: Inferring Drosophila gap gene regulatory network: a parameter sensitivity and perturbation analysis. BMC Syst Biol. 2009, 3: 94-10.1186/1752-0509-3-94.
Fomekong-Nanfack Y, Postma M, Kaandorp JA: Inferring Drosophila gap gene regulatory network: pattern analysis of simulated gene expression profiles and stability analysis. BMC Res Notes. 2009, 2: 256-10.1186/1756-0500-2-256.
Wikramanayake AH, Hong M, Lee PN, Pang K, Byrum CA, Bince JM, Xu R, Martindale MQ: An ancient role for nuclear beta-catenin in the evolution of axial polarity and germ layer segregation. Nature. 2003, 426: 446-450. 10.1038/nature02113.
Welten MCM, de Haan SB, van den Boogert N, Noordermeer JN, Lamers GEM, Spaink HP, Meijer AH, Verbeek FJ: ZebraFISH: fluorescent in situ hybridization protocol and three-dimensional imaging of gene expression patterns. Zebrafish. 2006, 3: 465-476. 10.1089/zeb.2006.3.465.
Luengo Hendriks CL, Keränen SV, Fowlkes CC, Simirenko L, Weber GH, DePace AH, Henriquez C, Kaszuba DW, Hamann B, Eisen MB, Malik J, Sudar D, Biggin MD, Knowles DW: Three-dimensional morphology and gene expression in the Drosophila blastoderm at cellular resolution I: data acquisition pipeline. Genome Biol. 2006, 7: R123-10.1186/gb-2006-7-12-r123.
Flynn CJ, Sharma T, Ruffins SW, Guerra SL, Crowley JC, Ettensohn CA: High-resolution, three-dimensional mapping of gene expression using GeneExpressMap (GEM). Dev Biol. 2011, 357: 532-540. 10.1016/j.ydbio.2011.06.033.
Myasnikova E, Samsonova A, Kozlov K, Samsonona M, Reinitz J: Registration of the expression patterns of Drosophila segmentation genes by two independent methods. Bioinformatics. 2001, 17: 3-12. 10.1093/bioinformatics/17.1.3.
Heid CA, Stevens J, Livak KJ, Williams PM: Real time quantitative PCR. Genome Res. 1996, 6: 986-994. 10.1101/gr.6.10.986.
Fritzenwanker JH, Genikhovich G, Kraus Y, Technau U: Early development and axis specification in the sea anemone Nematostella vectensis. Dev Biol. 2007, 310: 264-279. 10.1016/j.ydbio.2007.07.029.
Mathworks Documentation Center.http://www.mathworks.com/help/bioinfo/ref/clustergram.html,
Daniel Botman was funded by the FP7 project BioPreDyn. We would like to thank Mark Martindale (Kewalo Marine Laboratory, University of Hawaii) for allowing us to use the pictures shown in Figure 3.
The authors declare that they have no competing interests.
All authors participated in the design of the study, and wrote the manuscript together. DB conducted the analysis of data and simulations, and made the figures. All authors read and approved the final manuscript.
Electronic supplementary material
Additional file 1:Matlab programs: “genexp.m” for cell layer decomposition of gene expression images and profile editing; “oneview.m” for a 2D gene expression array from a single decomposed image; “twoviews.m” for a 2D array from a decomposed image and perpendicular view. Supporting files: Matlab user interface “genexp.fig” (necessary to run “genexp.m”); directory “embryodata” containing geometry definitions; directory “functions” containing Matlab functions used in the main programs. Matlab script: “stanpro.m” for creating a standardized 1D array from a GENEXP profile figure. Manuals: “readme_genexp.m” for GENEXP program; “readme_oneview.m” for ONEVIEW program; “readme_twoviews.m” for TWOVIEWS program. (TAR 290 KB)
Authors’ original submitted files for images
Below are the links to the authors’ original submitted files for images.