- Research note
- Open Access
Modelling steroidogenesis: a framework model to support hypothesis generation and testing across endocrine studies
BMC Research Notesvolume 11, Article number: 252 (2018)
Steroid hormones are responsible for the control of a wide range of physiological processes such as development, growth, reproduction, metabolism, and aging. Because of the variety of enzymes, substrates and products that take part in steroidogenesis and the compartmentalisation of its constituent reactions, it is a complex process to visualise and document. One of the goals of systems biology is to quantitatively describe the behaviour of complex biological systems that involve the interaction of many components. This can be done by representing these interactions visually in a pathway model and then optionally constructing a mathematical model of the interactions.
We have used the modified Edinburgh Pathway Notation to construct a framework diagram describing human steroidogenic pathways, which will be of use to endocrinologists. To demonstrate further utility, we show how such models can be parameterised with empirical data within the software Graphia Professional, to recapitulate specific examples of steroid hormone production, and also to mimic gene knockout. These framework models support in silico hypothesis generation and testing with utility across endocrine endpoints, with significant potential to reduce costs, time and animal numbers, whilst informing the design of planned studies.
Steroid hormones are responsible for the control of a wide range of physiological processes such as development, growth, reproduction, metabolism, and aging. Mammalian steroidogenesis uses cholesterol as a starting substrate to produce the steroid hormone classes of androgens, estrogens, progestogens, corticosteroids and mineralocorticoids. The initial conversion of cholesterol to the major biologically active steroids takes place primarily in the gonads, adrenals and placenta, but further specific interconversions can take place in peripheral tissues due to local expression of other enzymes (reviewed in ).
Because of the variety and location of the components that take part in steroidogenesis it is a complex process to visualise and document. However, over the years many attempts have been made, and searching for ‘steroidogenesis’ in Google images provides examples of these results. Some diagrams are incomplete and focus only on the steroid products without representing the enzymes that produce them; others focus only on the products of one particular organ. Steroidogenic enzymes often have more than one name or symbol (for example the enzyme cytochrome p450 side-chain cleavage is often referred to as p450scc in older literature, but has the gene symbol CYP11A1 in humans). Many steroidogenesis diagrams often use non-standard nomenclature, or interchange gene names across species when the gene is specific to one species. Almost all pathway diagrams of steroidogenesis have no way of incorporating references into the diagram reducing their value as a learning resource for the end user.
One of the goals of systems biology is to quantitatively describe the behaviour of complex biological systems that involve the interaction of many components. This can be done by representing these interactions visually in a pathway model and then optionally constructing a mathematical model of the interactions. Steroidogenesis would greatly benefit from a formalised graphical documentation using a standard notation with links to original research articles and chemical structures of steroid intermediates. If the diagram could be parameterised to form a dynamic mathematical model it could potentially be used to predict what steroid pathways would be active and what products could be made by tissues expressing a particular combination of steroidogenic enzymes.
In this paper, we have used the modified Edinburgh pathway notation (mEPN) to construct a framework diagram describing human steroidogenic pathways, which will be of use to endocrinologists. To demonstrate further utility, we show how such models can be parameterised with empirical data within the software Graphia Professional , to recapitulate specific examples of steroid hormone production, and also to mimic gene knockout. These framework models support in silico hypothesis generation and testing with utility across endocrine endpoints, with significant potential to reduce costs, time and animal numbers, whilst informing the design of planned studies.
Construction of pathway models of steroidogenesis using the mEPN notation
A model of human steroidogenesis is presented in Fig. 1, representing a framework of the reactions that produce biologically active steroids under normal conditions. Diagrams were compiled using the modified Edinburgh pathway notation (mEPN) using the network editing software yED (http://www.yworks.com). The editable version of this diagram is available as a ‘.graphml’ file that can be opened in yED (Additional file 1). Steroid and enzyme interaction information was obtained from Miller et al.  and associated references. Active pathways in different tissues (namely the adrenal reticularis, glomerulosa and fasciculata, the testis, ovary, prostate and placenta, Additional file 2) are highlighted with red boxes to easily visualise the steroidogenic reactions that take place in each tissue.
mEPN is based on the principles of ‘process diagrams’ and is designed to be unambiguous yet concise . A detailed protocol of how to edit mEPN diagrams has been recently published . Both the biological entities (such as proteins or steroids) and the way they interact with each other (such as phosphorylation or dimerisation) are represented as components in the pathway. Small biochemicals such as steroids are represented by hexagons, all proteins (enzymes) by rounded rectangles and all genes by parallelograms. Black rectangles allow for parameterisation of models whereby initial token input on nodes feeding into the pathway can be defined (Fig. 1b). It can be expanded to produce large, clear and informative pathway models [5, 6].
We have chosen to label steroids and their intermediates based on names in common use in the biological community but have also linked nodes to the Chemspider database (http://www.chemspider.com). This provides a reference to the exact biochemical structure a molecule represents and gives alternate names. The link can be opened by pressing the F8 key when the node is highlighted in yED (Fig. 1c). There are number of naming conventions for biochemical molecules. The comprehensive diagram of the major steroidogenic pathways (Fig. 1a) contains nodes to represent both the gene and the protein produced for each enzyme isoform, and therefore each gene node is linked to Ensembl (http://www.ensembl.org) .
Construction and parameterisation for dynamic flow of a cell-specific model of rat Leydig cell steroidogenesis using previously-published experimental data
Whilst the formalised framework diagram has utility as a resource in its own right, the ability to parameterise the pathway and run and test simulations immeasurably adds to its overall value. We constructed and parametrised a specific model of rat Leydig cell steroidogenesis during postnatal development and adulthood (Fig. 2) focussing on the specific reactions that Leydig cells use to produce their main steroid product: androgens. The simplified Leydig cell version (Fig. 2a) uses a single protein node to represent all isoforms of a particular enzyme and so an Ensembl link is not included. The editable version of the simplified diagram is also available as a ‘.graphml’ file that can be opened in yED, (Additional file 3).
Parameterisation of pathway diagrams constructed in yED to run as a signalling Petri nets (SPNs) in the software Graphia Professional is a logical process and no formal training in mathematical modelling is necessary for the user. A detailed description of how to parameterise yED diagrams so that they can be run as SPNs in Graphia Professional (Kajeka, Edinburgh, UK, formerly BioLayout Express3D)  can be found in Livigni et al. . By representing a complex network as a Petri net, the SPN method models signal flow as the pattern of token accumulation at protein nodes over time.
Parameterisation of the cell-specific diagram was achieved using previously published enzyme activity data measured at three different stages of rat Leydig cell maturation . In yED, tokens representing the enzyme activity in pmol/minute/million cells were added to the arrows connecting the black input nodes of each of the steroidogenic enzyme nodes using the notation a–b,c;d–e,f where ‘a–b’ are the first and last time blocks that you would like the number of tokens ‘c’ to be added to the model and ‘d–e’ are the first and last time blocks that you would like the number of tokens ‘f’ to be added to the model (and so on for the number of variable inputs required), as shown in Fig. 2a. The first 20 time blocks represent the ‘progenitor’ Leydig cell stage present at around postnatal day (pnd) 21 in the rat. Time blocks 21–50 represent the ‘immature’ Leydig cell stage present at around pnd 35 and time blocks 51–100 represent the mature adult Leydig cell that constitute all of the Leydig cells in the testis from pnd 90 onwards. The variation in enzyme activity at the three stages is illustrated by the black bar graphs next to the enzyme input nodes in Fig. 2a. The editable version of the parameterised diagram is available as a ‘.graphml’ file that can be opened in yED (Additional file 4).
The parameterised diagram was run as a SPN in Graphia Professional over 100 time blocks, 500 runs and with standard normal stochastic distribution and consumptive transitions . The token flow was visualised as an animation seen in Additional file 5. Figure 2b shows a screenshot of the animated output of the token flow at each of the three stages (screenshots taken at time block 20 representing progenitor, 50 representing immature and 100 representing adult Leydig cell stages). This provides an overview of all of the nodes in the diagram and helps visualise the pathways of token flow. Two nodes were selected for specific visualisation in Fig. 2c: testosterone and 3α-androstanediol (‘3α-diol’). If these outputs are taken as a prediction of the relative production rate of these two steroids at immature and adult Leydig cell stages, we would predict that 3α-diol is more abundant than testosterone in immature Leydig cells and that testosterone is more abundant than 3α-diol in adult Leydig cells. This prediction of the model is consistent with previous experimental measurement , showing that our model appropriately recapitulates the in vivo situation and thus has utility for hypothesis testing.
Using the model to predict steroid production in Hsd17b3 deficiency
Enzymes of the 17-beta hydroxysteroid dehydrogenase class catalyse the conversion between 17-keto and 17-hydroxy-steroids. Different isoforms of the enzyme are expressed in different steroidogenic tissues. 17-beta hydroxysteroid dehydrogenase type 3 is the isoform expressed by Leydig cells in humans (HSD17B3), mice and rats (Hsd17b3)  and preferentially catalyses the conversion of androstenedione to testosterone and androstanedione to DHT. Male humans with mutations in HSD17B3 present with varying degrees of physiological undervirilisation and plasma androstenedione levels at the time of puberty are usually ten times normal levels . To demonstrate the predictive power of our in silico model, we mimicked a loss of function mutation in Hsd17b3 by removing the token input from the HSD17B node of the rat Leydig cell model (Fig. 3a, b), and re-ran the simulation. This time tokens accumulated at the androstenedione node, with no testosterone produced, consistent with the circulating androgen profile observed in patients with a loss of function of HSD17B3 (Fig. 3c). When visualised as a graph (Fig. 3d), androstenedione production is shown to increase as Leydig cells mature during postnatal life, whereas no testosterone is produced. This simple demonstration shows the utility of the model for predicting outcomes of genetic or pharmacological manipulations before beginning any laboratory or in vivo work, and has the potential to be scaled with multiple knockouts modelled simultaneously.
The system we describe here presents many significant advantages over previous modelling systems. The software is free and readily available, and is supported by two recent publications explaining the underlying methodology , and specific protocols that describe the editing of existing models and construction of new models. In its graphical form within yED, specific nodes within a diagram can be hyperlinked to publications describing the supporting evidence and to references to correct gene and chemical nomenclature. As such the diagram can represent a visual bibliography of known interactions and supporting data, and we have found this to be a much-appreciated resource by anyone grappling to conceptualise the complexities of the endocrine system.
The true power of the framework diagram is revealed when combined with stochastic modelling within Graphia Professional. Traditionally, systems dynamics are described using continuous deterministic mathematical models, which assume that the system has no unpredictability and that the precise behaviour of its components is entirely pre-determined. However, biological systems are intrinsically stochastic and there is evidence that stochasticity is advantageous . In this case, Petri nets, which are a mathematical modelling language for the description of distributed systems, allow for the study of dynamics without the need to have detailed information on the kinetics. It also means that the system is significantly more ‘biologist friendly’ than mathematical modelling through ordinary differential equations.
The system can be used where information is missing, as it is possible to substitute a single arrow (edge) to represent an uncharacterised event between two established known molecules, which permits modelling to continue without possession of all information. Thus, some areas of the model may be incredibly detailed, whilst others are described in less detail. This may identify areas and components that are missing, but must be necessary, thereby focussing hypothesis generation and laboratory experiments in these key locations to refine understanding.
In conclusion, the development of this framework model of steroidogenesis using free software to edit and construct new models will support in silico hypothesis generation and testing across many endocrine endpoints. Use of this system has significant potential to reduce costs, time and animal numbers, whilst informing the design of planned studies.
- 17α-OH DHP:
cytochrome p450 side-chain cleavage
17β-hydroxysteroid dehydrogenase (type 3)
modified Edinburgh pathway notation
signalling Petri net
Miller WL, Auchus RJ. The molecular biology, biochemistry, and physiology of human steroidogenesis and its disorders. Endocr Rev. 2011;32(1):81–151.
O’Hara L, et al. Modelling the structure and dynamics of biological pathways. PLoS Biol. 2016;14(8):e1002530.
Freeman TC, et al. The mEPN scheme: an intuitive and flexible graphical system for rendering biological pathways. BMC Syst Biol. 2010;4:65.
Livigni A, et al. A graphical and computational modeling platform for biological pathways. Nat Protoc. 2018;13(4):705–22.
Raza S, et al. Construction of a large scale integrated map of macrophage pathogen recognition and effector systems. BMC Syst Biol. 2010;4:63.
Raza S, et al. A logic-based diagram of signalling pathways central to macrophage activation. BMC Syst Biol. 2008;2:36.
Aken BL, et al. The Ensembl gene annotation system. Database (Oxford). 2016. https://doi.org/10.1093/database/baw093.
Ge RS, Hardy MP. Variation in the end products of androgen biosynthesis and metabolism during postnatal differentiation of rat Leydig cells. Endocrinology. 1998;139(9):3787–95.
Tsai-Morris CH, et al. The rat 17beta-hydroxysteroid dehydrogenase type III: molecular cloning and gonadotropin regulation. Endocrinology. 1999;140(8):3534–42.
Geissler WM, et al. Male pseudohermaphroditism caused by mutations of testicular 17 beta-hydroxysteroid dehydrogenase 3. Nat Genet. 1994;7(1):34–9.
Heams T. Randomness in biology. Math Struct Comput Sci. 2014;24(3):e240308.
Conception or design of the work: LO, PJO, TCF and LBS, Data analysis and interpretation: LO, LBS. Drafting and approval of final article: LO, PJO, TCF, LBS. All authors read and approved the final manuscript.
There is now a commercial and supported version of BioLayout Express3D called Graphia Professional, produced by Kajeka Ltd., (Edinburgh, UK) that possesses all the functionality described here for pathway modeling. T.C.F. is a founder and director of Kajeka. The other authors declare no competing interests.
Availability of data and materials
All data generated or analysed during this study are included in this published article (and its Additional files).
Consent for publication
Ethics approval and consent to participate
This work was funded by BBSRC Project Grant awards (BB/J015105/1: to LBS, TCF and PJO); (BB/N007026/1: to LBS, LO and TCF) and a Medical Research Council Programme Grant award (MR/N002970/1: to LBS).
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.