Identification of a novel first exon and transcription start sites of the murine ghrelin gene
It has previously been reported that the mouse ghrelin gene consists of four coding exons (exons 1 to 4) and a short, non-coding 19 bp first exon [3], which we have termed exon 0. To determine whether additional first exons and transcription start sites are present, 5' RACE (rapid amplification of 5' complementary DNA ends) was performed with exon 1-specific reverse primers and a RACE-ready panel of anchored cDNA libraries derived from 24 mouse tissues (OriGene, Rockville, MD). A list of the exons and exon-intron boundaries of ghrelin locus-derived transcripts identified in this and previous studies is given in [Additional file 1].
Sequencing of clones from the murine stomach, lung, and 9.5-day embryo revealed two transcription start sites (TSSs) in exon 0 (Fig. 1), giving rise to exon 0 variants of 48 bp [GenBank:FJ355944] and 19 bp [GenBank:FJ355945]. The 19 bp exon 0 (exon 0a) is identical to a sequence previously reported in the mouse stomach [3], while the 48 bp exon 0 (exon 0b) is flanked by a CAGE tag starting site (CTSS) (Fig. 1). CTSSs are obtained by large-scale sequencing of concatemers derived from the 5' ends of capped mRNA and indicate the transcription start site [4].
In our earlier report we noted that a region immediately upstream of exon 1 (intron 0) of the ghrelin gene is highly conserved between the mouse and human orthologues [2]. This suggested that transcription of the murine ghrelin gene could also be initiated from within intron 0. We isolated murine 5' RACE clones that contained extended exon 1 sequence (Fig. 1). These clones were isolated from 9.5-day embryo, 12.5-day embryo, adult lung, and adult adrenal gland [GenBank:FJ355940–FJ355943] and correspond to four transcription start sites within intron 0 of the mouse ghrelin gene. Interestingly, an extended exon 1 that contains 61 bp of intron 0 sequence (exon 1e) is present in a conserved region immediately upstream of exon 1 [2] and is flanked by the previously reported human transcription start site [5] (Fig. 2).
In a recent study we also identified a human ghrelin exon (exon -1) 2.6 kb upstream of the preproghrelin translation start site in exon 1 [2]. These human exon -1-derived ghrelin transcripts contain a putative secretion signal peptide, which is not present in the rodent sequence, and may give rise to novel peptides [2]. Mouse ghrelin appears to lack exon -1, as we were unable to identify murine exon -1 ghrelin sequence using 5' RACE and RT-PCR (data not shown). Promoter sequences are known to evolve rapidly and show a high rate of sequence turnover [6, 7]; thus, it is not surprising that the murine region corresponding to human exon -1 is not a functional ghrelin gene exon.
The extended exon 1 of Ghrl is transcribed and gives rise to full-length preproghrelin transcripts
While 5' RACE data [5] and sequence from a full-length cDNA clone (derived from primary cultures of cystic fibrosis lung epithelial cells) [GenBank:BM982194] demonstrate the existence of human full-length preproghrelin transcripts with an extended exon 1, there has previously been no evidence that this region is transcribed in the mouse. To verify the existence of the murine extended exon 1 we performed RT-PCR experiments (Fig. 3). RT-PCRs employing a sense primer in the extended exon 1 and an antisense primer in exon 1 confirmed that the extended exon 1 is transcribed and that the product is not a result of genomic or exogenous contamination (Fig. 3B). Next, we obtained an amplicon spanning the extended exon 1 to exon 4 in the liver and spleen [GenBank:FJ914266], demonstrating experimentally that the extended exon 1 can give rise to full-length preproghrelin transcripts (Fig. 3C).
It was previously reported that the murine ghrelin gene consists of five exons and a single transcription start site in a short, 19 bp exon 0 [3]. Our results demonstrate that the murine gene has at least two alternative first exons (exon 0 and the extended exon 1) (Fig. 4), with multiple transcription start sites within each first exon region, which is typical of broad-type promoters [8].
Ghrelin transcripts with an extended exon 1 are highly expressed in a limited number of tissues
We also investigated the distribution of extended exon 1-containing Ghrl transcripts by transcript profiling in 36 normal mouse tissues. This analysis indicated that the extended exon 1 species are predominantly expressed in the spleen, adrenal gland, stomach, skin, adipose tissue, and epididymis (Fig. 5). Interestingly, the human extended exon 1 is also differentially expressed, with equal levels of the 20 bp exon 0 and the extended exon 1 in the stomach [5]. In contrast, very high levels of the extended exon 1 and almost undetectable levels of the 20 bp exon 0 have been reported in the human thyroid medullary carcinoma TT cell line [5]. In both humans and rodents, total ghrelin mRNA levels are very high in the stomach, while much lower levels are found in other tissues, including the adrenal gland [9, 10], suggesting that expression of the extended exon 1 may be developmental stage-, cell- and tissue-specific.
A possible role for 5' variant exons of preproghrelin transcripts in translational control
As observed previously for exon 0 of the human ghrelin gene [2], several very short upstream open reading frames (uORFs) are present in the extended exon 1 and exon 0 sequence (data not shown). Upstream open reading frames, mRNA secondary structure and other motifs in 5' untranslated exons have been shown to regulate the translation of developmental genes [11]. Interestingly, it has been reported that ghrelin mRNA and protein levels do not directly correlate in the rat [12]. We hypothesise that this may be caused by the transcription of different first Ghrl exons that confer different preproghrelin translation efficiencies. Such a mechanism is exemplified in the chicken embryo, where transcripts harbouring uORFs allow low-level translation of proinsulin, whereas a higher level of proinsulin expression is achieved in the adult pancreas by transcription of mRNAs from a downstream first exon devoid of uORFs [13]. Moreover, alternative 5' untranslated exons can have an mRNA secondary structure that restrains translation, particularly if a hairpin occurs close to the 5' cap, which is the ribosomal entry site [14]. The short 19 bp exon 0, which we and others [3] have described, is devoid of upstream open reading frames and stable secondary structure. Transcripts containing this exon could therefore be translated more efficiently than, for example, those containing the 233 bp 5' extended exon 1 (exon 1e) that contains a 266 bp 5' untranslated region (see [Additional file 2]).
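To illustrate how uORFs in a 5' leader can be catalogued, the following minimal Python sketch scans a sequence for ATG codons that are followed by an in-frame stop codon within the leader. It is an illustration only and not the procedure used in this study: the function name `find_uorfs`, the `min_codons` parameter and the toy input sequence are hypothetical, and the sketch assumes that a uORF is defined as an ATG-initiated frame terminating inside the leader (frames that run on into the main coding sequence are not reported).

```python
# Minimal uORF scan over a 5' leader sequence.
# Assumption: a uORF starts at an ATG and ends at the first in-frame stop
# codon inside the leader. All names here are illustrative only.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_uorfs(leader, min_codons=1):
    """Return a list of (start, end, n_codons) tuples.

    start/end are 0-based and end-exclusive (end includes the stop codon);
    n_codons counts the ATG and following sense codons, not the stop.
    """
    seq = leader.upper().replace("U", "T")  # accept RNA or DNA input
    uorfs = []
    for start in range(len(seq) - 2):
        if seq[start:start + 3] != "ATG":
            continue
        # Walk downstream codon by codon in the frame set by this ATG.
        for pos in range(start + 3, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                n_codons = (pos - start) // 3
                if n_codons >= min_codons:
                    uorfs.append((start, pos + 3, n_codons))
                break  # the first in-frame stop closes this uORF
    return uorfs


# Toy leader (not the Ghrl sequence): ATG AAA TAG beginning at position 2.
print(find_uorfs("GGATGAAATAGCC"))  # [(2, 11, 2)]
```

Applied to the exon 0 and extended exon 1 sequences deposited in GenBank, a scan of this kind would list candidate uORFs for comparison between the alternative first exons. It does not assess start codon context or mRNA secondary structure, both of which also influence translation efficiency.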