Characteristic differences between the promoters of intron-containing and intronless ribosomal protein genes in yeast
© Roepcke et al; licensee BioMed Central Ltd. 2008
Received: 27 May 2008
Accepted: 29 October 2008
Published: 29 October 2008
More than two thirds of the highly expressed ribosomal protein (RP) genes in Saccharomyces cerevisiae contain introns, in sharp contrast to the five percent of genes that contain introns genome-wide. It is well established that introns carry regulatory sequences and that the transcription of RP genes is extensively and coordinately regulated. Here we test the hypotheses that introns are innately associated with heavily transcribed genes and that introns of RP genes contribute regulatory transcription factor (TF) binding sequences. Moreover, we investigate whether promoter features differ significantly between intron-containing and intronless RP genes.
We find that directly measured transcription rates tend to be lower for intron-containing than for intronless RP genes. We do not observe any specifically enriched sequence motifs in the introns of RP genes other than those of the branch point and the two splice sites. Comparing the promoters of intron-containing and intronless RP genes, we detect differences in the number and position of Rap1-binding and IFHL motifs. Moreover, the analysis of the length distribution and the folding free energies of the 5' untranslated sequences suggests that, at least in a sub-population of RP genes, these sequences are optimized for regulatory function.
Our results argue against the direct involvement of introns in the regulation of transcription of highly expressed genes. Moreover, systematic differences in motif distributions suggest that RP transcription factors may act differently on intron-containing and intronless gene promoters. Thus, our findings contribute to the decoding of the RP promoter architecture and may fuel the discussion on the evolution of introns.
Hypothesis and Work Plan
In this study, we investigate three hypotheses. First, introns are innately associated with heavily transcribed genes. Second, introns of RP genes carry regulatory TF binding motifs [2–4]. Third, promoter features such as Rap1 binding sites or the GC base profile differ significantly between intron-containing and intronless RP genes. To this end, we construct three promoter sets: intron-containing RP genes, intronless RP genes, and lowly expressed intron-containing non-RP genes [6–10]. We compare mRNA expression levels, transcription rates, 5'UTRs, and base compositions around the TSS. Additionally, we scan the promoter sequences for potential binding sites of several transcription factors and investigate their frequencies and localizations relative to the TSS. Finally, we test the effects of the identified promoter features on RP gene expression by linear regression analysis. For further background, see Additional file 1.
Gene Structure, Expression and Transcription Rate
To position potential regulatory motifs more accurately, we incorporate transcription start site (TSS) predictions derived from 5'SAGE experiments. In a recent study, TSSs were determined for the majority of yeast genes, and there is good concordance between the results of the two studies [9, 12; see Additional file 1]. We use the predictions of the 5'SAGE study throughout this work. For 90 of the 100 intron-containing and 33 of the 37 intronless RP genes, we find TSS predictions in this data set, and we restrict further analyses to this subset of 123 genes. Traditionally, studies of the relative localization of transcription factor binding sites (TFBS) take the translation start codon ATG as a surrogate for the TSS, which can be rather inaccurate, especially for genes that contain an intron in their 5'UTR (leader intron). To contrast our results for the RP genes, we select an additional set of 35 lowly expressed intron-containing genes that are also present in the 5'SAGE data set [Fig. 1B, see Additional file 2].
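To illustrate why the choice of reference point matters, the following minimal Python sketch converts a genomic motif coordinate into a signed offset from a reference position on either strand. The function name `relative_position` and all coordinates are hypothetical and chosen only for illustration; they are not taken from the 5'SAGE data.

```python
def relative_position(site, reference, strand):
    """Signed offset of a site from a reference point (TSS or ATG).

    Negative values lie upstream in the direction of transcription.
    All coordinates are genomic positions on the forward strand.
    """
    if strand == "+":
        return site - reference
    if strand == "-":
        return reference - site
    raise ValueError("strand must be '+' or '-'")


# Hypothetical plus-strand gene with a leader intron: the ATG lies
# 400 bp downstream of the TSS on the genome (invented numbers).
tss, atg, motif = 10_000, 10_400, 9_750

print(relative_position(motif, tss, "+"))  # offset from the TSS
print(relative_position(motif, atg, "+"))  # offset from the ATG surrogate
```

In this invented case the same site lies 250 bp upstream of the TSS but 650 bp upstream of the ATG, so a window defined relative to the ATG would place it in a different promoter region.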
Distribution of Rap1 Binding Motifs
For factors that are known to regulate RP gene transcription and for those that have been predicted by genome-scale experiments, we select position weight matrices (PWM) to represent the binding specificity. Using T-Reg, we scan the region from 600 bp upstream to 600 bp downstream of the TSS of every gene in our three sets for potential binding sites [see Additional files 1, 3, 4, 5, 6, 7, 8, 9 for methods and for more findings].
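The general idea of a PWM scan can be sketched in a few lines of Python. This is not T-Reg and makes no claim about its scoring scheme; it is a generic log-odds scan under stated assumptions: a uniform background of 0.25 per base (yeast promoters are AT-rich, so a real analysis would likely use a different background), a pseudocount of 1, and a toy three-column count matrix that does not represent Rap1 or any other real factor. The names `log_odds_matrix` and `scan` are hypothetical.

```python
import math

BASES = "ACGT"
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def log_odds_matrix(counts, pseudocount=1.0, background=0.25):
    """Turn per-position base counts into log2 odds scores."""
    matrix = []
    for column in counts:
        total = sum(column.get(b, 0) for b in BASES) + 4 * pseudocount
        matrix.append({
            b: math.log2((column.get(b, 0) + pseudocount) / total / background)
            for b in BASES
        })
    return matrix


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def scan(sequence, matrix, tss_index, threshold):
    """Score every window on both strands.

    Returns (window start minus tss_index, strand, score) for each
    window whose score reaches the threshold.
    """
    width = len(matrix)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(base not in BASES for base in window):
            continue  # skip windows containing ambiguous bases
        for strand, site in (("+", window), ("-", reverse_complement(window))):
            score = sum(matrix[i][base] for i, base in enumerate(site))
            if score >= threshold:
                hits.append((start - tss_index, strand, score))
    return hits


# Toy matrix favouring the word ACG (illustrative only).
toy_pwm = log_odds_matrix([{"A": 8}, {"C": 8}, {"G": 8}])
hits = scan("TTTTACGATTT", toy_pwm, tss_index=8, threshold=4.0)
print(hits)
```

For a region spanning 600 bp on either side of the TSS, `tss_index` would be 600, so that reported offsets are directly comparable between genes.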
Distribution of IFHL
In contrast to the Fhl1 and Sfp1 motifs (see Additional file 1 for details), the IFHL motif is distributed quite differently in the two RP promoter sets and is barely present in the promoters of lowly transcribed genes (Additional file 9). We identify 69 instances in 40 intron-containing genes between positions -400 and -150 (Additional files 1, 9). Nine intronless genes contain IFHL motifs, five of which are located in upstream promoter regions comparable to those of intron-containing genes (positions -400 to -150). The IFHL motif is preferentially located within 50 bp downstream of the Rap1 sites, and the two motifs sometimes overlap, in accordance with previous results. Furthermore, in the upstream region of 24 intron-containing genes, the IFHL motif occurs in duplicate within a distance of less than 100 bp (Tab. 1 in Additional file 1). Apart from the differences in the positioning of the Rap1 motifs mentioned above, the distribution of IFHL displays the most pronounced differences between intron-containing and intronless RP genes (Chi-squared test p-value: 0.04124).
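For readers unfamiliar with the test, a Pearson chi-squared test on a 2x2 table (genes with and without a motif in two gene sets) can be sketched as follows. The text does not state which contingency table or which variant of the test underlies the reported p-value, so this sketch is not a reconstruction of that result. The function name `chi_squared_2x2` is hypothetical, no continuity correction is applied, and the counts in the usage lines are placeholders rather than data from this study.

```python
import math


def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared test for the table [[a, b], [c, d]].

    Returns (statistic, p_value) with one degree of freedom and
    no continuity correction.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column must have a non-zero total")
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For one degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Placeholder counts: rows are two gene sets, columns are
# "motif present" and "motif absent".
statistic, p_value = chi_squared_2x2(30, 20, 10, 25)
print(statistic, p_value)
```

With small expected counts in any cell, an exact test such as Fisher's would usually be preferred over this approximation.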
Two findings of our analysis argue against the direct involvement of introns in the regulation of transcription of the highly expressed group of ribosomal protein genes. First, we show that introns are not necessary for RP genes in yeast to be heavily transcribed. Second, introns of RP genes are not enriched in binding motifs of known or putative RP transcription factors. Furthermore, we test the effect of promoter features on expression level and transcription rate by linear regression analysis. This is important because at present we cannot explain the large variety of transcription rates and expression levels among the highly and coordinately expressed RP genes. We find that the most significant feature is the presence of introns for the transcription rate and the folding free energy of the 5'-terminal sequence for the expression level. Our results help to decipher the RP promoter architecture and are a step towards predicting transcription rates from the presence and strength of sequence features.
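The kind of regression referred to above can be illustrated with a single binary feature. The sketch below fits an ordinary least-squares line of a response on one predictor; with a 0/1 predictor such as intron presence, the slope equals the difference between the two group means. This is a simplification under stated assumptions: the analysis described here involves several promoter features, the function name `fit_simple_regression` is hypothetical, and the numbers in the usage lines are invented, not measured transcription rates.

```python
def fit_simple_regression(x, y):
    """Ordinary least squares for y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must not be constant")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    r_squared = 1.0 - ss_res / ss_tot if ss_tot > 0 else 0.0
    return slope, intercept, r_squared


# Invented data: 1 = intron-containing, 0 = intronless;
# the response is a transcription rate in arbitrary units.
has_intron = [0, 0, 1, 1]
rate = [4.0, 6.0, 1.0, 3.0]
slope, intercept, r_squared = fit_simple_regression(has_intron, rate)
print(slope, intercept, r_squared)
```

Extending this to several features at once requires multiple regression, for which a numerical library would normally be used instead of hand-written sums.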
J. Zhang acknowledges support received from the National Natural Science Foundation of China (30360027).
- Warner JR, Vilardell J, Sohn JH: Economics of ribosome biosynthesis. Cold Spring Harb Symp Quant Biol. 2001, 66: 567-574. 10.1101/sqb.2001.66.567.
- Bhattacharyya N, Banerjee D: Transcriptional regulatory sequences within the first intron of the chicken apolipoproteinAI (apoAI) gene. Gene. 1999, 234 (2): 371-380. 10.1016/S0378-1119(99)00183-3.
- Chen J, Hayes P, Roy K, Sirotnak FM: Two promoters regulate transcription of the mouse folylpolyglutamate synthetase gene three tightly clustered Sp1 sites within the first intron markedly enhance activity of promoter B. Gene. 2000, 242 (1–2): 257-264. 10.1016/S0378-1119(99)00507-7.
- Wenz P, Schwank S, Hoja U, Schuller HJ: A downstream regulatory element located within the coding sequence mediates autoregulated expression of the yeast fatty acid synthase gene FAS2 by the FAS1 gene product. Nucleic Acids Res. 2001, 29 (22): 4625-4632. 10.1093/nar/29.22.4625.
- Lascaris RF, Groot E, Hoen PB, Mager WH, Planta RJ: Different roles for abf1p and a T-rich promoter element in nucleosome organization of the yeast RPS28A gene. Nucleic Acids Res. 2000, 28 (6): 1390-1396. 10.1093/nar/28.6.1390.
- Clark TA, Sugnet CW, Ares M: Genomewide analysis of mRNA processing in yeast using splicing-specific microarrays. Science. 2002, 296 (5569): 907-910. 10.1126/science.1069415.
- Spingola M, Grate L, Haussler D, Ares M: Genome-wide bioinformatic and molecular analysis of introns in Saccharomyces cerevisiae. RNA. 1999, 5 (2): 221-234. 10.1017/S1355838299981682.
- Planta RJ, Mager WH: The list of cytoplasmic ribosomal proteins of Saccharomyces cerevisiae. Yeast. 1998, 14 (5): 471-477. 10.1002/(SICI)1097-0061(19980330)14:5<471::AID-YEA241>3.0.CO;2-U.
- Miura F, Kawaguchi N, Sese J, Toyoda A, Hattori M, Morishita S, Ito T: A large-scale full-length cDNA analysis to explore the budding yeast transcriptome. Proc Natl Acad Sci USA. 2006, 103 (47): 17846-17851. 10.1073/pnas.0605645103.
- Nakao A, Yoshihama M, Kenmochi N: RPG: the Ribosomal Protein Gene database. Nucleic Acids Res. 2004, 32 (Database issue): D168-170. 10.1093/nar/gkh004.
- Garcia-Martinez J, Aranda A, Perez-Ortin JE: Genomic run-on evaluates transcription rates for all yeast genes and identifies gene regulatory mechanisms. Mol Cell. 2004, 15 (2): 303-313. 10.1016/j.molcel.2004.06.004.
- Zhang Z, Dietrich FS: Mapping of transcription start sites in Saccharomyces cerevisiae using 5' SAGE. Nucleic Acids Res. 2005, 33 (9): 2838-2851. 10.1093/nar/gki583.
- Zhang J, Hu J, Shi XF, Cao H, Liu WB: Detection of potential positive regulatory motifs of transcription in yeast introns by comparative analysis of oligonucleotide frequencies. Comput Biol Chem. 2003, 27 (4–5): 497-506. 10.1016/j.compbiolchem.2003.09.005.
- Hofacker IL: Vienna RNA secondary structure server. Nucleic Acids Res. 2003, 31 (13): 3429-3431. 10.1093/nar/gkg599.
- Ringner M, Krogh M: Folding free energies of 5'-UTRs impact post-transcriptional regulation on a genomic scale in yeast. PLoS Comput Biol. 2005, 1 (7): e72. 10.1371/journal.pcbi.0010072.
- Lascaris RF, Mager WH, Planta RJ: DNA-binding requirements of the yeast protein Rap1p as selected in silico from ribosomal protein gene promoter sequences. Bioinformatics. 1999, 15 (4): 267-277. 10.1093/bioinformatics/15.4.267.
- Beer MA, Tavazoie S: Predicting gene expression from sequence. Cell. 2004, 117 (2): 185-198. 10.1016/S0092-8674(04)00304-6.
- Wade JT, Hall DB, Struhl K: The transcription factor Ifh1 is a key regulator of yeast ribosomal protein genes. Nature. 2004, 432 (7020): 1054-1058. 10.1038/nature03175.