The complete chloroplast genome sequence of Brachypodium distachyon: sequence comparison and phylogenetic analysis of eight grass plastomes
© Bortiri et al; licensee BioMed Central Ltd. 2008
Received: 05 May 2008
Accepted: 31 July 2008
Published: 31 July 2008
Wheat, barley, and rye, of tribe Triticeae in the Poaceae, are among the most important crops worldwide, but they present many challenges to genomics-aided crop improvement. Brachypodium distachyon, a close relative of those cereals, has recently emerged as a model for grass functional genomics. Sequencing of the nuclear and organelle genomes of Brachypodium is one of the first steps towards making this species available as a tool for researchers interested in cereal biology.
The chloroplast genome of Brachypodium distachyon was sequenced by a combined approach using BAC end and shotgun sequences derived from a selected BAC containing the entire chloroplast genome. Comparative analysis indicated that the chloroplast genome is conserved in gene number and organization with respect to those of other cereals. However, several Brachypodium genes evolve at a faster rate than those in other grasses. Sequence analysis revealed that rice and wheat have a ~2.1 kb deletion in their plastid genomes and that this deletion must have occurred independently in the two species.
We demonstrate that BAC libraries can be used to sequence plastid, and likely other organellar, genomes. As expected, the Brachypodium chloroplast genome is very similar to those of other sequenced grasses. The phylogenetic analyses and the pattern of insertions and deletions in the chloroplast genome confirmed that Brachypodium is a close relative of the tribe Triticeae. Nevertheless, we show that some large indels can arise multiple times and may confound phylogenetic reconstruction.
Plastids are key organelles of green plants, carrying out functions such as photosynthesis, starch storage, nitrogen and sulfate metabolism, and synthesis of chlorophyll, carotenoids, fatty acids and nucleic acids. Plastids have multiple copies of a circular, double-stranded DNA chromosome, each with a set of approximately 110 genes highly conserved in sequence and organization.
In addition to their important biological roles, plastids have the potential to make a big impact on biotechnology. Plastid transformation, achieved via homologous recombination, has two main advantages over nuclear genome transformation: it can generate high levels of gene expression, and the recombinant DNA is more easily contained because chloroplasts are maternally inherited in most species of angiosperms.
The family Poaceae, with approximately 10,000 species, contains the world's most important crops. The tribe Triticeae, of subfamily Pooideae, includes species grown in temperate regions, some of which are of great economic importance, such as wheat, rye, triticale, and barley. Despite their contribution to human food supply, members of the Triticeae are not easily amenable to functional genomics aimed at crop improvement because of their large genome size and difficulty in transformation.
Brachypodium distachyon, a small grass in the Pooideae, has recently emerged as a new model species for functional genomics of temperate grasses. Brachypodium offers many advantages as a model grass, among them its reduced stature, short life cycle, and small genome.
In the last few years a considerable effort has been made to develop genetic and molecular tools for Brachypodium, including ESTs, Bacterial Artificial Chromosome (BAC) libraries, cytological characterization of accessions [7–9], and techniques to perform rapid and efficient transformation [10, 11]. Finally, sequencing of the Brachypodium distachyon genotype Bd21 has been initiated by the DOE Joint Genome Institute and will soon be available to the public.
Here we report the sequencing of the chloroplast genome of the Bd21 genotype of Brachypodium, and perform a sequence analysis and phylogeny reconstruction with the completely sequenced chloroplast genomes from seven grass species. We compare the evolutionary dynamics of Brachypodium chloroplast genes with those of wheat, rice and maize, and discuss the significance of some indels in the framework of grass evolution.
Sequencing of the Brachypodium chloroplast genome
As expected, the chloroplast sequence assembled using the BAC end sequences (BES) contained many gaps due to the distance between restriction sites (Fig. 1). To complete the Brachypodium chloroplast genome, a shotgun sequencing library of BAC DH037I03 was constructed. The complete genome sequence was assembled using 1,725 BES, 410 sequences from the shotgun library, and 264 gap-filling sequences generated by primer walking. The sequence coverage of the entire chloroplast genome is 8.9×.
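As a rough illustration of what a coverage figure such as 8.9× means, the sketch below computes mean depth as total sequenced bases divided by genome length. The function name `mean_coverage` and the toy read lengths are assumptions made for illustration; they are not values or tools from this study.

```python
def mean_coverage(read_lengths, genome_length):
    """Mean sequencing depth: total sequenced bases / genome length."""
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    return sum(read_lengths) / genome_length


# Toy numbers only (hypothetical, not taken from the paper):
# three reads covering a 1,000-base genome.
toy_reads = [600, 550, 700]
print(round(mean_coverage(toy_reads, 1000), 2))
```

In practice the read lengths would come from the assembled BES, shotgun, and primer-walking reads after quality trimming.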
Genome organization of Brachypodium chloroplast
Grass chloroplast phylogeny based on complete chloroplast genomes
In a landmark article that included data from multiple sources, the Grass Phylogeny Working Group examined relationships among grasses using a large and diverse assemblage of species. That study highlighted the existence of two major lineages, the BEP clade and the PACCAD clade, that together encompass the majority of grasses. The BEP clade includes the subfamilies Bambusoideae, Ehrhartoideae, and Pooideae. Rice belongs to subfamily Ehrhartoideae while wheat, barley, bentgrass, and Brachypodium are in the Pooideae. The PACCAD clade includes several subfamilies, among them the Panicoideae, a large group of mainly tropical and subtropical species, some of which are important crops worldwide, like maize, sugarcane, and sorghum.
Evolution of Brachypodium chloroplast genes
For a given protein-coding gene, the ratio of substitutions that change the amino acid sequence (nonsynonymous) to those that do not (synonymous) is a commonly used estimator of the evolutionary dynamics operating on that gene. To find out whether Brachypodium plastid genes show the same evolutionary dynamics as those of other grasses, we calculated the ratio of nonsynonymous to synonymous substitution rates for Brachypodium chloroplast genes using tobacco as an outgroup.
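The distinction between synonymous and nonsynonymous changes can be sketched in a few lines. The example below is a simplified illustration, not the method used in this study: it only counts codons that differ at a single position between two aligned, in-frame coding sequences, and classifies each difference with the standard genetic code. A full rate estimate would also normalize by the number of synonymous and nonsynonymous sites and handle codons with multiple changes. The function name `count_codon_differences` is hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def count_codon_differences(seq_a, seq_b):
    """Return (synonymous, nonsynonymous) counts of single-base codon differences.

    Both sequences must be aligned, in frame, and of equal length.
    Codons with gaps, ambiguous bases, or more than one change are skipped.
    """
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be the same length and a multiple of 3")
    syn = nonsyn = 0
    for i in range(0, len(seq_a), 3):
        codon_a = seq_a[i:i + 3].upper()
        codon_b = seq_b[i:i + 3].upper()
        if codon_a == codon_b:
            continue
        if codon_a not in CODON_TABLE or codon_b not in CODON_TABLE:
            continue
        if sum(x != y for x, y in zip(codon_a, codon_b)) != 1:
            continue  # multi-change codons need pathway averaging; skipped here
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            syn += 1
        else:
            nonsyn += 1
    return syn, nonsyn


# AAA->AAG (Lys->Lys) and CTT->CTC (Leu->Leu) are synonymous;
# GAT->GAA (Asp->Glu) is nonsynonymous.
print(count_codon_differences("ATGAAACTTGAT", "ATGAAGCTCGAA"))
```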
Substitution rates in grasses. Chloroplast genes are divided into seven groups according to the function of their product. For groups of more than one gene the top row gives the mean substitution rate, and the second and third rows show the genes, within that group, with the maximum and minimum rates respectively. ENV: Envelope membrane. MAT: maturaseK. NADH: NADH genes. PS: Photosynthetic genes. RP: ribosomal genes. RNPol: RNA polymerase genes. B: Brachypodium. W: wheat. R: rice. M: maize.
NO. OF GENES
petG, psbI 0
petG, psbI 0
petG, psbM, psbT 0
petG, psbF, psbI, psbT 0
petD 1.24 × 10⁻²
petB 9.52 × 10⁻³
petD 1.7 × 10⁻²
petD 6.83 × 10⁻³
Summarized results of Tajima's test of relative evolution of Brachypodium chloroplast genes compared with those of wheat, rice, and maize. The P value of genes that evolve at significantly different rates in Brachypodium is shown for each gene and species comparison. When P < 0.05, indicating that rates are significantly different, the species with the highest rate of evolution is shown in parentheses. B: Brachypodium, W: wheat, R: rice, M: maize.
Species pair comparison
B vs W
B vs R
B vs M
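For readers unfamiliar with the relative rate test, the following is a minimal sketch of the single-degree-of-freedom form of Tajima's test. It assumes three aligned nucleotide sequences of equal length: two ingroup sequences and an outgroup. Sites where only the first sequence differs are counted as m1, sites where only the second differs as m2, and (m1 - m2)^2 / (m1 + m2) is compared with a chi-square distribution with one degree of freedom. The function name and the toy sequences are hypothetical, and the sketch is not the software used in this study.

```python
import math


def tajima_relative_rate(seq1, seq2, outgroup):
    """Return (m1, m2, chi2, p) for Tajima's relative rate test (1 d.f.).

    m1: sites where seq1 differs while seq2 matches the outgroup.
    m2: sites where seq2 differs while seq1 matches the outgroup.
    Sites containing gaps or ambiguous bases are ignored.
    """
    if not (len(seq1) == len(seq2) == len(outgroup)):
        raise ValueError("sequences must be aligned to the same length")
    m1 = m2 = 0
    for a, b, o in zip(seq1.upper(), seq2.upper(), outgroup.upper()):
        if a not in "ACGT" or b not in "ACGT" or o not in "ACGT":
            continue
        if a != b and b == o:
            m1 += 1
        elif a != b and a == o:
            m2 += 1
    if m1 + m2 == 0:
        return m1, m2, 0.0, 1.0
    chi2 = (m1 - m2) ** 2 / (m1 + m2)
    # Survival function of a chi-square with 1 d.f.: erfc(sqrt(x / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    return m1, m2, chi2, p


# Toy alignment: the first sequence carries 8 unique changes, the second none.
m1, m2, chi2, p = tajima_relative_rate("A" * 8 + "CC", "C" * 10, "C" * 10)
print(m1, m2, chi2, p < 0.05)
```

A P value below 0.05 would indicate that the two ingroup lineages have accumulated changes at different rates, with the lineage holding the larger count evolving faster.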
Sequence comparison among grass chloroplast genomes
The structure and gene number of the chloroplast genome are very similar among land plants, although the Poaceae have three large inversions compared to the canonical plastid genome usually represented by the tobacco chloroplast genome. This conservation of overall structure in the chloroplast genomes of grasses allowed us to align the chloroplast genome sequences of eight grass species at the genome-wide level.
Comparison of the sequences of eight chloroplast genomes (only rice, Brachypodium, wheat, and maize are represented in Fig. 2) reveals several regions of high sequence length polymorphism, as well as shared deletions and insertions. The IRs show lower sequence divergence among grasses than the single-copy regions (Fig. 1), a result previously reported by other authors. The region between rbcL and psaI (at position ~54 kb, Fig. 2) is one of the most polymorphic chloroplast loci in grasses. In rice, this region is 1532 bp long and contains ORF133 and the accD gene, but it is much shorter in other grasses. In Brachypodium, both ORF133 and accD are missing, and the entire rbcL-psaI spacer region, containing only the rbcL 3'UTR and psaI promoter sequences, is reduced to 296 bp.
As expected from its phylogenetic placement, Brachypodium shares several indels with barley, wheat, and bentgrass, all of which are in subfamily Pooideae, including a 410 bp deletion in ORF70 (~14.5 kb, Fig. 2) and the duplication of a 5' portion of ndhH IRb (~102 kb, Fig. 2) that is also shared with rice [16, 21]. The size of this duplication is variable, ranging from 238 bp in rice to 311 bp in Brachypodium. Insertions in rpoC2 (~25 kb, Fig. 2) have been described and used previously in phylogenetic analyses [, and references therein] and will not be discussed here.
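Indels of the kind described above appear as runs of gap characters in one row of a multiple alignment. The sketch below, which assumes gaps are written as `-` and uses a hypothetical function name, lists the start position and length of each gap run in an aligned sequence. It is offered only as an illustration of how shared indels might be tabulated, not as the procedure used here.

```python
import re


def gap_runs(aligned_seq, min_length=1):
    """Return (start, length) for each run of '-' in an aligned sequence.

    Positions are 0-based alignment columns. Runs shorter than
    min_length are ignored.
    """
    runs = []
    for match in re.finditer(r"-+", aligned_seq):
        length = match.end() - match.start()
        if length >= min_length:
            runs.append((match.start(), length))
    return runs


# Toy aligned row with a 2-column and a 3-column gap.
print(gap_runs("AC--GT---A"))
print(gap_runs("AC--GT---A", min_length=3))
```

Two species that share a gap run with the same start and length relative to the other rows carry a candidate shared deletion, although, as the rice and wheat case below shows, identical indels need not share a single origin.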
Rice and wheat have identical and independently derived deletions
To confirm that the 2,131-bp deletion in rice and wheat was not an artifact of the alignment or of missing sequence, we used the Brachypodium sequence that is missing in wheat and rice as a BLAST query against grass sequence databases. We recovered sequences from many grasses except wheat and rice, confirming the presence of the deletion in their genomes. In addition, we searched the GenBank angiosperm databases with the maize sequence corresponding to the deleted wheat and rice region and found that the region is present in species representing diverse lineages of flowering plants, including the monocot Dioscorea, the early-diverging angiosperms Amborella and Nymphaea, and several core eudicots (data not shown). Therefore, we concluded that the 2,131-bp deletions in the wheat and rice chloroplast genomes are derived characters that arose independently in those species.
The 2,131-bp deletions in rice and wheat are identical in both IRs, and the sequences bordering them align unambiguously with those of other grasses (Fig. 4). In addition, the lack of short direct repeats in the bordering sequences indicates that the deletions did not arise by recombination between short repeats. Thus, although deletions of varying lengths in the ndhB-trnI region seem to be common in the BEP clade, the mechanism underlying these specific deletions remains unclear. In tobacco, nucleotide mutations in plastid coding sequences are quickly eliminated by gene conversion, a process facilitated by the polyploid nature of the plastid genome. Whatever mechanism generates deletions in the trnI-ndhB region in species of the BEP clade, their multiple occurrences suggest that they may provide a selective advantage that allows them to overcome gene conversion and become fixed in the population.
We thank Naxin Huo for her help with BAC end sequencing. This work was supported in part by the United States Department of Agriculture, Agricultural Research Service CRIS projects 532502100-010 and 532502100-011.
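One simple way to look for the short direct repeats mentioned above is to compare the ends of the deleted segment with the sequences that border it. The sketch below assumes that a repeat-mediated deletion would leave one repeat copy in a flank and remove the other copy along with the intervening sequence, so that either the start of the deleted segment matches the start of the downstream flank or the end of the deleted segment matches the end of the upstream flank. The function name and the toy sequences are hypothetical, and this is not the analysis performed in the study.

```python
def longest_flanking_repeat(upstream, deleted, downstream):
    """Return the longest direct repeat shared by the deleted segment and a flank.

    Checks two placements: a shared prefix of `deleted` and `downstream`,
    and a shared suffix of `upstream` and `deleted`. Returns an empty
    string when neither placement shares any bases.
    """
    prefix = 0
    for a, b in zip(deleted, downstream):
        if a != b:
            break
        prefix += 1

    suffix = 0
    for a, b in zip(reversed(upstream), reversed(deleted)):
        if a != b:
            break
        suffix += 1

    if prefix >= suffix:
        return deleted[:prefix]
    return deleted[len(deleted) - suffix:]


# Toy example: the upstream flank and the deleted segment both end in TACG.
print(longest_flanking_repeat("GGGTACG", "TTTTTACG", "CCCC"))
```

A match of only one or two bases is expected by chance, so any length threshold for calling a repeat would need to be chosen with that in mind.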
- Staehelin LA, Newcomb EH: Membrane structures and membranous organelles. Biochemistry and Molecular Biology of Plants. Edited by: Buchanan BB, Gruissem W, Jones RL. 2000, Rockville, MD: American Society of Plant Biologists, 37-45.
- Palmer JD: Plastid chromosomes: structure and evolution. The molecular biology of plastids Cell culture and somatic cell genetics of plants. Edited by: Hermann RG. 1991, Vienna: Springer, 7A: 5-53.
- Bock R: Plastid biotechnology: prospects for herbicide and insect resistance, metabolic engineering and molecular farming. Current Opinion in Biotechnology. 2007, 18: 100-106. 10.1016/j.copbio.2006.12.001.
- Garvin DF, Gu YQ, Hasterok R, Hazen SP, Jenkins G, Mockler TC, Mur LAJ, Vogel J: Development of genetic and genomic research resources for Brachypodium distachyon, a new model system for grass crop research. The Plant Genome [A Supplement to Crop Science]. 2008, 1: S-69-84.
- Vogel J, Gu YQ, Twigg P, Lazo G, Laudencia-Chingcuanco D, Hayden DM, Donze TJ, Vivian LA, Stamova B, Coleman-Derr D: EST sequencing and phylogenetic analysis of the model grass Brachypodium distachyon. Theoretical and Applied Genetics. 2006, 113 (2): 186-195. 10.1007/s00122-006-0285-3.
- Huo N, Gu YQ, Lazo G, Vogel J, Coleman-Derr D, Luo M-C, Thilmony R, Garvin DF, Anderson OD: Construction and characterization of two BAC libraries from Brachypodium distachyon, a new model for grass genomics. Genome. 2006, 49: 1099-1108. 10.1139/G06-087.
- Hasterok R, Draper J, Jenkins G: Laying the cytotaxonomic foundations of a new model grass, Brachypodium distachyon (L.) Beauv. Chromosome Research. 2004, 12: 397-403. 10.1023/B:CHRO.0000034130.35983.99.
- Jenkins G, Hasterok R: BAC 'landing' on chromosomes of Brachypodium distachyon for comparative genome alignment. Nature Protocols. 2007, 2: 88-98. 10.1038/nprot.2006.490.
- Hasterok R, Marasek A, Donnison IS, Armstead I, Thomas A, King IP, Wolny E, Idziak D, Draper J, Jenkins G: Alignment of the genomes of Brachypodium distachyon and temperate cereals and grasses using bacterial artificial chromosome landing with fluorescence in situ hybridization. Genetics. 2006, 173: 349-362. 10.1534/genetics.105.049726.
- Vogel J, Garvin DF, Leong O, Hayden DM: Agrobacterium-mediated transformation and inbred line development in the model grass Brachypodium distachyon. Plant Cell, Tissue and Organ Culture. 2005, 84: 199-211.
- Christiansen P, Andersen CH, Didion T, Folling M, Nielsen KK: A rapid and efficient transformation protocol for the grass Brachypodium distachyon. Plant Cell Reports. 2005, 23: 751-758. 10.1007/s00299-004-0889-5.
- Huo N, Lazo G, Vogel J, You FM, Ma Y, Hayden DM, Coleman-Derr D, Hill TA, Dvorak J, Anderson OD, et al: The nuclear genome of Brachypodium distachyon: analysis of BAC end sequences. Functional and Integrative Genomics. 2007, electronic version.
- GPWG: Phylogeny and subfamilial classification of the grasses (Poaceae). Annals of the Missouri Botanical Garden. 2001, 88 (3): 373-457. 10.2307/3298585.
- Huelsenbeck JP, Ronquist F: MRBAYES: Bayesian inference of phylogeny. Bioinformatics. 2001, 17: 754-755. 10.1093/bioinformatics/17.8.754.
- Nei M, Kumar S: Molecular Evolution and Phylogenetics. 2000, Oxford: Oxford University Press.
- Saski C, Lee S-B, Fjellheim S, Guda C, Jansen R, Luo H, Tomkins J, Rognli OA, Daniell H, Clarke JL: Complete chloroplast genome sequences of Hordeum vulgare, Sorghum bicolor, and Agrostis stolonifera, and comparative analyses with other grass genomes. Theoretical and Applied Genetics. 2007, 112 (8): 1503-1518.
- Matsuoka Y, Yamazaki Y, Ogihara Y, Tsunewaki K: Whole chloroplast genome comparison of rice, maize, and wheat: implications for chloroplast gene diversification and phylogeny of cereals. Molecular Biology and Evolution. 2002, 19 (12): 2084-2091.
- Tajima F: Simple methods for testing the evolutionary clock hypothesis. Genetics. 1993, 135: 599-607.
- Doyle JJ, Davis JI, Soreng RJ, Garvin DF, Anderson MJ: Chloroplast DNA inversions and the origin of the grass family (Poaceae). Proceedings of the National Academy of Sciences. 1992, 89: 7722-7726. 10.1073/pnas.89.16.7722.
- Yamane K, Yano K, Kawahara T: Pattern and rate of indel evolution inferred from whole chloroplast intergenic regions in sugarcane, maize, and rice. DNA Research. 2006, 13: 197-204. 10.1093/dnares/dsl012.
- Ogihara Y, Isono K, Kojima T, Endo A, Hanaoka M, Shiina T, Terachi T, Utsugi S, Murata M, Mori N, et al: Structural features of a wheat plastome as revealed by complete sequencing of chloroplast DNA. Molecular Genetics and Genomics. 2002, 266: 740-746. 10.1007/s00438-001-0606-9.
- Khakhlova O, Bock R: Elimination of deleterious mutations in plastid genomes by gene conversion. The Plant Journal. 2006, 46: 85-94. 10.1111/j.1365-313X.2006.02673.x.
- Mayor C, Brudno M, Schwartz JR, Poliakov A, Rubin EM, Frazer KA, Pachter LS, Dubchak I: VISTA: Visualizing global DNA sequence alignments of arbitrary length. Bioinformatics. 2000, 16: 1046. 10.1093/bioinformatics/16.11.1046.
- Swofford DL: PAUP*. Phylogenetic analysis using parsimony (*and other methods), version 4. 2003, Sunderland, Massachusetts, USA: Sinauer.
- Nei M, Gojobori T: Simple methods for estimating the numbers of synonymous and nonsynonymous nucleotide substitutions. Molecular Biology and Evolution. 1986, 3 (5): 418-426.