Orthology between genomes of Brachypodium, wheat and rice

Background: In the past, the rice genome served as a good model for comparative genomics of grass species. More recently, however, the Brachypodium distachyon genome has emerged as a better model system for the genomes of temperate cereals, including wheat. In the present study, Brachypodium EST contigs were used to resolve orthologous relationships among the genomes of Brachypodium, wheat and rice.
Findings: Comparative sequence analysis of 3,818 Brachypodium EST (bEST) contigs and 3,792 physically mapped wheat EST (wEST) contigs revealed that 449 bEST contigs were orthologous to 1,154 wEST loci that were bin-mapped on all 21 wheat chromosomes. Similarly, 743 bEST contigs were orthologous to specific rice genome sequences distributed on all 12 rice chromosomes. A total of 183 bEST contigs were orthologous to both wheat and rice genome sequences, and these harbored 17 SSRs conserved across the three species. Primers developed for 12 of these 17 conserved SSRs were used in a wet-lab experiment, which indicated a relatively high level of conservation among the genomes of Brachypodium, wheat and rice.
Conclusion: The present study confirmed that Brachypodium is a better model than rice for analysis of the genomes of temperate cereals like wheat and barley. The whole genome sequence of Brachypodium, which should become available in the near future, will greatly facilitate further studies on comparative genomics of cereals.


Background
Cereals constitute the most important group of cultivated plants, and are known to have diverged from a common paleopolyploid ancestor ~45-47 million years ago (Mya) [1]. Despite this, a remarkable overall structural and functional similarity exists among different cereal genomes [2,3], although the size of these genomes differs greatly, ranging from 430 Mb in rice (Oryza sativa) to 16,000 Mb in hexaploid wheat (Triticum aestivum). Owing to its small genome size and the availability of its whole genome sequence, rice has been used as a model system for a variety of experimental studies, including map-based cloning [4]. However, recent studies have further resolved dynamic changes in rice genome sequences, calling into question the utility of rice as a model crop [5] and prompting a search for a more efficient model system.
Brachypodium distachyon, a small temperate grass (subfamily Pooideae), has recently emerged as a better model system for the study of temperate grasses. This is due particularly to several of its desirable biological features and its phylogenetic position [6,7]. It is postulated that, relative to the rice genome, the Brachypodium genome will exhibit a much higher level of colinearity and synteny with the genomes of temperate cereal crops. In the present study, the available Brachypodium EST contigs (bEST contigs) and supercontigs were used to explore further the utility of the Brachypodium genome as a model for comparative genomics studies in cereals in general, and for wheat genomics in particular. For this purpose, the relationship of the Brachypodium genome with the wheat and rice genomes was examined, and improved criteria of sequence similarity search were used for more accurate estimation of similarity [8].

Results
In the present study, EST sequences from Brachypodium were used to determine the degree of similarity of the Brachypodium genome with EST/genomic sequences of wheat and rice. The orthologous wheat sequences thus identified were also used to study the relationship of wheat genome sequences with Brachypodium supercontigs. We have also taken note of the comparisons of chloroplast genomes among eight grass species, which were included in the recent report on the Brachypodium chloroplast genome sequence [9].

Orthology between Brachypodium and wheat
A total of 3,818 B. distachyon EST contigs were compared, using BLASTN, against the available wheat EST contigs (containing bin-mapped wESTs) to identify matching wESTs. The analysis revealed that 449 bEST contigs had orthologs in the wheat genome.

Analysis of mapped wEST contigs that matched bEST contigs
The above 449 bEST contigs were homologous with a corresponding number of wEST contigs carrying 1,154 bin-mapped loci or regions, giving an average of 2.57 loci per wEST contig (Figures 1, 2). The distribution of ortholoci on the three wheat sub-genomes (A, B and D) and among the seven homoeologous groups of chromosomes (Table 1) was non-random (P << 0.05) when the known chromosome lengths and their DNA contents were used as the basis for expected values [10]. The distribution of ortholoci on long and short arms of the chromosomes (excluding 37 loci, which could not be assigned to individual arms) was also non-random (P < 0.05). This non-random distribution of ortholoci is, however, based on limited data.
Of the 1,154 orthologous loci with known positions on wheat chromosomes, 1,094 (94.8%) had previously been assigned to 159 chromosome bins defined by deletion break points. The remaining 60 (5.2%) loci could be assigned only to individual chromosomes or their arms. A maximum of 386 loci (35.3%) were mapped in the proximal regions (60% of the arm length from the centromere; C-0.60), followed by 331 loci (30.3%) mapped to the distal regions (40% terminal arm length; 0.60-1.00). The remaining 377 loci (34.4%) were mapped to interstitial bins spanning both proximal and distal regions.
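The proximal/distal/interstitial grouping described above can be illustrated with a minimal sketch. It assumes each deletion bin is given as a pair of arm fraction lengths (0 at the centromere, 1 at the telomere) and uses the 0.60 boundary stated in the text; the function name and the example bins are hypothetical and are not taken from the study's data.

```python
def classify_bin(start, end, boundary=0.60):
    """Classify a deletion bin by its arm fraction lengths.

    start, end: fraction of arm length from the centromere (0.0) to the
    telomere (1.0). boundary: the proximal/distal cut-off (0.60 in the text).
    """
    if not 0.0 <= start < end <= 1.0:
        raise ValueError("expected 0.0 <= start < end <= 1.0")
    if end <= boundary:
        return "proximal"      # bin lies entirely within C-0.60
    if start >= boundary:
        return "distal"        # bin lies entirely within 0.60-1.00
    return "interstitial"      # bin spans the 0.60 boundary


# Hypothetical bins, for illustration only
for fractions in [(0.0, 0.45), (0.45, 0.75), (0.75, 1.0)]:
    print(fractions, classify_bin(*fractions))
```

A bin is treated as interstitial only when it straddles the boundary, which matches the description of bins "having proximal and distal regions".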
The above 449 mapped wheat orthologs were also used for a homology search among Brachypodium supercontigs. The wheat EST contigs located on homoeologous group 4 chromosomes had maximum homology (54.5% of mapped contigs) with the Brachypodium super_1 contig. In contrast, the Brachypodium supercontigs super_0 to super_2 had homology with wEST contigs dispersed on all seven homoeologous groups, although no redundancy for wheat homologues was observed within these supercontigs (Table 2).

Orthology between bEST contigs and rice genome sequences
BLASTN comparison of the 3,818 bEST contigs against the rice genome sequences identified 743 matching bEST contigs (see Methods), which had homologues distributed on all 12 chromosomes of rice. On the basis of the relative length (Mb) of chromosomes and their arms [11], the ortholoci on the 12 rice chromosomes/arms were non-randomly distributed (P << 0.05) (Table 3; Figure 3).

Conserved orthologous sequences among Brachypodium, wheat and rice
In the present study, 183 orthologous sequences were conserved among all the three species (Brachypodium, wheat and rice). As many as 126 of the 183 orthologous sequences also confirmed known homology between wheat and rice chromosomes. Functional annotation of these 183 orthologous sequences suggested that a majority (137; 74.8%) of these bEST contigs matched proteins of known function (see Additional file 1; Figure 4).

Figure 1. Distribution of orthologous bEST contigs (BdC) on wheat chromosomes belonging to homoeologous groups 1 to 4 (12 chromosomes). bEST contigs are shown on the right and arm fraction lengths are given on the left. A vertical line on the right covering an arm means that the corresponding bEST contig (shown in bold) could not be assigned to a specific bin and was assigned to the arm; a vertical line covering more than one bin means that the corresponding wEST was earlier mapped to a 'combined bin', rather than to an individual bin. The bEST contigs that could not be assigned to bins and were assigned to individual chromosomes (with no information about arm) are listed at the bottom of each such chromosome.
Figure 2. Distribution of orthologous bEST contigs (BdC) on wheat chromosomes belonging to homoeologous groups 5 to 7 (9 chromosomes). bEST contigs are shown on the right and arm fraction lengths are given on the left. A vertical line on the right covering an arm means that the corresponding bEST contig (shown in bold) could not be assigned to a specific bin and was assigned to the arm; a vertical line covering more than one bin means that the corresponding wEST was earlier mapped to a 'combined bin', rather than to an individual bin. The bEST contigs that could not be assigned to bins and were assigned to individual chromosomes (with no information about arm) are listed at the bottom of each such chromosome.
Figure 3. Distribution of orthologous bEST contigs (BdC; shown on the right side) on 12 rice chromosomes.

Conservation of SSRs among the three genomes
The 183 bEST contig sequences shared by the three species (Brachypodium, wheat and rice) were also used for mining SSRs. A total of 100 (54.6%) bEST contigs contained 137 SSRs. Of these SSRs, 45 showed conservation in wheat and 23 showed conservation in rice; 17 SSRs were conserved across all three species.

Transferability of conserved orthologous SSRs
In order to validate experimentally the conservation of Brachypodium SSRs among the genomes of wheat and rice, primer pairs for SSRs belonging to 12 orthologs were synthesized and used for PCR amplification of the SSRs (Table 4). All the 12 primer pairs gave amplification products in wheat and rice (Figure 5).

Discussion
Comparative genomics among grasses initially focused on the analysis of colinearity (gene order) and synteny (gene content) among DNA markers mapped on individual chromosomes at a low resolution (10 cM). This led to the identification of 30 rice-independent linkage blocks involved in the constitution of all cereal genomes and allowed identification of a number of rearrangements within individual genomes [12]. However, with the availability of the whole genome sequence of rice, and of substantial partial sequences from other cereal genomes, emphasis shifted to a comparison of nucleic acid sequences. In particular, sequences of ~7,000 bin-mapped wESTs were aligned with rice genome sequences [13], allowing improved resolution and the discovery of many more rearrangements.
Although rice worked well as a model for all grasses including wheat, and generated useful information, Brachypodium, belonging to subfamily Pooideae (to which wheat also belongs), is proposed as a better model than rice (subfamily Ehrhartoideae). Recent studies have suggested that, relative to rice, Brachypodium is more closely related to wheat and barley, and that the colinearity between Brachypodium and wheat is better than that between wheat and rice [14,15]. Chloroplast sequence-based phylogenetic analysis in eight grass species also suggested that Brachypodium is closer to the tribe Triticeae [9]. The estimated time of divergence between Brachypodium and Triticeae is also shorter (35 Mya) than that between wheat and rice (50 Mya) [16], thus supporting the view that Brachypodium is more closely related to the members of Triticeae.
Figure 5. A representative pattern of Brachypodium SSR marker PCR products showing conservation and cross-transferability in the genomes of wheat and rice (Table 4).

During the present study, orthologous relationships among bEST contigs, wEST contigs and rice genome sequences were studied using improved criteria of sequence comparison. The higher number of bEST contigs showing orthology with the rice genome was attributed mainly to the fact that only a small fraction of the wheat genome (0.02%) and an almost complete rice genome (95%) were used for sequence comparison with the available Brachypodium EST contigs. If the proportion of each genome used for comparison is taken into account, it may be concluded that Brachypodium shows a higher level of orthology with wheat than with rice.
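The normalization argument in the preceding paragraph can be made concrete with a small calculation. This is only a crude illustration: it divides the reported numbers of orthologous bEST contigs (449 for wheat, 743 for rice) by the percentage of each genome stated to have been used (0.02% and 95%), and it assumes, without support from the source, that matches scale linearly with the fraction of genome sampled. The function name is hypothetical.

```python
def hits_per_percent(n_hits, percent_of_genome_used):
    """Orthologous bEST contigs per 1% of target genome sampled (crude)."""
    if percent_of_genome_used <= 0:
        raise ValueError("percent_of_genome_used must be positive")
    return n_hits / percent_of_genome_used


# Counts and percentages as reported in the text
wheat_rate = hits_per_percent(449, 0.02)   # wheat: 0.02% of genome used
rice_rate = hits_per_percent(743, 95.0)    # rice: 95% of genome used

print(f"wheat: {wheat_rate:.0f} orthologous bEST contigs per 1% of genome")
print(f"rice:  {rice_rate:.2f} orthologous bEST contigs per 1% of genome")
```

Because EST contigs sample gene space rather than the genome uniformly, the ratio should be read as indicating the direction of the argument, not as an estimate of relative orthology.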
The mapped loci in different deletion bins of a particular wheat chromosome matched the same or different supercontigs of Brachypodium. For instance, wheat group 4 chromosomes are more syntenic to the Brachypodium super_1 contig (54.5%) than to other supercontigs, although the super_1 contig also showed homology with other homoeologous groups. Information on the mapping of these Brachypodium supercontigs to individual Brachypodium chromosomes will be useful for developing markers specific to targeted regions of wheat chromosomes.
It was also observed that although the D sub-genome of wheat is smaller in size, the orthologous loci mapped on this sub-genome are no fewer than those mapped on the B sub-genome, suggesting a closer relationship between Brachypodium and Aegilops tauschii, the donor of the D sub-genome of hexaploid wheat.
The relative abundance of orthologous loci in proximal regions of chromosome arms in wheat is in agreement with earlier studies in wheat and rice [17]. It seems that a higher degree of sequence conservation coincides with the low-recombination proximal regions, which is understandable, since high recombination in terminal regions will cause reshuffling of genes during evolution [18].

Conclusion
The results of the present study indicate that the availability of whole genome sequence of Brachypodium will be of enormous relevance for comparative genomics, gene annotation and evolutionary, structural and functional genomic studies of large genomes of the Triticeae.

Brachypodium, wheat, rice sequence databases
A total of 3,818 Brachypodium EST (bEST) contigs, and a set of 1,015 supercontigs representing 4× coverage of the Brachypodium genome, were available in the public domain [19,20]. A total of 3,792 wheat EST (wEST) contigs containing bin-mapped wESTs were available at GrainGenes 2.0 [21], and rice genomic sequences were available at Gramene [22].

Sequence comparisons
In order to find orthology among Brachypodium, wheat and rice genomes, bEST contigs were blasted against wEST contigs and rice genomic sequences. The pairwise sequence alignment in BLASTN search was improved by

Mapping of wheat and rice orthologs
The physical positions of wEST orthologs identified through sequence comparisons were localized to specific bins of wheat chromosomes based on the information about mapped wEST sequences [23]. The rice genomic sequences that were orthologous to bEST contigs were also known and were physically localized to specific sites on the 12 rice chromosomes with the help of the KaryoView program [24]. The χ² test for goodness-of-fit was used to test for random distribution of ortholoci in the wheat genome at the level of the three sub-genomes, the seven homoeologous groups, the 21 chromosomes and the 42 chromosome arms. The same was done for the 12 chromosomes of rice.
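A minimal sketch of the χ² goodness-of-fit calculation described above is given below. It assumes that the expected number of ortholoci for each unit (sub-genome, chromosome or arm) is proportional to a weight such as its physical length; the observed counts and weights in the example are hypothetical and are not the values of Table 1 or Table 3.

```python
def chi_square_gof(observed, weights):
    """Chi-square goodness-of-fit against expectations proportional to weights.

    observed: counts of ortholoci per unit (e.g. per sub-genome).
    weights:  relative sizes of the units (e.g. chromosome length in Mb).
    Returns (statistic, degrees_of_freedom).
    """
    if len(observed) != len(weights) or len(observed) < 2:
        raise ValueError("need matching lists with at least two categories")
    total = sum(observed)
    weight_sum = sum(weights)
    expected = [total * w / weight_sum for w in weights]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return statistic, len(observed) - 1


# Hypothetical counts for three equally sized units
stat, df = chi_square_gof([50, 30, 20], [1.0, 1.0, 1.0])
print(f"chi-square = {stat:.2f}, df = {df}")

# With df = 2, the 5% critical value is 5.991; a larger statistic
# rejects the hypothesis of a random (proportional) distribution.
print("non-random at P < 0.05:", stat > 5.991)
```

For exact P values, a statistics library can be used in place of the critical-value comparison; the sketch stays with the standard library so that it is self-contained.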

Assignment of putative function to orthologs
The orthologous sequences belonging to the three genomes (Brachypodium, wheat and rice) were subjected to BLASTX analysis against the non-redundant protein database [25] for assigning putative functions at a cut-off E value of 10⁻³⁰.
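Applying an E-value cut-off such as the one above can be sketched as a simple filter over search results. The sketch assumes the hits are available in the 12-column tab-separated layout produced by BLAST+ with `-outfmt 6` (E value in the 11th column); the source does not state which output format was used, and the sequence identifiers in the example are hypothetical.

```python
def filter_hits(lines, max_evalue=1e-30):
    """Keep tabular BLAST hits whose E value is at or below max_evalue.

    Each line is assumed to have 12 tab-separated columns:
    qseqid sseqid pident length mismatch gapopen
    qstart qend sstart send evalue bitscore
    Returns a list of (query_id, subject_id, evalue) tuples.
    """
    kept = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        evalue = float(fields[10])
        if evalue <= max_evalue:
            kept.append((fields[0], fields[1], evalue))
    return kept


# Hypothetical hits, for illustration only
example = [
    "BdC0001\tprot_A\t92.5\t400\t30\t0\t1\t400\t5\t404\t1e-50\t300",
    "BdC0002\tprot_B\t80.0\t100\t20\t0\t1\t100\t1\t100\t1e-10\t90",
]
for query, subject, evalue in filter_hits(example):
    print(query, subject, evalue)
```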

Identification of SSRs in orthologs
The orthologous sequences available in all the three genomes were mined for simple sequence repeats (SSRs) using the SSRIT program [26]. SSRs with a repeat motif of 2-6 nucleotides and a length of ≥ 12 bp were included in the analysis. Primers were designed for the 12 conserved SSRs using PRIMER3 [27].
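The SSR criteria stated above (motif of 2-6 nucleotides, total length ≥ 12 bp) can be illustrated with a short regular-expression sketch. This is a stand-in for illustration and is not the SSRIT program; it reports perfect repeats only, and it skips motifs that are themselves repeats of a shorter unit (for example AGAG), which is an assumption about how redundant hits should be handled. The test sequence is hypothetical.

```python
import re
from math import ceil


def is_primitive(motif):
    """True if the motif is not itself a repetition of a shorter unit."""
    n = len(motif)
    for k in range(1, n):
        if n % k == 0 and motif[:k] * (n // k) == motif:
            return False
    return True


def find_ssrs(seq, min_motif=2, max_motif=6, min_length=12):
    """Find perfect SSRs; returns sorted (start, motif, repeat_count) tuples.

    start is the 0-based position of the first repeat unit.
    """
    seq = seq.upper()
    hits = []
    for k in range(min_motif, max_motif + 1):
        min_repeats = ceil(min_length / k)  # repeats needed to reach min_length
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if is_primitive(motif):
                hits.append((match.start(), motif, len(match.group(0)) // k))
    return sorted(hits)


# Hypothetical sequence: (AG)6 followed by (ACG)4
sequence = "TT" + "AG" * 6 + "CC" + "ACG" * 4 + "T"
for start, motif, count in find_ssrs(sequence):
    print(f"({motif}){count} at position {start}")
```

Conservation across species could then be assessed by checking whether the same motif occurs at the aligned position in the orthologous wheat and rice sequences.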

Wet-lab analysis
Primers for 12 conserved Brachypodium SSRs were synthesized by Invitrogen, USA. PCR was performed separately using the genomic DNA of Brachypodium, wheat and rice in a final volume of 20 μl in an Applied Biosystems 'Veriti Thermal Cycler'. After electrophoresis, polyacrylamide gels were silver stained following Tegelstrom [28].