Comparative genomic sequence analysis of strawberry and other rosids reveals significant microsynteny
© Jung et al; licensee BioMed Central Ltd. 2010
Received: 3 February 2010
Accepted: 16 June 2010
Published: 16 June 2010
Fragaria belongs to the Rosaceae, an economically important family that includes several fruit-producing genera such as Malus and Prunus. Using genomic sequences from 50 Fragaria fosmids, we have examined the microsynteny between Fragaria and other plant models.
In more than half of the strawberry fosmids, we found syntenic regions that are conserved in Populus, Vitis, Medicago and/or Arabidopsis with Populus containing the greatest number of syntenic regions with Fragaria. The longest syntenic region was between LG VIII of the poplar genome and the strawberry fosmid 72E18, where seven out of twelve predicted genes were collinear. We also observed an unexpectedly high level of conserved synteny between Fragaria (rosid I) and Vitis (basal rosid). One of the strawberry fosmids, 34E24, contained a cluster of R gene analogs (RGAs) with NBS and LRR domains. We detected clusters of RGAs with high sequence similarity to those in 34E24 in all the genomes compared. In the phylogenetic tree we have generated, all the NBS-LRR genes grouped together with Arabidopsis CNL-A type NBS-LRR genes. The Fragaria RGA grouped together with those of Vitis and Populus in the phylogenetic tree.
Our analysis shows considerable microsynteny between Fragaria and other plant genomes such as Populus, Medicago, and Vitis, and to a lesser degree Arabidopsis. We also detected a cluster of NBS-LRR type genes that is conserved in all the genomes compared.
The NBS-LRR family is the largest class of resistance genes (R genes). In addition to the genetically cloned R gene loci, a large number of R gene analogs (RGAs) have been isolated from various plant species. In grasses, interspecific analyses have shown that, unlike other genes, the R genes are frequently found in non-syntenic positions, suggesting rapid reorganization of R genes. In Solanaceae, however, conserved syntenic R genes have been described.
Results and Discussion
Microsynteny between F. vesca and other plant model genomes
Microsyntenic regions between Fragaria fosmids and P. trichocarpa, V. vinifera, M. truncatula or A. thaliana
VI (6); XVIII (6)
VIII (3); X (3)
Chr1 (3); Chr13 (4); Chr18 (3)
XII (3); XIV (3)
Chr15 (3); Chr16_R (3)
VI (4); XVIII (3)
ChrUR (4); ChrUR (3)
Chr5 (3); Chr8 (4)
VII (3); scaffold_158 (3)
IV (5); XI (6); scaffold_64 (3)
Chr1 (4); Chr7 (4)
XII (4); XIV (3); XV (3)
IV (4); XI (4)
VIII (7); X (5)
Chr13 (4); Chr6 (3)
Chr3 (3); Chr3 (3)
II (4); V (3)
Number and the length of the syntenic regions between Fragaria fosmids and the model genomes
# of syntenic regions (# of fosmids with synteny)
# of gene pairs (# of syntenic regions)
Length* (Model genomes)
3 (22), 4 (9), 5 (2), 6 (3), 7 (1)
9.9 - 86.3 kb
6.7 - 35.3 kb
3 (16), 4 (4), 5 (3)
13.8 - 142.8 kb
6.6 - 32.4 kb
3 (8), 4 (4), 5 (1)
8.4 - 55.5 kb
9.8 - 29.7 kb
3 (7), 4 (2)
7.1 - 11.7 kb
11.8 - 27.2 kb
Detection of putative orthologs of NBS-LRR cluster
Protein domains in RGAs and R genes that are predicted by InterProScan
Predicted Protein Domain
NB-ARC, LRR, RPW8, DISEASERSIST
NB-ARC, LRR, RPW8
NB-ARC, LRR, DISEASERSIST
LRR, RPW8, DISEASERSIST
34E24_7 has two NB-ARC and three LRR domains. In addition to these typical domains of NBS-LRR genes, 34E24_7 has two RPW8 domains, one on the N-terminal side of each NB-ARC domain. The Arabidopsis RPW8 gene, a representative of the most recently characterized class of R genes, encodes a small, probable membrane protein with no other homology to known proteins, and it confers broad-spectrum mildew resistance [18, 19]. We detected genes that have a similar structure to 34E24_7, containing NB-ARC and LRR domains in addition to RPW8 domains, in all the genomes compared: two Arabidopsis genes, AT5G66900 and AT5G66910; one Medicago gene, CU137666_10; one grape gene, GSVIVP00003147001; and one Populus gene, proteinId_563015. This Medicago gene has recently been reported as one of the NB-ARC genes with atypical domain structure due to the fused RPW8 domain. We also detected two grape genes with both NB-ARC and RPW8 domains but without LRR domains. The majority of the R genes in the different species clusters had NB-ARC and LRR domains without RPW8, which is characteristic of the largest class of R genes.
The Arabidopsis and Populus RGA clusters also contained genes with LRR or fragmented NBS domains but without an intact NB-ARC domain: proteinId_76154 and AT5G66630 with fragmented NBS, proteinId_76154 with LRR and RPW8 domains, and AT5G66630 with an RPW8 domain. A previous study has shown that AT5G66630 contains a zinc-finger domain and clusters with other zinc-finger domain containing genes, but that it is fused with the NBS-like domain. The study also reports that the NB-ARC-like domain of AT5G66630 is most closely related to a nearby cluster of NBS genes, one of which (AT5G66890) lacks the NBS region, suggesting a translocation of this domain. In our analysis, the RGA cluster in Fragaria matched both AT5G66630 and the nearby cluster of R genes including AT5G66890 (Figure 3).
One interesting observation was the occurrence of LRR-only genes in the NBS-LRR gene clusters of several plant genomes. LRR domains are found in numerous proteins with various functions, are usually involved in protein-protein interaction, and are considered to be responsible for R specificity. Two classes of R genes, the tomato Cf-X genes and the rice Xa21, encode transmembrane proteins with extracellular LRRs. The frequent existence of NBS fragments without LRR domains prompted a suggestion that they may encode adaptor molecules that are important in signaling. Similarly, the existence of the LRR-only genes may suggest their functional importance in the disease-resistance mechanism involving NBS-LRR R genes.
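The domain combinations discussed above (NBS-LRR, NBS-LRR with RPW8, NBS only, LRR only) can be expressed as a simple classification over the set of domains predicted for each gene. The following is a minimal sketch, not the procedure used in the study: the function name, the label strings, and the two example entries marked hypothetical are assumptions made for illustration. Only the domain content of 34E24_7 is taken from the text.

```python
def classify_architecture(domains):
    """Return a coarse architecture label for a set of predicted domain names.

    Only NB-ARC, LRR and RPW8 are considered; other signatures
    (for example DISEASERSIST) are ignored by this sketch.
    """
    found = set(domains)
    has_nbs = "NB-ARC" in found
    has_lrr = "LRR" in found
    has_rpw8 = "RPW8" in found

    if has_nbs and has_lrr:
        core = "NBS-LRR"
    elif has_nbs:
        core = "NBS only"
    elif has_lrr:
        core = "LRR only"
    else:
        core = "no NBS or LRR"
    return core + (" with RPW8" if has_rpw8 else "")


# 34E24_7 follows the text; the other two entries are hypothetical examples.
predicted = {
    "34E24_7": {"NB-ARC", "LRR", "RPW8"},
    "hypothetical_gene_A": {"NB-ARC", "LRR", "DISEASERSIST"},
    "hypothetical_gene_B": {"LRR"},
}

for gene, domains in predicted.items():
    print(gene, "->", classify_architecture(domains))
```

A label-based grouping like this makes it easy to tally how many genes of each architecture occur in a cluster once the InterProScan results are available in tabular form.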
Phylogeny analysis of the NBS-LRR genes in the clusters
We report the results of our comparative genomic sequence analysis of Fragaria and other rosids. Considerable microsynteny was detected between Fragaria and other plant genomes such as Populus, Medicago, and Vitis, and to a lesser degree Arabidopsis. The unexpectedly high level of synteny between Fragaria and Vitis and the low level of synteny between Fragaria and Arabidopsis suggest that the stability of genomes, in addition to the evolutionary distance, is important in synteny conservation. We also detected a cluster of NBS-LRR type R genes in all rosids analyzed in this study. The clusters included R genes with unusual domain structures such as NBS only, LRR only, and NBS-LRR genes with RPW8. The phylogeny analyses showed that the NBS-LRR genes belong to the CNL-A type.
Data Acquisition and Detection of Conserved Syntenic Regions
The 50 Fragaria vesca fosmid sequences [13, 14] were downloaded from NCBI. Detailed analyses of the fosmids, including fosmid construction, sequencing, and identification of genetic elements, are summarized in two publications [13, 14]. We performed gene predictions using fgenesh with the Medicago (rosid I) trained gene set [Additional file 1], since the predicted gene sets [13, 14] were not available at the time of analysis. The protein data of Arabidopsis, Populus, Vitis, and Medicago were downloaded from the web sites of TAIR, JGI, Genoscope, and http://www.medicago.org, respectively.
The predicted protein sequences of the Fragaria fosmids were compared with the Medicago, Populus, Arabidopsis, and Vitis proteins by pairwise comparison using the BLASTP program. The top ten matches with an E value less than 1e-10 were used for further analysis. Syntenic groups with at least three gene pairs were selected using DAGchainer, as described before, when the distance between two adjacent matches was less than 200 kb.
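The filtering and grouping criteria described above can be illustrated with a short sketch. This is not DAGchainer, which scores collinear chains by dynamic programming; it is a simplified stand-in that only applies the stated thresholds (top ten matches, E value below 1e-10, at least three gene pairs, adjacent matches closer than 200 kb). The tuple layouts, function names, and example coordinates are assumptions made for illustration.

```python
from collections import defaultdict

E_VALUE_CUTOFF = 1e-10   # E-value threshold stated in the text
TOP_N = 10               # matches kept per query protein
MAX_GAP_BP = 200_000     # maximum distance between adjacent matches
MIN_PAIRS = 3            # minimum gene pairs per syntenic group


def filter_hits(hits):
    """Keep, per query, the TOP_N best hits with E value below the cutoff.

    `hits` is an iterable of (query, subject, evalue) tuples (assumed layout).
    """
    by_query = defaultdict(list)
    for query, subject, evalue in hits:
        if evalue < E_VALUE_CUTOFF:
            by_query[query].append((evalue, subject))
    kept = []
    for query, scored in by_query.items():
        for evalue, subject in sorted(scored)[:TOP_N]:
            kept.append((query, subject, evalue))
    return kept


def chain_pairs(pairs):
    """Group gene pairs into runs of neighbouring matches.

    `pairs` is a list of (fosmid_pos, chrom, chrom_pos) tuples (assumed layout).
    Pairs are sorted by chromosome and position; a run ends when the
    chromosome changes or the gap on either sequence reaches MAX_GAP_BP.
    Unlike DAGchainer, gene order (collinearity) is not checked.
    """
    groups, current = [], []
    for pair in sorted(pairs, key=lambda p: (p[1], p[2])):
        if current:
            prev = current[-1]
            close = (pair[1] == prev[1]
                     and abs(pair[2] - prev[2]) < MAX_GAP_BP
                     and abs(pair[0] - prev[0]) < MAX_GAP_BP)
            if not close:
                if len(current) >= MIN_PAIRS:
                    groups.append(current)
                current = []
        current.append(pair)
    if len(current) >= MIN_PAIRS:
        groups.append(current)
    return groups


# Hypothetical coordinates, for illustration only.
example_pairs = [
    (1_000, "VIII", 500_000),
    (5_000, "VIII", 520_000),
    (9_000, "VIII", 560_000),
    (12_000, "X", 100_000),
    (15_000, "VIII", 900_000),
]
for group in chain_pairs(example_pairs):
    print(len(group), "gene pairs on", group[0][1])
```

In practice the filtered BLASTP matches would be passed to DAGchainer itself; the sketch only makes the thresholds concrete.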
Detection of Domains and Phylogeny Analysis of NBS genes
The genes that matched the cluster of genes in the Fragaria fosmid 34E24 were analyzed for known domains using InterProScan at the InterPro Database. The NBS-LRR protein sequences were aligned using CLUSTAL W with default parameters for the slow/accurate option, available at the Kyoto University Bioinformatics Center, and phylogenetic trees were generated using the Neighbor Joining (NJ) method. The NJ tree was bootstrapped with 1000 replicates. The Arabidopsis sequences used as controls for various subtypes of TNL and CNL and the Prunus RGAs were downloaded from NCBI.
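The bootstrap step can be pictured as resampling alignment columns with replacement and recomputing pairwise distances for each replicate. The sketch below covers only that resampling and an uncorrected p-distance; it does not build the NJ tree, and it is not the CLUSTAL W implementation. The distance measure, the function names, and the toy alignment are assumptions made for illustration.

```python
import random


def p_distance(a, b):
    """Fraction of differing sites between two aligned sequences.

    Columns with a gap ('-') in either sequence are skipped.
    """
    compared = diffs = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue
        compared += 1
        diffs += x != y
    return diffs / compared if compared else 0.0


def distance_matrix(alignment):
    """Pairwise p-distances for a dict mapping name to aligned sequence."""
    names = sorted(alignment)
    return {(i, j): p_distance(alignment[i], alignment[j])
            for i in names for j in names}


def bootstrap_alignments(alignment, replicates=1000, seed=0):
    """Yield alignments rebuilt from columns sampled with replacement."""
    rng = random.Random(seed)
    length = len(next(iter(alignment.values())))
    for _ in range(replicates):
        cols = [rng.randrange(length) for _ in range(length)]
        yield {name: "".join(seq[c] for c in cols)
               for name, seq in alignment.items()}


# Toy alignment with hypothetical sequences, for illustration only.
toy = {
    "seqA": "GLGKTTLA",
    "seqB": "GMGKTTLA",
    "seqC": "GLGK-SLV",
}
matrices = [distance_matrix(rep) for rep in bootstrap_alignments(toy, replicates=100)]
print(len(matrices), "replicate distance matrices")
```

Each replicate matrix would then be passed to an NJ implementation, and the support for a branch is the fraction of replicate trees that contain it.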
List of abbreviations
RGA: Resistance Gene Analog
NBS: Nucleotide Binding Site
LRR: Leucine Rich Repeat
WGD: Whole Genome Duplication
BAC: Bacterial Artificial Chromosome
RPW: Resistance to Powdery Mildew
ARC: Apaf-1, R proteins, and CED-4
TAIR: The Arabidopsis Information Resource
The work was funded by the National Science Foundation Plant Genome Program (#0320544 to D.M.) and the United States Department of Agriculture Cooperative State Research, Education and Extension Service - National Research Initiative - Plant Genome Program (#2005-35300-15452 to A.A.). The fosmid sequences were generated by Dr Thomas Davis as part of the United States Department of Agriculture Cooperative State Research, Education and Extension Service - National Research Initiative - Plant Genome Program Award number 2005-35300-15467.
- Gale MD, Devos KM: Comparative genetics in the grasses. Proc Natl Acad Sci. 1998, 95: 1971-1974. 10.1073/pnas.95.5.1971.
- Vilanova S, Sargent DJ, Arús P, Monfort A: Synteny conservation between two distantly related Rosaceae genomes: Prunus (the stone fruits) and Fragaria (the strawberry). BMC Plant Biol. 2008, 8: 67. 10.1186/1471-2229-8-67.
- Simillion C, Vandepoele K, Van Montagu MC, Zabeau M, Van de Peer Y: The hidden duplication past of Arabidopsis thaliana. Proc Natl Acad Sci U S A. 2002, 99: 13627-32. 10.1073/pnas.212522399.
- Tuskan GA, Difazio S, Jansson S, Bohlmann J, Grigoriev I, Hellsten U, Putnam N, Ralph S, Rombauts S, Salamov A, Schein J, Sterck L, Aerts A, Bhalerao RR, Bhalerao RP, Blaudez D, Boerjan W, Brun A, Brunner A, Busov V, Campbell M, Carlson J, Chalot M, Chapman J, Chen GL, Cooper D, Coutinho PM, Couturier J, Covert S, Cronk Q, Cunningham R, Davis J, Degroeve S, Déjardin A, Depamphilis C, Detter J, Dirks B, Dubchak I, Duplessis S, Ehlting J, Ellis B, Gendler K, Goodstein D, Gribskov M, Grimwood J, Groover A, Gunter L, Hamberger B, Heinze B, Helariutta Y, Henrissat B, Holligan D, Holt R, Huang W, Islam-Faridi N, Jones S, Jones-Rhoades M, Jorgensen R, Joshi C, Kangasjärvi J, Karlsson J, Kelleher C, Kirkpatrick R, Kirst M, Kohler A, Kalluri U, Larimer F, Leebens-Mack J, Leplé JC, Locascio P, Lou Y, Lucas S, Martin F, Montanini B, Napoli C, Nelson DR, Nelson C, Nieminen K, Nilsson O, Pereda V, Peter G, Philippe R, Pilate G, Poliakov A, Razumovskaya J, Richardson P, Rinaldi C, Ritland K, Rouzé P, Ryaboy D, Schmutz J, Schrader J, Segerman B, Shin H, Siddiqui A, Sterky F, Terry A, Tsai CJ, Uberbacher E, Unneberg P, Vahala J, Wall K, Wessler S, Yang G, Yin T, Douglas C, Marra M, Sandberg G, Van de Peer Y, Rokhsar D: The genome of black cottonwood, Populus trichocarpa (Torr. & Gray). Science. 2006, 313: 1596-1604. 10.1126/science.1128691.
- Jaillon O, Aury JM, Noel B, Policriti A, Clepet C, Casagrande A, Choisne N, Aubourg S, Vitulo N, Jubin C, Vezzi A, Legeai F, Hugueney P, Dasilva C, Horner D, Mica E, Jublot D, Poulain J, Bruyère C, Billault A, Segurens B, Gouyvenoux M, Ugarte E, Cattonaro F, Anthouard V, Vico V, Del Fabbro C, Alaux M, Di Gaspero G, Dumas V, Felice N, Paillard S, Juman I, Moroldo M, Scalabrin S, Canaguier A, Le Clainche I, Malacrida G, Durand E, Pesole G, Laucou V, Chatelet P, Merdinoglu D, Delledonne M, Pezzotti M, Lecharny A, Scarpelli C, Artiguenave F, Pè ME, Valle G, Morgante M, Caboche M, Adam-Blondon AF, Weissenbach J, Quétier F, Wincker P, French-Italian Public Consortium for Grapevine Genome Characterization: The grapevine genome sequence suggests ancestral hexaploidization in major angiosperm phyla. Nature. 2007, 449: 463-7. 10.1038/nature06148.
- Velasco R, Zharkikh A, Troggio M, Cartwright DA, Cestaro A, Pruss D, Pindo M, Fitzgerald LM, Vezzulli S, Reid J, Malacarne G, Iliev D, Coppola G, Wardell B, Micheletti D, Macalma T, Facci M, Mitchell JT, Perazzolli M, Eldredge G, Gatto P, Oyzerski R, Moretto M, Gutin N, Stefanini M, Chen Y, Segala C, Davenport C, Demattè L, Mraz A, Battilana J, Stormo K, Costa F, Tao Q, Si-Ammour A, Harkins T, Lackey A, Perbost C, Taillon B, Stella A, Solovyev V, Fawcett JA, Sterck L, Vandepoele K, Grando SM, Toppo S, Moser C, Lanchbury J, Bogden R, Skolnick M, Sgaramella V, Bhatnagar SK, Fontana P, Gutin A, Van de Peer Y, Salamini F, Viola R: A high quality draft consensus sequence of the genome of a heterozygous grapevine variety. PLoS One. 2007, 2: e1326. 10.1371/journal.pone.0001326.
- Tang H, Bowers JE, Wang X, Ming R, Alam M, Paterson AH: Synteny and collinearity in plant genomes. Science. 2008, 320: 486-8. 10.1126/science.1153917.
- Jung S, Main D, Staton M, Cho I, Zhebentyayeva T, Arús P, Abbott A: Synteny conservation between the Prunus genome and both the present and ancestral Arabidopsis genomes. BMC Genomics. 2006, 7: 81. 10.1186/1471-2164-7-81.
- Jung S, Jiwan D, Cho I, Lee T, Abbott A, Sosinski B, Main D: Synteny of Prunus and other model plant species. BMC Genomics. 2009, 10: 76. 10.1186/1471-2164-10-76.
- Lalli DA, Decroocq V, Blenda AV, Schurdi-Levraud V, Garay L, Le Gall O, Damsteegt V, Reighard GL, Abbott AG: Identification and mapping of resistance gene analogs (RGAs) in Prunus: a resistance map for Prunus. Theor Appl Genet. 2005, 111: 1504-13. 10.1007/s00122-005-0079-z.
- Leister D, Kurth J, Laurie DA, Yano M, Sasaki T, Devos K, Graner A, Schulze-Lefert P: Rapid reorganization of resistance gene homologues in cereal genomes. Proc Natl Acad Sci USA. 1998, 95: 370-5. 10.1073/pnas.95.1.370.
- Pan Q, Liu YS, Budai-Hadrian O, Sela M, Carmel-Goren L, Zamir D, Fluhr R: Comparative genetics of nucleotide binding site-leucine rich repeat resistance gene homologues in the genomes of two dicotyledons: tomato and arabidopsis. Genetics. 2000, 155: 309-22.
- Pontaroli AC, Rogers RL, Zhang Q, Davis TM, Folta KM, San Miguel P, Bennetzen JL: Gene content and distribution in the nuclear genome of Fragaria vesca. Plant Genome. 2009, 2: 93-101. 10.3835/plantgenome2008.09.0007.
- Davis TM, Shields ME, Zhang Q, Tombolato-Terzic D, Bennetzen JL, Pontaroli AC, Wang H, Yao Q, Sanmiguel P, Folta KM: An examination of targeted gene neighborhoods in strawberry. BMC Plant Biol. 2010, 10: 81. 10.1186/1471-2229-10-81.
- Meyers BC, Kozik A, Griego A, Kuang H, Michelmore RW: Genome-wide analysis of NBS-LRR-encoding genes in Arabidopsis. Plant Cell. 2003, 15: 809-34. 10.1105/tpc.009308.
- Rairdan GJ, Moffett P: Distinct Domains in the ARC Region of the Potato Resistance Protein Rx Mediate LRR Binding and Inhibition of Activation. Plant Cell. 2006, 18: 2082-2093. 10.1105/tpc.106.042747.
- Meyers BC, Dickerman AW, Michelmore RW, Sivaramakrishnan S, Sobral BW, Young ND: Plant disease resistance genes encode members of an ancient and diverse protein family within the nucleotide-binding superfamily. Plant J. 1999, 20: 317-32. 10.1046/j.1365-313X.1999.t01-1-00606.x.
- Dangl JL, Jones D: Plant pathogens and integrated defence responses to infection. Nature. 2001, 411: 826-33. 10.1038/35081161.
- Xiao S, Ellwood S, Calis O, Patrick E, Li T, Coleman M, Turner JG: Broad-spectrum mildew resistance in Arabidopsis thaliana mediated by RPW8. Science. 2001, 291: 118-20. 10.1126/science.291.5501.118.
- Ameline-Torregrosa C, Wang B, O'Bleness MS, Deshpande S, Zhu H, Roe B, Young ND, Cannon SB: Identification and characterization of nucleotide-binding site-leucine-rich repeat genes in the model plant Medicago truncatula. Plant Physiol. 2008, 146: 5-21. 10.1104/pp.107.104588.
- Kajava AV: Structural diversity of leucine-rich repeat proteins. J Mol Biol. 1998, 277: 519-27. 10.1006/jmbi.1998.1643.
- Thomas CM, Dixon MS, Parniske M, Golstein C, Jones JD: Genetic and molecular analysis of tomato Cf genes for resistance to Cladosporium fulvum. Philos Trans R Soc Lond B Biol Sci. 1998, 353: 1413-24. 10.1098/rstb.1998.0296.
- Song WY, Wang GL, Chen LL, Kim HS, Pi LY, Holsten T, Gardner J, Wang B, Zhai WX, Zhu LH, Fauquet C, Ronald P: A receptor kinase-like protein encoded by the rice disease resistance gene, Xa21. Science. 1995, 270: 1804-6. 10.1126/science.270.5243.1804.
- Swarbreck D, Wilks C, Lamesch P, Berardini TZ, Garcia-Hernandez M, Foerster H, Li D, Meyer T, Muller R, Ploetz L, Radenbaugh A, Singh S, Swing V, Tissier C, Zhang P, Huala E: The Arabidopsis Information Resource (TAIR): gene structure and function annotation. Nucleic Acids Res. 2008, 36 (Database issue): D1009-14.
- The Joint Genome Institute. [http://genome.jgi-psf.org/Poptr1/Poptr1.download.ftp.html]
- The French-Italian Public Consortium. [http://www.genoscope.cns.fr/vitis]
- Haas BJ, Delcher AL, Wortman JR, Salzberg SL: DAGchainer: a tool for mining segmental genome duplications and synteny. Bioinformatics. 2004, 20: 3643-3646. 10.1093/bioinformatics/bth397.
- Larkin MA, Blackshields G, Brown NP, Chenna R, McGettigan PA, McWilliam H, Valentin F, Wallace IM, Wilm A, Lopez R, Thompson JD, Gibson TJ, Higgins DG: Clustal W and Clustal X version 2.0. Bioinformatics. 2007, 23: 2947-8. 10.1093/bioinformatics/btm404.
- Kyoto University Bioinformatics Center. [http://align.genome.jp/]
- Judd WS, Olmstead RG: A survey of tricolpate (eudicot) phylogenetic relationships. American Journal of Botany. 2004, 91: 1627-1644. 10.3732/ajb.91.10.1627.
- Kozik A, Kochetkova E, Michelmore R: GenomePixelizer-a visualization program for comparative genomics within and between species. Bioinformatics. 2002, 18: 335-336. 10.1093/bioinformatics/18.2.335.
- Page RDM: TREEVIEW: An application to display phylogenetic trees on personal computers. Computer Applications in the Biosciences. 1996, 12: 357-358.
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.