Medaka: a promising model animal for comparative population genomics
- Yoshifumi Matsumoto†1, 7,
- Hiroki Oota†1,
- Yoichi Asaoka2,
- Hiroshi Nishina2,
- Koji Watanabe3,
- Janusz M Bujnicki4, 5, 6,
- Shoji Oda1,
- Shoji Kawamura1 and
- Hiroshi Mitani1
© Oota et al; licensee BioMed Central Ltd. 2009
Received: 25 March 2009
Accepted: 10 May 2009
Published: 10 May 2009
Within-species genome diversity has been best studied in humans. The international HapMap project has revealed a tremendous number of single-nucleotide polymorphisms (SNPs) among humans, many of which show signals of positive selection during human evolution. In most cases, however, functional differences between the alleles remain experimentally unverified because of the inherent difficulty of human genetic studies. It would therefore be highly useful to have a vertebrate model with the following characteristics: (1) high within-species genetic diversity, (2) a variety of gene-manipulation protocols already developed, and (3) a completely sequenced genome. Medaka (Oryzias latipes) and its congeneric species, tiny freshwater teleosts distributed broadly in East and Southeast Asia, meet these criteria.
Using Oryzias species from 27 local populations, we conducted a simple screening of nonsynonymous SNPs for 11 genes with apparent orthology between medaka and humans. We found medaka SNPs for which the same sites in human orthologs are known to be highly differentiated among the HapMap populations. Importantly, some of these SNPs show signals of positive selection.
These results indicate that medaka is a promising model system for comparative population genomics exploring the functional and adaptive significance of allelic differentiations.
The accumulation of human genetic polymorphism data provided by sources such as the international HapMap project [1, 2] has revealed a number of SNP sites with markedly different allele frequencies among human populations. Such data make systematic searches for disease-causing or drug-responsive genomic regions possible [3, 4], and the accumulated SNP data can also provide compelling evidence of positive selection during human evolution [5, 6]. An unavoidable limitation, however, is that mutagenesis and/or crossing experiments to elucidate functional differences between alleles at these polymorphic sites are practically impossible in humans. A vertebrate model animal with a broad geographic distribution and documented high genetic polymorphism could serve as a "natural library" of genetic variation in genes that are orthologous to human genes and could be under similar selective pressures.
The medaka (Oryzias latipes) is a notable candidate for such a model animal. This small freshwater fish is found in East Asia, with closely related congeneric species broadly distributed throughout Southeast Asia, and it has been used as an experimental animal since the early 20th century. A number of inbred medaka strains have been established, and transgenesis and mutagenesis protocols have been developed, suggesting that medaka has great potential for use in systematic genetic analyses [7–10]. Medaka genome sequences are also available. The greatest advantage of medaka is its enormous genetic diversity compared with the other fish models (zebrafish, pufferfish, etc.): the average nucleotide difference of 3.4% between two inbred medaka strains is the highest documented thus far in any vertebrate. In this study, our purpose is to assess the validity of medaka as a resource for comparative population genomics.
Japanese medaka (Oryzias latipes) comprise four geographical populations. We selected 24 wild-type strains of Japanese medaka (see Additional file 1) and three closely related congeneric species (O. curvinotus, O. luzonensis and O. celebensis; see Additional file 2). We also examined an inbred strain (Hd-rR) of Southern Japanese origin.
PCR-direct sequence, mRNA extraction and cDNA sequence
The 11 genes examined in this study
Gene ontology "biological process" annotation
- alcohol metabolic process
- I-kappaB kinase/NF-kappaB cascade
- coagulation factor II
- regulation of G-protein coupled receptor protein signaling pathway
- required for axial rotation and left-right specification
- solute carrier family 24, member 5
- solute carrier family 30 (zinc transporter), member 9
- solute carrier family 45, member 2
- opsin 1 (cone pigments), long-wave-sensitive (color blindness, protan)
- response to temperature stimulus
Statistical and phylogenetic analysis
Nucleotide sequences were aligned using CLUSTALW. The pairwise dN and dS values among strains for the 11 genes were calculated with DnaSP software (version 4.0) according to the Nei-Gojobori method. Insertions and deletions (indels) were excluded from the analysis. For the entire nucleotide sequence of RTTN, the dN - dS values and p-values were calculated with MEGA 4 according to the Nei-Gojobori method, with statistical significance tested by Z-tests.
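The Nei-Gojobori calculation named above can be illustrated with a minimal Python sketch for a single pair of aligned coding sequences. This is not the DnaSP or MEGA implementation. The function names (`syn_sites`, `codon_diffs`, `nei_gojobori`) are hypothetical, and the sketch assumes the standard genetic code, an in-frame alignment, and a Jukes-Cantor correction of the proportions. Codons containing gaps, ambiguous bases or stop codons are skipped. Single-base changes that create a stop codon are counted as nonsynonymous when tallying sites, which is a simplification on which implementations may differ.

```python
from itertools import permutations
from math import log

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order ("*" = stop).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {a + b + c: AMINO[16 * i + 4 * j + k]
        for i, a in enumerate(BASES)
        for j, b in enumerate(BASES)
        for k, c in enumerate(BASES)}


def syn_sites(codon):
    """Number of synonymous sites (0..3) in one sense codon."""
    s = 0.0
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue
            mutant = codon[:pos] + base + codon[pos + 1:]
            if CODE[mutant] == CODE[codon]:
                s += 1 / 3  # each of the 3 possible changes weighs 1/3
    return s


def codon_diffs(c1, c2):
    """Synonymous and nonsynonymous differences between two sense codons,
    averaged over all mutational pathways that avoid stop codons."""
    positions = [i for i in range(3) if c1[i] != c2[i]]
    syn_total = non_total = 0.0
    n_paths = 0
    for order in permutations(positions):
        current, syn, non = c1, 0, 0
        for pos in order:
            nxt = current[:pos] + c2[pos] + current[pos + 1:]
            if CODE[nxt] == "*":
                break  # discard pathways passing through a stop codon
            if CODE[nxt] == CODE[current]:
                syn += 1
            else:
                non += 1
            current = nxt
        else:  # pathway completed without hitting a stop codon
            syn_total += syn
            non_total += non
            n_paths += 1
    if n_paths == 0:
        return 0.0, 0.0
    return syn_total / n_paths, non_total / n_paths


def nei_gojobori(seq1, seq2):
    """Return (dN, dS) for two aligned, in-frame coding sequences."""
    S = Sd = Nd = 0.0
    n_codons = 0
    for i in range(0, min(len(seq1), len(seq2)) - 2, 3):
        c1, c2 = seq1[i:i + 3].upper(), seq2[i:i + 3].upper()
        # Skip codons with gaps, ambiguous bases, or stop codons.
        if CODE.get(c1, "*") == "*" or CODE.get(c2, "*") == "*":
            continue
        S += (syn_sites(c1) + syn_sites(c2)) / 2
        sd, nd = codon_diffs(c1, c2)
        Sd += sd
        Nd += nd
        n_codons += 1
    N = 3 * n_codons - S
    p_s, p_n = Sd / S, Nd / N

    def jukes_cantor(p):
        # Undefined for p >= 0.75 (saturated sequences).
        return -0.75 * log(1 - 4 * p / 3)

    return jukes_cantor(p_n), jukes_cantor(p_s)


dN, dS = nei_gojobori("TTTGCA", "GTAGCA")
print(f"dN = {dN:.3f}, dS = {dS:.3f}, dN - dS = {dN - dS:.3f}")
```

The toy sequences in the last lines are invented for illustration only. In practice the difference dN - dS would be computed for every pair of strains, and its significance assessed with a Z-test whose variance is typically estimated by bootstrap resampling of codons; that step is not shown here.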
Protein structure prediction
The GeneSilico metaserver was used to predict protein secondary structure and order/disorder, and to carry out fold recognition (i.e. to match the query sequence with structurally characterized templates). Potential phosphorylation sites were predicted using a semi-independent component of the metaserver available at the URL http://genesilico.pl/Phosphoserver/. For the THEA2 protein, the metaserver indicated very high similarity (PCONS score 3.28) of residues 1–360 (human numbering) to known acyl-CoA hydrolase structures (e.g. 2gvh in the Protein Data Bank) and high similarity of residues 360–607 (PCONS score 2.00) to lipid transfer proteins from the STAR family (e.g. 1ln1 in the Protein Data Bank). Long regions of intrinsic conformational disorder were predicted for loops connecting structural domains (around residues 160–200 and 340–370). For the RTTN protein, the metaserver identified the α-helical armadillo domain of β-catenin (1i7w in the Protein Data Bank) as the best modeling template, in particular for residues 1–120, with a high confidence score (PCONS score 1.67). Long regions of structural disorder, devoid of secondary and tertiary structure, were predicted for residues 120–160 and 280–370. Three-dimensional structural models of the ordered (i.e. stably folded) parts of the THEA2 and RTTN proteins were generated and optimized using the FRankenstein's Monster method. The final models were evaluated as good quality by the PROQ server. The models were expected to exhibit a root mean square deviation from the true structures on the order of 2–4 Å, suggesting that they are sufficiently reliable to make functional predictions at the level of individual amino acid residues. The atomic details of these models, however, should be interpreted with caution.
Results and discussion
The dN - dS values (upper diagonal) and the significance (lower diagonal) based on RTTN cDNA (5.8 kb) sequences
Although its exact function is not known, RTTN is reported to be involved in determining the rotation of the body axis and the left-right asymmetry of internal organs during the embryonic development of mice. The conspicuous differentiation of RTTN alleles among human populations also suggests that natural selection has acted differently on different populations: at a nonsynonymous SNP site (rs3911730) in RTTN exon 3, the A/A genotype occurs in 90% of Africans and 2% of Europeans and is absent in Asians, while the C/C genotype occurs in 3% of Africans, 80% of Europeans and 100% of Asians.
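To put a number on how strongly these genotype frequencies differ, the Python sketch below converts them to allele frequencies and computes Wright's F_ST, one common measure of between-population differentiation. It rests on several assumptions that are not stated in the text above: the site has only the two alleles A and C, every individual who is neither A/A nor C/C is an A/C heterozygote, and the three populations are weighted equally. The function names are hypothetical.

```python
def allele_freq(hom_freq, het_freq):
    """Frequency of an allele from its homozygote and heterozygote frequencies."""
    return hom_freq + het_freq / 2


def fst(freqs):
    """Wright's F_ST for one biallelic site: Var(p) / (p_bar * (1 - p_bar)),
    with equal weight for each population (an assumption)."""
    n = len(freqs)
    p_bar = sum(freqs) / n
    var_p = sum((p - p_bar) ** 2 for p in freqs) / n
    return var_p / (p_bar * (1 - p_bar))


# Genotype frequencies quoted in the text: (A/A, C/C) per population.
genotypes = {
    "African": (0.90, 0.03),
    "European": (0.02, 0.80),
    "Asian": (0.00, 1.00),
}

# Assumption: the remainder of each population is A/C heterozygous.
freq_A = {pop: allele_freq(aa, 1 - aa - cc) for pop, (aa, cc) in genotypes.items()}
fst_value = fst(list(freq_A.values()))

for pop, p in freq_A.items():
    print(f"{pop}: frequency of allele A = {p:.3f}")
print(f"F_ST = {fst_value:.2f}")
```

Under these assumptions the A allele frequency is 0.935 in Africans, 0.11 in Europeans and 0 in Asians, giving an F_ST of about 0.77, which is a very high level of differentiation for a single site.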
Previous studies have reported that genes identified in fish through "forward genetic" analysis of phenotypic mutants are involved in forming variations of related phenotypes in humans, e.g. of skin pigmentation [20–24] and epithelial development. Our approach extends these studies as a form of "reverse genetics": it targets genes that show, as a signature of natural selection, a prominent level of allele-frequency differentiation among populations with different ecological histories in both fish and humans. We found that, out of the 11 genes in our analysis, the medaka THEA2 gene has a nonsynonymous polymorphic site at exactly the same position as its human ortholog, and the RTTN gene shows signs of population differentiation that can plausibly be explained by natural selection. The aim of our analysis is not to demonstrate natural selection in medaka, but to show that medaka is a valuable "natural library" of genetic diversity and that this approach is efficient enough to find candidate genes targeted by natural selection in both humans and medaka. The exact function of the genes and the exact nature of the functional differences between alleles can be studied more feasibly in medaka, where crossing experiments between genotypes of interest and transgenic techniques have already been established [7, 8]. This method can be applied to any polymorphic gene in humans, and larger-scale, more systematic screening of orthologous gene polymorphisms in medaka will identify various target genes for further functional analyses. Because medaka has been widely used in carcinogenesis and ecotoxicological studies, for example in screening for genetic variants that affect carcinogenesis and responses to ecotoxins, it could also be used to test variation in drug response relevant to humans.
Thus, we conclude that the medaka is a good vertebrate model of the functional diversity caused by human DNA polymorphisms that have been identified by recent resequencing and typing efforts.
This work was supported by a Grant-in-Aid for Scientific Research (A) from the Japan Society for the Promotion of Science (JSPS) (19207018) to SK, by a Grant-in-Aid for Scientific Research (C) from JSPS (19570226) to HO, and by a Grant-in-Aid for Scientific Research in the Priority Area "Comparative Genomics" (#015) from the Ministry of Education, Culture, Sports, Science and Technology of Japan (MEXT) to HM. We thank Professor Emeritus Akihiro Shima and Dr. Atsuko Shimada (the University of Tokyo) for their efforts in maintaining medaka stocks from wild populations.
- The International HapMap Consortium: A haplotype map of the human genome. Nature. 2005, 437: 1299-1320. 10.1038/nature04226.
- Frazer KA, Ballinger DG, Cox DR, Hinds DA, Stuve LL, Gibbs RA, Belmont JW, Boudreau A, Hardenbol P, Leal SM, et al: A second generation human haplotype map of over 3.1 million SNPs. Nature. 2007, 449: 851-861. 10.1038/nature06258.
- Conrad DF, Jakobsson M, Coop G, Wen X, Wall JD, Rosenberg NA, Pritchard JK: A worldwide survey of haplotype variation and linkage disequilibrium in the human genome. Nat Genet. 2006, 38: 1251-1260. 10.1038/ng1911.
- McVean G, Spencer CC, Chaix R: Perspectives on human genetic variation from the HapMap Project. PLoS Genet. 2005, 1: e54. 10.1371/journal.pgen.0010054.
- Voight BF, Kudaravalli S, Wen X, Pritchard JK: A map of recent positive selection in the human genome. PLoS Biol. 2006, 4: e72. 10.1371/journal.pbio.0040072.
- Williamson SH, Hubisz MJ, Clark AG, Payseur BA, Bustamante CD, Nielsen R: Localizing recent adaptive evolution in the human genome. PLoS Genet. 2007, 3: e90. 10.1371/journal.pgen.0030090.
- Wittbrodt J, Shima A, Schartl M: Medaka – a model organism from the far East. Nat Rev Genet. 2002, 3: 53-64. 10.1038/nrg704.
- Shima A, Mitani H: Medaka as a research organism: past, present and future. Mech Dev. 2004, 121: 599-604. 10.1016/j.mod.2004.03.011.
- Naruse K, Hori H, Shimizu N, Kohara Y, Takeda H: Medaka genomics: a bridge between mutant phenotype and gene function. Mech Dev. 2004, 121: 619-628. 10.1016/j.mod.2004.04.014.
- Matsumoto Y, Fukamachi S, Mitani H, Kawamura S: Functional characterization of visual opsin repertoire in Medaka (Oryzias latipes). Gene. 2006, 371: 268-278. 10.1016/j.gene.2005.12.005.
- Kasahara M, Naruse K, Sasaki S, Nakatani Y, Qu W, Ahsan B, Yamada T, Nagayasu Y, Doi K, Kasai Y, et al: The medaka draft genome and insights into vertebrate genome evolution. Nature. 2007, 447: 714-719. 10.1038/nature05846.
- Thompson JD, Higgins DG, Gibson TJ: CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 1994, 22: 4673-4680. 10.1093/nar/22.22.4673.
- Nei M, Gojobori T: Simple methods for estimating the numbers of synonymous and nonsynonymous nucleotide substitutions. Mol Biol Evol. 1986, 3: 418-426.
- Tamura K, Dudley J, Nei M, Kumar S: MEGA4: Molecular Evolutionary Genetics Analysis (MEGA) software version 4.0. Mol Biol Evol. 2007, 24: 1596-1599. 10.1093/molbev/msm092.
- Kurowski MA, Bujnicki JM: GeneSilico protein structure prediction meta-server. Nucleic Acids Res. 2003, 31: 3305-3307. 10.1093/nar/gkg557.
- Kosinski J, Cymerman IA, Feder M, Kurowski MA, Sasin JM, Bujnicki JM: A "FRankenstein's monster" approach to comparative modeling: merging the finest fragments of Fold-Recognition models and iterative model refinement aided by 3D structure evaluation. Proteins. 2003, 53 (Suppl 6): 369-379. 10.1002/prot.10545.
- Wallner B, Elofsson A: Identification of correct regions in protein models using structural, alignment, and consensus information. Protein Sci. 2006, 15: 900-913. 10.1110/ps.051799606.
- Adams SH, Chui C, Schilbach SL, Yu XX, Goddard AD, Grimaldi JC, Lee J, Dowd P, Colman S, Lewin DA: BFIT, a unique acyl-CoA thioesterase induced in thermogenic brown adipose tissue: cloning, organization of the human gene and assessment of a potential link to obesity. Biochem J. 2001, 360: 135-142. 10.1042/0264-6021:3600135.
- Faisst AM, Alvarez-Bolado G, Treichel D, Gruss P: Rotatin is a novel gene required for axial rotation and left-right specification in mouse embryos. Mech Dev. 2002, 113: 15-28. 10.1016/S0925-4773(02)00003-5.
- Fukamachi S, Shimada A, Shima A: Mutations in the gene encoding B, a novel transporter protein, reduce melanin content in medaka. Nat Genet. 2001, 28: 381-385. 10.1038/ng584.
- Lamason RL, Mohideen MA, Mest JR, Wong AC, Norton HL, Aros MC, Jurynec MJ, Mao X, Humphreville VR, Humbert JE, et al: SLC24A5, a putative cation exchanger, affects pigmentation in zebrafish and humans. Science. 2005, 310: 1782-1786. 10.1126/science.1116238.
- Nakayama K, Fukamachi S, Kimura H, Koda Y, Soemantri A, Ishida T: Distinctive distribution of AIM1 polymorphism among major human populations with different skin color. J Hum Genet. 2002, 47: 92-94. 10.1007/s100380200007.
- Fukamachi S, Kinoshita M, Tsujimura T, Shimada A, Oda S, Shima A, Meyer A, Kawamura S, Mitani H: Rescue From Oculocutaneous Albinism Type 4 Using Medaka slc45a2 cDNA Driven by Its Own Promoter. Genetics. 2008, 178: 761-769. 10.1534/genetics.107.073387.
- Miller CT, Beleza S, Pollen AA, Schluter D, Kittles RA, Shriver MD, Kingsley DM: cis-Regulatory changes in Kit ligand expression and parallel evolution of pigmentation in sticklebacks and humans. Cell. 2007, 131: 1179-1189. 10.1016/j.cell.2007.10.055.
- Kondo S, Kuwahara Y, Kondo M, Naruse K, Mitani H, Wakamatsu Y, Ozato K, Asakawa S, Shimizu N, Shima A: The medaka rs-3 locus required for scale development encodes ectodysplasin-A receptor. Curr Biol. 2001, 11: 1202-1206. 10.1016/S0960-9822(01)00324-4.
- Wright S: Evolution in Mendelian Populations. Genetics. 1931, 16: 97-159.
- Sabeti PC, Schaffner SF, Fry B, Lohmueller J, Varilly P, Shamovsky O, Palma A, Mikkelsen TS, Altshuler D, Lander ES: Positive natural selection in the human lineage. Science. 2006, 312: 1614-1620. 10.1126/science.1124309.
- Han Y, Gu S, Oota H, Osier MV, Pakstis AJ, Speed WC, Kidd JR, Kidd KK: Evidence of positive selection on a class I ADH locus. Am J Hum Genet. 2007, 80: 441-456. 10.1086/512485.
- Oota H, Pakstis AJ, Bonne-Tamir B, Goldman D, Grigorenko E, Kajuna SL, Karoma NJ, Kungulilo S, Lu RB, Odunsi K, et al: The evolution and population genetics of the ALDH2 locus: random genetic drift, selection, and low levels of recombination. Ann Hum Genet. 2004, 68: 93-109. 10.1046/j.1529-8817.2003.00060.x.
- Myles S, Tang K, Somel M, Green RE, Kelso J, Stoneking M: Identification and analysis of genomic regions with large between-population differentiation in humans. Ann Hum Genet. 2008, 72: 99-110.