Validation of internal reference genes for quantitative real-time PCR in a non-model organism, the yellow-necked mouse, Apodemus flavicollis
BMC Research Notes volume 2, Article number: 264 (2009)
Reference genes are used as internal standards to normalize mRNA abundance in quantitative real-time PCR and thereby allow a direct comparison between samples. So far, most of these expression studies have used human or classical laboratory model species, whereas studies on non-model organisms under in-situ conditions are quite rare. However, only studies in free-ranging populations can reveal the effects of natural selection on the expression levels of functionally important genes. In order to test the feasibility of gene expression studies in wildlife samples we transferred and validated potential reference genes that were developed for lab mice (Mus musculus) to samples of wild yellow-necked mice, Apodemus flavicollis. The stability and suitability of eight potential reference genes were assessed by the programs BestKeeper, NormFinder and geNorm.
Although the three programs used different algorithms the ranking order of reference genes was significantly concordant and geNorm differed in only one, NormFinder in two positions compared to BestKeeper. The genes ordered by their mean rank from the most to the least stable gene were: Rps18, Sdha, Canx, Actg1, Pgk1, Ubc, Rpl13a and Actb. Analyses of the normalization factor revealed best results when the five most stable genes were included for normalization.
We established a SYBR green qPCR assay for liver samples of wild A. flavicollis and conclude that five genes should be used for appropriate normalization. Our study provides the basis for investigating the differential expression of genes under selection under natural conditions in liver samples of A. flavicollis. This approach might also be applicable to other non-model organisms.
Quantitative real-time RT PCR (qPCR) has become a tool with a broad spectrum of use in molecular biology. By quantifying mRNA levels it allows valuable insights into the variation of gene expression between certain individuals or different treatment groups. The most common practice in qPCR is the relative measurement of the expression of a gene of interest after normalization to an internal reference gene. These formerly called house-keeping genes were thought to be constantly expressed in every cell or every tissue and were supposed to be neither up- nor down-regulated. This assumption has been proven false by a growing number of studies [2–4]. All genes seem to be regulated under some conditions and there seems to be no universal reference gene with a constant expression in all tissues [5–9]. Still, relative quantification against an internal reference gene is the most accurate way to detect expression differences, especially in low copy mRNA, because it controls for artificial variation, e.g. due to differences in the amount of sample, RNA extraction or reverse transcription efficiency. Thus, a careful validation of the usefulness of potential reference genes is highly recommended [1, 6, 10–15] but not always applied. So far gene expression studies, and therefore also reference gene validations, are mainly limited to human or classical laboratory organisms, as non-model species often suffer from a lack of available background information. For example, the real-time PCR primer database RTPrimerDB includes 5319 primer sets for animals and humans, whereof 3992 were designed for humans, followed by 805 for mice (Mus musculus) and 454 for rats (Rattus norvegicus) commonly used in labs. But non-model species in particular are of great interest to evolutionary geneticists and ecologists, as classical model species might be poor reflections of wildlife, which faces the constantly changing and challenging conditions of its natural environment. Focusing just on model species could mean working at the expense of ecological and evolutionary realism, and in-situ studies on wild populations are required to account for natural selection conditions.
In this study we established a SYBR green qPCR assay for liver samples obtained from wild-caught Apodemus flavicollis. The yellow-necked mouse is a common European murid in deciduous and mixed forests. It belongs to the subfamily Murinae and has been subject to a broad range of genetic, ecological, evolutionary and parasitological studies [20–25]. Host-parasite interactions are of special interest in this species, as it serves as one of the main reservoirs for vector-borne disease agents (e.g. Salmonella spp., Borreliosis or Hanta virus infections) in Central Europe. The results of our study are the prerequisite to investigate the adaptive variance of expression levels of immune genes, specifically major histocompatibility complex class II genes, in relation to individual pathogen burden, and to test the hypothesis that in a natural environment not only structural sequence variation but also differential expression of adaptive genes is under selection. Therefore, we validated eight potential reference genes from a panel of primer sets that were originally designed for Mus musculus and tested their application for relative gene expression analysis in A. flavicollis.
Results and discussion
Potential reference genes
All 15 tested reference gene primer sets were originally designed for Mus musculus (Table 1). Neither the six primer sets from RTPrimerDB nor the primers for the reference gene B2m of the Mouse Normalisation Gene Panel (Quantace) amplified a product in the related non-model species Apodemus flavicollis. Transferring primer sets from closely related organisms limits the set of genes that are tested and might reduce the chance of finding a good internal reference, as the possible choice depends on the set and number of genes that were used. However, eight intron spanning primer sets of the Mouse Normalisation Gene Panel (Quantace) performed well in A. flavicollis, which is still a comparable number to other validation studies [9, 26–28]. They amplified conserved parts of the succinate dehydrogenase complex (Sdha), γ-actin (Actg1), ribosomal protein S18 (Rps18), ribosomal protein L13a (Rpl13a), phosphoglycerate kinase 1 (Pgk1), calnexin (Canx), β-actin (Actb) and ubiquitin C (Ubc). Further functions and accession numbers are provided in Table 1. As the sequences of the commercial primer sets were unknown we applied molecular cloning and subsequent sequence analysis using the vector primers T7 and M13 to confirm amplicon identity. The GenBank accession numbers are provided in Table 2. All gene identities could be confirmed but the Rpl13a primer set turned out not to be intron spanning. Sequencing revealed that the commercial primer set for Rpl13a amplified part of the small nucleolar RNA (snoRNA) U35, which is situated in the sixth intron of Rpl13a, and part of the seventh exon of Rpl13a.
The arithmetic mean (AM) of the amplification rate E ranged from 1.82 for Actb to 1.88 for Actg1 (Table 2). The coefficient of variation (CV) expresses the variation of the amplification rate between the different qPCR runs. It was 0.05 for all reference genes except for Actg1 and Rps18 (0.06) (Table 2). The lowest Ct-value recorded was 12.87 cycles and the highest was 28.87 cycles. The difference in the Ct-values between the genes within a run ranged from 9.83 cycles to 14.81 cycles (Table 2).
Identification of optimal reference genes
All our analyses of the stability of the reference genes using the different algorithms showed consistent results with only slight differences in the ranking order (Table 3). A Kendall's W test showed a very high concordance of the resulting rank orders (Kendall's W = 0.958, χ2 = 20.108, df = 7, p < 0.01). The resulting mean rank order of the genes from low to high variation was Rps18, Sdha, Canx, Actg1, Pgk1, Ubc, Rpl13a and Actb.
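The concordance statistic can be reproduced with a short calculation. The following is a minimal sketch, not the SPSS procedure itself: the rank vectors are reconstructed from the verbal description of the three rankings in this section (Table 3 is not reproduced here), the geNorm tie between Rps18 and Canx is entered as rank 1.5 for both, and a standard tie correction is assumed.

```python
from collections import Counter

def kendalls_w(rankings):
    """Kendall's coefficient of concordance W with tie correction.

    rankings: list of rank lists, one list per program (rater),
    all in the same gene order. Returns (W, chi-square).
    """
    m = len(rankings)          # number of programs
    n = len(rankings[0])       # number of genes
    rank_sums = [sum(r[i] for r in rankings) for i in range(n)]
    mean_sum = m * (n + 1) / 2
    s = sum((rs - mean_sum) ** 2 for rs in rank_sums)
    # Tie correction term: sum of (t^3 - t) over all tie groups of size t
    ties = sum(t ** 3 - t for r in rankings for t in Counter(r).values())
    w = 12 * s / (m ** 2 * (n ** 3 - n) - m * ties)
    chi2 = m * (n - 1) * w     # approximately chi-square with n - 1 df
    return w, chi2

# Gene order used for all rank vectors (assumed, following the BestKeeper order)
genes = ["Rps18", "Sdha", "Canx", "Pgk1", "Actg1", "Ubc", "Rpl13a", "Actb"]
bestkeeper = [1, 2, 3, 4, 5, 6, 7, 8]
normfinder = [2, 1, 3, 4, 5, 7, 6, 8]    # swaps at positions 1/2 and 6/7
genorm = [1.5, 3, 1.5, 4, 5, 6, 7, 8]    # Rps18 and Canx tied for first

w, chi2 = kendalls_w([bestkeeper, normfinder, genorm])
print(round(w, 3), round(chi2, 3))
```

With these reconstructed ranks the sketch gives values that round to the W and χ2 reported above, which suggests that the reconstruction and the tie handling are at least compatible with the published analysis.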
The software BestKeeper ranked all genes by their Ct-value variance (low to high): Rps18, Sdha, Canx, Pgk1, Actg1, Ubc, Rpl13a and Actb (Table 2). It considers all genes showing a variation in their amount of starting material by a factor of two or more as unstable. In an ideal PCR reaction with an amplification rate of two (100% reaction efficiency) this would be any gene whose Ct-values show a standard deviation (SD of the Ct-value) > 1, which is used as default by BestKeeper. Hibbeler et al. already pointed out that the default setting of BestKeeper might be too strict a rule and limits its use to a very restricted experimental setup. In in-vivo samples, it is difficult to achieve an SD of the Ct-value < 1, as whole-tissue biopsies usually represent a composition of different cell types and therefore show a higher variation. Additionally, in biological samples the reaction efficiency is rarely 100%. We therefore adjusted the SD-threshold for each gene to its specific efficiency. As a consequence we made BestKeeper more applicable but still rejected every gene whose SD of the Ct-value indicated a variation in the starting template by a factor of two. According to our study the first four genes could be considered stable reference genes, as their SD of the Ct-value was lower than their individual SD-threshold, whereas the other genes were considered unstable (Table 2).
The ranking of the computer program NormFinder is not based on the Ct-values but on the expression values. Compared to the BestKeeper ranking only two changes occurred, at the first and the sixth position: Sdha (stability value 0.382, Fig. 1) changed place with Rps18 (0.427) and became the most stable gene, while Rpl13a (0.734) changed place with Ubc (0.771) and became the sixth most stable gene. However, the five most stable genes differ by only 0.084 points in their stability values, while the difference among the last three genes is more than three times larger than this (Fig. 1).
The program geNorm ranks the potential reference genes according to the average pairwise variation in expression of one gene compared to each other gene of the set. It is independent of inter-run variability or different reverse transcription (RT) efficiencies. Only one change occurred compared to the ranking of BestKeeper: Canx becomes, together with Rps18, one of the two most stable genes, which cannot be ranked any further (M for Canx/Rps18 = 0.73) (Fig. 1). Whereas geNorm is prone to identifying co-regulated genes as optimal reference genes, as they would show a constant ratio, NormFinder and BestKeeper do not suffer from this problem. As all three programs produce consistent results we assume that the potential problem of co-regulated genes does not apply to our data.
Number of reference genes
The use of just a single reference gene may result in a more than 6-fold erroneous normalization and it is therefore recommended to use more than one reference gene [1, 30] and calculate a normalization factor (NF) [6, 14]. As Vandesompele et al. pointed out, it is a trade-off between accuracy and feasibility, but it seems inappropriate if the number of reference genes exceeds the number of genes of interest by far. To find the optimal number of reference genes for normalization, geNorm calculates how strongly the stepwise inclusion of a less stable gene changes the normalization factor, expressed as the pairwise variation $V_{n/n+1}$ between $NF_n$ and $NF_{n+1}$ (Fig. 2). We observed the lowest variation $V_{n/n+1}$ between inclusion of the fourth and fifth most stable reference gene ($V_{4/5}$ = 0.164) (Fig. 2). A high $V_{n/n+1}$ means that the inclusion of the next gene had a big effect and that it should still be included in the calculation of an accurate NF. $V_{4/5}$ = 0.164 is a bit higher than the cut-off value of 0.15 suggested by Vandesompele et al. But this is an empirical value and should not be taken as too strict a cut-off, as is already suggested by the geNorm manual itself. Although Actg1 was rejected as a reference gene by the BestKeeper analysis, we would suggest using the first five reference genes Rps18, Canx, Sdha, Pgk1 and Actg1 for calculating an NF in A. flavicollis, as Actg1 only slightly missed the SD-threshold. This is further supported by the results of NormFinder, as we observed a clear increase of the stability value between the fifth and the sixth most stable gene. This increase is more than three times as high as the over-all difference between the first and the fifth gene. This shows that the first five genes are much more similar in expression stability than the last three.
Although we expected higher expression variability due to more heterogeneity in terms of age or physiological stages in our samples we could show that relative quantification via real-time PCR is feasible in samples from wild caught animals. The five genes Rps18, Canx, Sdha, Pgk1 and Actg1 were most stable and should allow an appropriate normalization factor for accurate measurement. We hope that our study will encourage other researchers to apply qPCR in eco-genomic studies on other wildlife species.
We live-trapped wild yellow-necked mice (Apodemus flavicollis) in 2007/08 in a deciduous forest about 35 km north-east of Hamburg, Germany. Mice were anesthetized by inhalation of isoflurane (Forene©) and then sacrificed immediately by cervical dislocation at the trapping site. Liver samples were taken and stored in RNA-Later (Sigma), kept at 4°C for 24 h and then frozen at -20°C until further treatment.
RNA extraction and cDNA synthesis
Thirty mg of liver tissue from each of 14 animals were placed in tubes with 500 μl of QIAzol lysis reagent (Qiagen) and 1.4 mm ceramic beads. Tissue was disrupted in a homogenizer (Precellys, Bertin Technologies) (2 × 10 s at 5000 rpm) and total RNA was extracted following the QIAzol lysis reagent protocol and dissolved in 87.5 μl of water. A DNA digestion with DNase I (RNase-free DNase Kit, Qiagen) and a subsequent clean-up via RNeasy spin columns (Qiagen) were carried out according to the manufacturer's protocol. Total RNA was finally eluted in 60 μl of water and its amount and purity were measured three times with the Nanodrop 1000 (Thermo Scientific) and averaged. Two μg of total RNA were reverse transcribed with Oligo-dT18 primers (5 μM). Reverse transcription was run in triplicates of 40 μl using the SensiMix two step kit (Quantace) according to the manufacturer's protocol. All RT-triplicates were mixed and the resulting cDNA was diluted 1:16 with aqua dest. prior to qPCR.
We chose six rodent primer sets from RTPrimerDB because they potentially amplified reference genes with similar length and identical annealing temperature $T_a$. We also tested nine intron spanning primer sets from the commercially available Mouse Normalisation Gene Panel (Quantace) (Table 1). All these potential reference gene primer sets were originally designed for the model organism Mus musculus and we applied them to our non-model organism A. flavicollis.
Quantitative real-time RT PCR
SYBR green qPCR was performed with the SensiMix two step kit (Quantace) in a 25 μl volume on a Rotor Gene 3000 (Corbett Research). All qPCR reactions were run in triplicates with a no-template control to check for contamination. Each tube contained 4 μl of cDNA template, 12.5 μl SensiMix dT (Quantace), 0.5 μl SYBR Green solution, 0.5 μl primer (50 μM) and 7.5 μl dH2O. The qPCR conditions were 10 min at 95°C and 45 cycles of 95°C for 15 s, 55°C for 20 s and 72°C for 20 s each. Melting curve analysis was performed from 65°C to 95°C in 0.5°C steps, each lasting 5 s, to confirm the presence of a single product and the absence of primer-dimers. The individual amplification rate E for every single reaction tube was determined by the 'comparative quantification' function (Corbett Software 6.1.81) to avoid inter-run variation. E is defined as the average increase of fluorescence in the raw data for five cycles following the 'Takeoff' value. This Takeoff value is specified as the time at which the second derivative of the raw data is at 20% of its maximum (Corbett Software 6.1.81). This point marks the end of the background noise and indicates the transition into the exponential phase of the reaction. E was averaged for each gene over the three replicates in each run. To normalize the raw data the individual background fluorescence from cycle one to the Takeoff value was averaged and all data points of a sample were divided by this average background level ('Dynamic Tube' function, Corbett Software 6.1.81). Individual threshold cycle values (Ct-values) were obtained by setting a threshold manually at 0.01 of the normalized fluorescence, ignoring the first five cycles. The Ct-values for a gene were averaged over the three replicates in each run. We calculated the expression of each gene arbitrarily as $Q = E^{-Ct}$. Note that Q is not the real amount of DNA copies $N_t = N_0 \cdot E^{t}$ at a time point t but rather the fluorescence that is measured, which is proportional to $N_t$.
As we set a fixed fluorescence threshold, $N_t$ at $t = Ct$ is the same constant for every reaction, so that $N_0 \propto E^{-Ct} = Q$. With the known E and the Ct-value, the ratio between two genes depends only upon their start amount of cDNA $N_0$.
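The relation between amplification rate, Ct-value and relative quantity described above can be illustrated with a few lines of code. This is only a sketch with hypothetical function names and made-up input values; it is not part of the Corbett software.

```python
def relative_quantity(efficiency, ct):
    """Arbitrary expression value Q = E^(-Ct) for one gene in one sample."""
    return efficiency ** (-ct)

def start_ratio(e_y, ct_y, e_x, ct_x):
    """Ratio of the starting amounts of two genes Y and X in the same sample."""
    return relative_quantity(e_y, ct_y) / relative_quantity(e_x, ct_x)

# Hypothetical example values (not taken from Table 2)
q = relative_quantity(2.0, 3.0)
ratio = start_ratio(2.0, 20.0, 2.0, 22.0)   # Y crosses the threshold two cycles earlier
print(q, ratio)
```

With an ideal amplification rate of two, a gene that crosses the threshold two cycles earlier started with four times as much template; with real efficiencies below two, the same Ct difference corresponds to a smaller ratio.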
Determination of reference gene expression stability
The stability of the selected reference genes was determined by BestKeeper, NormFinder and geNorm. Concordance between their different ranking orders was tested with Kendall's W implemented in SPSS 16.0.2.
BestKeeper ranks the reference genes by the variation of their Ct-values. The gene with the lowest standard deviation (SD of the Ct-value) is proposed to be the most suitable reference gene. Like BestKeeper, we excluded every gene showing an SD of the Ct-value that would result in a variation of the starting material by a factor of two. But unlike BestKeeper, we calculated this SD-threshold for each gene based on its known over-all run average E: $SD_{threshold} = \log_E 2 = \ln 2 / \ln E$.
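The efficiency-adjusted threshold can be sketched as follows. The formula used here is an assumption derived from the description above (the SD of the Ct-value at which $E^{SD}$ equals a two-fold change in starting material); the function name is hypothetical, and the two efficiencies are the lowest and highest mean amplification rates reported in the Results.

```python
import math

def sd_threshold(efficiency, fold=2.0):
    """SD of the Ct-value that corresponds to a `fold`-fold change in
    starting template for a given amplification rate E (assumed rule:
    E ** SD == fold)."""
    return math.log(fold) / math.log(efficiency)

ideal = sd_threshold(2.0)      # BestKeeper default case, 100% efficiency
actb = sd_threshold(1.82)      # lowest mean E reported (Actb)
actg1 = sd_threshold(1.88)     # highest mean E reported (Actg1)
print(ideal, round(actb, 3), round(actg1, 3))
```

Under this assumption a lower amplification rate yields a more permissive threshold, so every gene measured here would be allowed an SD slightly above the default of one cycle.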
NormFinder instead uses a model-based approach to analyse the variance in the expression data. It allows for intra- and intergroup variation, which makes it more robust against co-expressed genes. In this experiment it was not necessary to distinguish between intra- and intergroup variation as we had only one group of samples. NormFinder calculates a stability value for each gene, and the gene with the lowest value is supposed to be the most stable out of the tested set of genes.
geNorm is based on the simple assumption that the expression of two ideal reference genes will always have the same ratio among samples, regardless of the experimental conditions before the real-time PCR. The ratio between two genes (Y and X) in a sample is $Q_Y / Q_X = E_Y^{-Ct_Y} / E_X^{-Ct_X}$. The average expression stability value M for each gene is calculated using the expression data. M is the average pairwise variation of a gene compared with each of the other potential reference genes. The average M of all genes together is then calculated by stepwise exclusion of the least stable gene until the two most stable genes of the set remain, which cannot be ranked any further.
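A minimal sketch of the stability measure M may help to make the pairwise logic concrete. It assumes the commonly described form of the calculation (standard deviation of the log2-transformed expression ratios across samples, averaged over all partner genes); the function name and the toy data are hypothetical and this is not the geNorm program itself.

```python
import math
from statistics import mean, stdev

def genorm_m(expression):
    """expression: dict mapping gene name -> list of relative quantities Q,
    one value per sample. Returns dict gene -> stability value M."""
    genes = list(expression)
    m_values = {}
    for j in genes:
        pairwise_sd = []
        for k in genes:
            if k == j:
                continue
            # log2 ratio of gene j to gene k in every sample
            log_ratios = [math.log2(a / b)
                          for a, b in zip(expression[j], expression[k])]
            pairwise_sd.append(stdev(log_ratios))
        m_values[j] = mean(pairwise_sd)
    return m_values

# Toy data (made up): geneA and geneB keep a constant ratio, geneC does not
toy = {
    "geneA": [1.0, 2.0, 4.0],
    "geneB": [2.0, 4.0, 8.0],
    "geneC": [1.0, 4.0, 1.0],
}
m = genorm_m(toy)
print(m)
```

In the toy data geneA and geneB vary strongly between samples but always in parallel, so they receive the same low M. This is the co-regulation effect that a ratio-based measure cannot distinguish from true stability.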
geNorm also allows estimating the optimal number of reference genes that should be used for normalization. It calculates the normalization factor (NF) based on the geometric mean of the expression of more than one reference gene. The more reference genes are included in this NF, the less weight possible outliers carry. On the other hand, using too many genes might include unstable reference genes, making it less accurate. geNorm calculates $NF_n$ for the two most stable reference genes based on the geometric mean of the expression data and then $NF_{n+1}$ with the next most stable gene. To determine how many genes should be used for accurate normalization, the pairwise variation $V_{n/n+1}$ was determined from two sequential normalization factors ($NF_n$ and $NF_{n+1}$).
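The stepwise comparison of normalization factors can be sketched in the same way. The code assumes that the pairwise variation is the standard deviation, across samples, of the log2 ratio of two sequential normalization factors; function names and data are hypothetical and only illustrate the idea.

```python
import math
from statistics import stdev

def geometric_mean(values):
    return math.exp(sum(math.log(v) for v in values) / len(values))

def normalization_factors(expression, genes):
    """Per-sample NF: geometric mean of the expression of the given genes."""
    n_samples = len(expression[genes[0]])
    return [geometric_mean([expression[g][s] for g in genes])
            for s in range(n_samples)]

def pairwise_variation(expression, ranked_genes, n):
    """V(n/n+1): effect of adding the (n+1)-th most stable gene to the NF."""
    nf_n = normalization_factors(expression, ranked_genes[:n])
    nf_next = normalization_factors(expression, ranked_genes[:n + 1])
    return stdev([math.log2(a / b) for a, b in zip(nf_n, nf_next)])

# Toy data (made up), genes listed from most to least stable
toy = {
    "geneA": [1.0, 2.0, 4.0],
    "geneB": [2.0, 4.0, 8.0],
    "geneC": [1.0, 4.0, 1.0],
}
v_toy = pairwise_variation(toy, ["geneA", "geneB", "geneC"], 2)

# Adding a gene that behaves exactly like the others leaves the NF unchanged
flat = {"a": [1.0, 2.0, 4.0], "b": [1.0, 2.0, 4.0], "c": [1.0, 2.0, 4.0]}
v_flat = pairwise_variation(flat, ["a", "b", "c"], 2)
print(v_toy, v_flat)
```

A value close to zero means that the additional gene hardly changes the normalization factor, whereas a large value means that its inclusion still has a marked effect.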
All research reported in this manuscript adhered to the legal requirements of Germany and complied with the protocols approved by the responsible state office for Agriculture, Environment and Rural Areas of Schleswig-Holstein (Referenz No: LANU 315/53220.127.116.11).
Bustin SA, Benes V, Nolan T, Pfaffl MW: Quantitative real-time RT-PCR - a perspective. J Mol Endocrinol. 2005, 34: 597-601. 10.1677/jme.1.01755.
Selvey S, Thompson EW, Matthaei K, Lea RA, Irving MG, Griffiths LR: β-Actin - an unsuitable internal control for RT-PCR. Mol Cell Probes. 2001, 15: 307-311. 10.1006/mcpr.2001.0376.
Tricarico C, Pinzani P, Bianchi S, Paglierani M, Distante V, Pazzagli M, Bustin SA, Orlando C: Quantitative real-time reverse transcription-polymerase chain reaction: normalization to rRNA or single housekeeping genes is inappropriate for human tissue biopsies. Anal Biochem. 2002, 309 (2): 293-300. 10.1016/S0003-2697(02)00311-1.
Bas A, Forsberg G, Hammarström S, Hammarström ML: Utility of the Housekeeping Genes 18S rRNA, β-Actin and Glyceraldehyde-3-Phosphate-Dehydrogenase for Normalization in Real-Time Quantitative Reverse Transcriptase-Polymerase Chain Reaction Analysis of Gene Expression in Human T Lymphocytes. Scand J Immunol. 2004, 59 (6): 566-573. 10.1111/j.0300-9475.2004.01440.x.
Thellin O, Zorzi W, Lakaye B, De Borman B, Coumans B, Hennen G, Grisar T, Igout A, Heinen E: Housekeeping genes as internal standards: use and limits. J Biotechnol. 1999, 75: 291-295. 10.1016/S0168-1656(99)00163-7.
Vandesompele J, De Preter K, Pattyn F, Poppe B, Van Roy N, De Paepe A, Speleman F: Accurate normalization of real-time quantitative RT-PCR data by geometric averaging of multiple internal control genes. Genome Biol. 2002, 3: 7-10.1186/gb-2002-3-7-research0034.
Kouadjo KE, Nishida Y, Cadrin-Girard JF, Yoshioka M, St-Amand J: Housekeeping and tissue-specific genes in mouse tissues. BMC Genomics. 2007, 8: 127-10.1186/1471-2164-8-127.
Hibbeler S, Scharsack JP, Becker S: Housekeeping genes for quantitative expression studies in the three-spined stickleback Gasterosteus aculeatus. BMC Mol Biol. 2008, 9: 18-10.1186/1471-2199-9-18.
Ahn K, Huh J-W, Park S-J, Kim D-S, Ha H-S, Kim Y-J, Lee JR, Chang K-T, Kim H-S: Selection of internal reference genes for SYBR green qRT-PCR studies of rhesus monkey (Macaca mulatta) tissues. BMC Mol Biol. 2008, 9: 78-10.1186/1471-2199-9-78.
Huggett J, Dheda K, Bustin S, Zumla A: Real-time RT-PCR normalisation; strategies and considerations. Genes Immun. 2005, 6: 279-284. 10.1038/sj.gene.6364190.
Livak KJ, Schmittgen TD: Analysis of Relative Gene Expression Data Using Real-Time Quantitative PCR and the 2-ΔΔCt Method. Methods. 2001, 25: 402-408. 10.1006/meth.2001.1262.
Lindbjerg Andersen C, Jensen JL, Ørntoft TF: Normalization of Real-Time Quantitative Reverse Transcription-PCR Data: A Model-Based Variance Estimation Approach to Identify Genes Suited for Normalization, Applied to Bladder and Colon Cancer Data Sets. Cancer Res. 2004, 64: 5245-5250. 10.1158/0008-5472.CAN-04-0496.
Kubista M, Andrade JM, Bengtsson M, Forootan A, Jonák J, Lind K, Sindelka R, Sjöback R, Sjögreen B, Strömbom L, et al: The real-time polymerase chain reaction. Mol Asp Med. 2006, 27: 95-125. 10.1016/j.mam.2005.12.007.
Pfaffl MW, Tichopad A, Prgomet C, Neuvians T: Determination of stable housekeeping genes, differentially regulated target genes and sample integrity: BestKeeper - Excel-based tool using pair-wise correlations. Biotechnol Lett. 2004, 26: 509-515. 10.1023/B:BILE.0000019559.84305.47.
Szabo A, Perou CM, Karaca M, Perreard L, Quackenbush JF, Bernard PS: Statistical modeling for selecting housekeeper genes. Genome Biol. 2004, 5: R59-10.1186/gb-2004-5-8-r59.
Bustin S: Real-time, fluorescence-based quantitative PCR: a snapshot of current procedures and preferences. Expert Review of Molecular Diagnostics. 2005, 5 (4): 493-498. 10.1586/1473718.104.22.1683.
Lefever S, Vandesompele J, Speleman F, Pattyn F: RTPrimerDB: the portal for real-time PCR primers and probes. Nucl Acids Res. 2008, 37 (Database issue): D942-D945.
Feder ME, Mitchell-Olds T: Evolutionary and ecological functional genomics. Nat Rev Genet. 2003, 4 (8): 649-655. 10.1038/nrg1128.
Michaux JR, Chevret P, Filippucci MG, Macholan M: Phylogeny of the genus Apodemus with a special emphasis on the subgenus Sylvaemus using the nuclear IRBP gene and two mitochondrial markers: Cytochrome b and 12S rRNA. Mol Phylogenet Evol. 2002, 23 (2): 123-136. 10.1016/S1055-7903(02)00007-6.
Meyer-Lucht Y, Sommer S: MHC diversity and the association to nematode parasitism in the yellow-necked mouse (Apodemus flavicollis). Mol Ecol. 2005, 14: 2233-2243. 10.1111/j.1365-294X.2005.02557.x.
Musolf K, Meyer-Lucht Y, Sommer S: Evolution of MHC-DRB class II polymorphism in the genus Apodemus and a comparison of DRB sequences within the family Muridae (Mammalia: Rodentia). Immunogenetics. 2004, 56: 420-426. 10.1007/s00251-004-0715-9.
Horváth G, Wagner Z: Effect of densities of two coexistent small mammal populations on the survival of Apodemus flavicollis in a forest habitat. TISCIA (Szeged). 2003, 34: 41-46.
Miklós P, Žiak D: Microhabitat selection by three small mammal species in oak-elm forest. Folia Zool. 2002, 51 (4): 275-288.
Ferrari N, Cattadori IM, Nespereira J, Rizzoli A, Hudson PJ: The role of host sex in parasite dynamics: field experiments on the yellow-necked mouse Apodemus flavicollis. Ecol Lett. 2003, 6: 1-7. 10.1046/j.1461-0248.2003.00399.x.
Klimpel S, Förster M, Schmahl G: Parasites of two abundant sympatric rodent species in relation to host phylogeny and ecology. Parasitol Res. 2007, 100: 867-875. 10.1007/s00436-006-0368-8.
Nygard A-B, Jørgensen CB, Cirera S, Fredholm M: Selection of reference genes for gene expression studies in pig tissues using SYBR green qPCR. BMC Mol Biol. 2007, 8: 67-10.1186/1471-2199-8-67.
Bogaert L, Van Poucke M, De Baere C, Peelman L, Gasthuys F, Martens A: Selection of a set of reliable reference genes for quantitative real-time PCR in normal equine skin and in equine sarcoids. BMC Biotechnol. 2006, 6: 24-10.1186/1472-6750-6-24.
Tatsumi K, Ohashi K, Taminishi S, Okano T, Yoshioka M, Shima M: Reference gene selection for real-time RT-PCR in regenerating mouse livers. Biochem Biophys Res Commun. 2008, 374: 106-110. 10.1016/j.bbrc.2008.06.103.
Bustin SA, Nolan T: Pitfalls of Quantitative Real-Time Reverse- Transcription Polymerase Chain Reaction. J Biomol Tech. 2004, 15: 155-166.
Bustin SA: Quantification of mRNA using real-time reverse transcription PCR (RT-PCR): trends and problems. J Mol Endocrinol. 2002, 29 (1): 23-39. 10.1677/jme.0.0290023.
This study was funded by the Federal Ministry of Education and Research (BMBF). We thank A. Drews, state office for Agriculture, Environment and Rural Areas of Schleswig-Holstein and J. Stäcker, district forester for their support and permission to carry out this study.
The authors declare that they have no competing interests.
JA performed the sample collection, data acquisition, analysis and interpretation as well as drafting the manuscript. SS was responsible for the overall study design, supervised the study and helped to draft the manuscript. Both authors read and approved the final manuscript.