Statistical analysis of post mortem DNA damage-derived miscoding lesions in Neandertal mitochondrial DNA
© Vives et al; licensee BioMed Central Ltd. 2008
Received: 30 April 2008
Accepted: 10 July 2008
Published: 10 July 2008
We have analysed the distribution of post mortem DNA damage-derived miscoding lesions from the datasets of seven published Neandertal specimens that have extensive cloned sequence coverage over the mitochondrial DNA (mtDNA) hypervariable region 1 (HVS1). The analysis was restricted to C→T and G→A miscoding lesions (the predominant manifestation of post mortem damage) that are seen in more than one clone among sequences from a single PCR, but do not represent the true endogenous sequence.
The data indicate an extreme bias towards C→T over G→A miscoding lesions (observed ratio of 67:2 compared to an expected ratio of 7:2), implying that the mtDNA Light strand molecule suffers proportionally more damage-derived miscoding lesions than the Heavy strand.
The clustering of Cs in the Light strand, as opposed to the singleton pattern of Cs in the Heavy strand, could explain the observed bias, a phenomenon that could be further tested with non-PCR based approaches. The characterization of the HVS1 hotspots will be of use to future Neandertal mtDNA studies, with specific regard to assessing the authenticity of new positions previously unknown to be polymorphic.
The retrieval of DNA from extinct humans such as Neandertals is technically challenging because of problems associated with post mortem damage of the original DNA. The growing availability of Neandertal mitochondrial DNA (mtDNA) hypervariable (HVS) sequences (predominantly HVS1), generated with the polymerase chain reaction (PCR), provides a novel dataset with which to study miscoding lesions associated with DNA damage.
The identification of true post mortem damage-derived miscoding lesions in ancient DNA studies, and their discrimination from other PCR artifacts, has been the subject of much debate. Although the predominant cause was originally argued to be cytosine deamination, generating C→T and G→A miscoding lesions in the retrieved sequences [2, 3], a number of studies that examined additional datasets suggested that damage may also include adenine to hypoxanthine modifications, resulting in A→G and T→C miscoding lesions [4, 5]. The advent of 454/FLX sequencing technology, which allows the identification of which single DNA strand has been sequenced, has helped resolve this debate. In agreement with the original hypotheses [2, 3], it is now generally accepted that cytosine deamination is the sole cause of damage-derived miscoding lesions, observed as C→T or G→A miscoding lesions [6–9].
We have investigated the distribution of post mortem damage-derived C→T and G→A miscoding lesions in a dataset of Neandertal HVS1 cloned PCR products. To discriminate between true damage and other PCR artifacts, we took into account only those mutations that are observed as 'consistent' within the datasets, i.e., those base modifications that are observed in more than one clone within the sequences of a single PCR, but do not represent the consensus sequence as determined through the analysis of multiple independent PCRs of the region. We note that it cannot be assumed that all the C→T and G→A changes are authentic miscoding lesions, and our analysis likely overestimates the true level, as some C→T and G→A changes might be PCR-generated artifacts [9, 10].
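The 'consistent' criterion described above can be illustrated with a minimal Python sketch. It is not the procedure used to build the published datasets: the function name, the input layout (aligned clone strings of equal length grouped by PCR) and the toy sequences are all assumptions made for illustration.

```python
from collections import Counter

# Only the two damage-associated substitution types considered in the text.
DAMAGE_TYPES = {("C", "T"), ("G", "A")}

def consistent_lesions(consensus, clones_by_pcr):
    """Return the set of (offset, ref, alt) changes seen in more than one
    clone of at least one PCR. Offsets are 0-based within the alignment.

    Hypothetical helper: assumes every clone is already aligned to the
    consensus and has the same length.
    """
    found = set()
    for clones in clones_by_pcr.values():
        counts = Counter()
        for clone in clones:
            for pos, (ref, alt) in enumerate(zip(consensus, clone)):
                if alt != ref and (ref, alt) in DAMAGE_TYPES:
                    counts[(pos, ref, alt)] += 1
        # Keep only changes present in more than one clone of this PCR.
        found.update(key for key, n in counts.items() if n > 1)
    return found

# Toy data (invented): one C->T change shared by two clones of pcr1,
# plus two singleton changes that are discarded.
toy_consensus = "ACCGT"
toy_clones = {
    "pcr1": ["ATCGT", "ATCGT", "ACCAT"],
    "pcr2": ["ACCGT", "ACTGT"],
}
toy_result = consistent_lesions(toy_consensus, toy_clones)
print(toy_result)
```

Singleton changes are dropped because a change seen in a single clone cannot be distinguished from a polymerase error introduced late in the amplification.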
To exclude other potential biases that might affect the findings, the analysis was furthermore limited to Neandertal sequences that contained the complete Neandertal motif for the amplicon. In this way we were able to exclude contaminant AMH sequences, Neandertal-AMH hybrid sequences, or other artifacts that might derive from jumping-PCR/PCR recombination. As a result of these criteria, the data represent a conservative estimate of the true damage. The goal of the present study is to characterize the different DNA miscoding lesions detected in Neandertals in relation to each specific strand and to the nucleotide composition. We have also investigated whether the damage is randomly distributed along the HVS1 region, or whether there are specific nucleotide positions (sites) that exhibit above-expected levels of DNA mutations (termed here hotspots). If such miscoding lesion hotspots do exist in the Neandertal HVS1 region, then it would be useful to identify them for future Neandertal mtDNA studies, with specific regard to the authentication of new positions previously unknown to be polymorphic in Neandertals.
The cloned sequences from the HVS1 fragment of the mitochondrial DNA (mtDNA) of the seven Neandertal specimens that exist with extensive (>300 nucleotides) coverage were used in the analysis. These include: Feldhofer 1 and 2 from Germany [11, 12], Mezmaiskaya from Russia, Vindija 80 from Croatia, Monti Lessini from Italy, El Sidrón 1252 from Spain and Okladnikov from Russia.
For all datasets the statistical analyses were performed on the cloned sequences between nucleotide positions 16056–16375, with reference to the Cambridge Reference Sequence (CRS). The number of PCRs performed differs both between datasets and between positions within each dataset. To account for this bias, the frequencies of the observed mutations were weighted by the number of PCRs examined at each position, following a previously published approach. For full data see Additional files 1, 2 and 3.
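One simple way to picture such a weighting is to divide the number of consistent mutations seen at a site by the number of PCRs that covered it. The sketch below assumes exactly that; the text does not give the actual weighting formula, so the function name and the positions, counts and coverage values are hypothetical.

```python
def weighted_frequencies(lesion_counts, pcr_coverage):
    """Divide each site's count of consistent mutations by the number of
    PCRs examined at that site (an assumed, illustrative weighting).

    Sites with zero coverage are skipped because no frequency is defined.
    """
    weighted = {}
    for pos, n_pcr in pcr_coverage.items():
        if n_pcr > 0:
            weighted[pos] = lesion_counts.get(pos, 0) / n_pcr
    return weighted

# Invented example values, not taken from the Neandertal datasets.
example_counts = {16100: 2, 16200: 1}
example_coverage = {16100: 8, 16200: 2, 16300: 5}
example_weighted = weighted_frequencies(example_counts, example_coverage)
print(example_weighted)
```

Under this assumption, a site with one consistent mutation in two PCRs scores higher than a site with two in eight, which is the kind of correction that uneven PCR coverage calls for.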
Identification of hotspots
In previous studies [19, 20], post mortem damage-derived hotspots were identified through statistical comparison of the observed distribution against that predicted under a hypothesis of random distribution. This approach was not taken in this study because of limitations of the current Neandertal dataset: the frequency of multiple mutations takes only the values 0, 1 and 2, so a simple goodness-of-fit test of the observed pattern of mutations against a Poisson distribution cannot be performed. Moreover, the previous analyses do not consider the position of the mutation itself, which is needed if the hotspots themselves are to be identified. We therefore adopted an alternative statistical procedure that enabled us to identify specific sites with an above-expected mutation rate.
Table: Summary data, including the observed and expected numbers of consistent mutations over the discrete HVS1 region analysed, considering a Neandertal consensus HVS1 sequence.
Results and Discussion
Table: Consistent miscoding lesions observed among the dataset.
The fraction of the total C nucleotide positions that are observed to contain sequencing errors (63.55%) is much higher than those of A, G and T (9.09%, 6.45% and 6.94%, respectively). Of the cytosine mutations themselves, 98.5% represent C→T changes, while the only two consistent sequence modifications detected at positions containing G nucleotides are G→A changes. In light of the current understanding of DNA damage, this heavy bias towards C damage is unexpected.
Due to the complementary nature of the DNA molecule, any C→T modification on a particular DNA strand within the double helix (say the mtDNA Light strand) will be manifested after PCR amplification and sequencing either as a C→T miscoding lesion on the descendent Light strand molecules, or as the complementary G→A miscoding event on the complementary strands (in this example the mtDNA Heavy strand). In contrast, any C→T damage event on a Heavy strand molecule will lead to either a C→T modification on descendent Heavy strand molecules, or G→A mutations on descendent Light strand molecules.
As C→T mutations form the only credible source of DNA damage-derived miscoding lesions [8, 9], a consequence of this argument is as follows. If C→T DNA damage occurs with equal probability on both Heavy and Light strand template molecules, at a frequency that is only dependent on the strands' base compositions, then the damage should be manifested as both C→T and G→A sequence modifications within cloned Light strand descendent sequences, at a frequency dependent on the base composition. It is in this regard that the 7 Neandertal sequences appear striking: the observed ratio of C→T:G→A consistent sequence modifications is 67:2, a marked deviation from the approximate 3.5:1 that would be expected under the hypothesis of equal likelihood of DNA damage per different template strand (calculated as the ratio of cytosines on the Light strand:cytosines on the Heavy strand in Table 2).
The implication, therefore, is either that the Light strand molecule is subject to proportionally more damage-derived miscoding lesions than the Heavy strand molecule in the Neandertal datasets, or that the 7 Neandertal datasets, all derived using different means by different researchers in different laboratories, suffer from a common form of methodological bias or weakness.
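The size of the departure from the expected 3.5:1 ratio can be gauged with a rough binomial calculation. The sketch below is not a test reported in this study. It assumes, for illustration only, that each of the 69 consistent lesions is an independent draw with probability 1/(1 + 3.5) of being G→A, an assumption that hotspots and uneven PCR coverage may well violate.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), computed by direct summation."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

observed_ct, observed_ga = 67, 2      # counts reported in the text
expected_ratio = 3.5                  # expected C->T : G->A ratio from the text

n_lesions = observed_ct + observed_ga
p_ga = 1 / (1 + expected_ratio)       # expected share of G->A lesions (2/9)
expected_ga = n_lesions * p_ga        # about 15 G->A lesions expected

# Probability of seeing 2 or fewer G->A lesions under the assumed model.
p_value = binom_cdf(observed_ga, n_lesions, p_ga)

print(f"expected G->A lesions: {expected_ga:.1f}, observed: {observed_ga}")
print(f"P(X <= {observed_ga}) = {p_value:.2e}")
```

Under these simplifying assumptions roughly 15 G→A lesions would be expected against the 2 observed, and the tail probability is very small, in line with the description of the deviation as marked.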
In conclusion, comparing Neandertal PCR-generated sequence data with future sequence data derived from alternative, non-PCR based approaches (such as 454 pyrosequencing or SPEX methodology) could yield more reliable data for damage analysis and could help explain the bias observed here towards C→T over G→A miscoding lesions.
We are grateful to Adrian Briggs (Max Planck Institute, Leipzig) for helpful suggestions. This research has been supported by a grant (CGL2006-03987) from the Spanish Ministry of Education and Science to C.L.-F. and S.V. E.G. has a PhD fellowship from the Spanish Ministry of Education and Science.
- Pääbo S, Poinar H, Serre D, Jaenicke-Despres V, Hebler J, Rohland N, Kuch M, Krause J, Vigilant L, Hofreiter M: Genetic analyses from ancient DNA. Annu Rev Genet. 2004, 38: 645-679. 10.1146/annurev.genet.37.110801.143214.
- Pääbo S: Ancient DNA: extraction, characterization, molecular cloning, and enzymatic amplification. Proc Natl Acad Sci USA. 1989, 86 (6): 1939-1943. 10.1073/pnas.86.6.1939.
- Hofreiter M, Jaenicke V, Serre D, von Haeseler A, Pääbo S: DNA sequences from multiple amplifications reveal artifacts induced by cytosine deamination in ancient DNA. Nucleic Acids Res. 2001, 29 (23): 4793-4799. 10.1093/nar/29.23.4793.
- Gilbert MT, Willerslev E, Hansen AJ, Barnes I, Rudbeck L, Lynnerup N, Cooper A: Distribution patterns of postmortem damage in human mitochondrial DNA. Am J Hum Genet. 2003, 72: 32-47. 10.1086/345378.
- Binladen J, Wiuf C, Gilbert MTP, Bunce M, Barnett R, Larson G, Greenwood AD, Haile J, Ho SY, Hansen AJ, Willerslev E: Assessing the fidelity of ancient DNA sequences amplified from nuclear genes. Genetics. 2006, 172 (2): 733-741. 10.1534/genetics.105.049718.
- Stiller M, Green RE, Ronan M, Simons JF, Du L, He W, Egholm M, Rothberg JM, Keates SG, Ovodov ND, Antipina EE, Baryshnikov GF, Kuzmin YV, Vasilevski AA, Wuenschell GE, Termini J, Hofreiter M, Jaenicke-Després V, Pääbo S: Patterns of nucleotide misincorporations during enzymatic amplification and direct large-scale sequencing of ancient DNA. Proc Natl Acad Sci USA. 2006, 103 (40): 13578-13584. 10.1073/pnas.0605327103.
- Gilbert MT, Binladen J, Miller W, Wiuf C, Willerslev E, Poinar H, Carlson JE, Leebens-Mack JH, Schuster SC: Recharacterization of ancient DNA miscoding lesions: insights in the era of sequencing-by-synthesis. Nucleic Acids Res. 2007, 35 (1): 1-10. 10.1093/nar/gkl483.
- Briggs AW, Stenzel U, Johnson PLF, Green RE, Kelso J, Prüfer K, Meyer M, Krause J, Ronan MT, Lachmann M, Pääbo S: Patterns of damage in genomic DNA sequences from a Neandertal. Proc Natl Acad Sci USA. 2007, 104: 14616-14621. 10.1073/pnas.0704665104.
- Brotherton P, Endicott P, Sanchez JJ, Beaumont M, Barnett R, Austin J, Cooper A: Novel high-resolution characterization of ancient DNA reveals C > U-type base modification events as the sole cause of post mortem miscoding lesions. Nucleic Acids Res. 2007, 35: 5717-5728. 10.1093/nar/gkm588.
- Pääbo S, Irwin DM, Wilson AC: DNA damage promotes jumping between templates during enzymatic amplification. J Biol Chem. 1990, 265 (8): 4718-4721.
- Krings M, Stone A, Schmitz R, Krainitzki H, Stoneking M, Pääbo S: Neanderthal DNA sequences and the origin of modern humans. Cell. 1997, 90: 19-30. 10.1016/S0092-8674(00)80310-4.
- Schmitz RW, Serre D, Bonani G, Feine S, Hillgruber F, Krainitzki H, Pääbo S, Smith FH: The Neandertal type site revisited; interdisciplinary investigations of skeletal remains from the Neander Valley, Germany. Proc Natl Acad Sci USA. 2002, 99: 13342-13347. 10.1073/pnas.192464099.
- Ovchinnikov IV, Götherström A, Romanova GP, Kharitonov VM, Lidén K, Goodwin W: Molecular analysis of Neandertal DNA from the northern Caucasus. Nature. 2000, 404: 490-493. 10.1038/35006625.
- Serre D, Langaney A, Chech M, Teschler-Nicola M, Paunovic M, Mennecier P, Hofreiter M, Possnert G, Pääbo S: No evidence of Neandertal mtDNA contribution to early modern humans. PLoS Biol. 2004, 2: E57. 10.1371/journal.pbio.0020057.
- Caramelli D, Lalueza-Fox C, Condemi S, Longo L, Milani L, Manfredini A, de Saint Pierre M, Adoni F, Lari M, Giunti P, Ricci S, Casoli A, Calafell F, Mallegni F, Bertranpetit J, Stanyon R, Bertorelle G, Barbujani G: A highly divergent mtDNA sequence in a Neandertal individual from Italy. Curr Biol. 2006, 16: R630-R632. 10.1016/j.cub.2006.07.043.
- Lalueza-Fox C, Krause J, Caramelli D, Catalano G, Milani L, Sampietro ML, Calafell F, Martínez-Maza C, Bastir M, García-Tabernero A, de la Rasilla M, Fortea J, Pääbo S, Bertranpetit J, Rosas A: Mitochondrial DNA of an Iberian Neandertal suggests a population affinity with other European Neandertals. Curr Biol. 2006, 16: R629-R630. 10.1016/j.cub.2006.07.044.
- Krause J, Orlando L, Serre D, Viola B, Prüfer K, Richards MP, Hublin JJ, Hänni C, Derevianko AP, Pääbo S: Neandertals in Central Asia and Siberia. Nature. 2007, 449: 902-904. 10.1038/nature06193.
- Anderson S, Bankier AT, Barrell BG, de Bruijn MH, Coulson AR, Drouin J, Eperon IC, Nierlich DP, Roe BA, Sanger F, Schreier PH, Smith AJ, Staden R, Young IG: Sequence and organization of the human mitochondrial genome. Nature. 1981, 290: 457-465. 10.1038/290457a0.
- Gilbert MT, Hansen AJ, Willerslev E, Rudbeck L, Barnes I, Lynnerup N, Cooper A: Characterization of genetic miscoding lesions caused by postmortem damage. Am J Hum Genet. 2003, 72: 48-61. 10.1086/345379.
- Gilbert MTP, Shapiro BA, Drummond A, Cooper A: Post mortem DNA damage hotspots in Bison (Bison bison and B. bonasus) provide supporting evidence for mutational hotspots in human mitochondria. J Archaeol Sci. 2005, 32: 1053-1060. 10.1016/j.jas.2005.02.006.
- Hansen A, Willerslev E, Wiuf C, Mourier T, Arctander P: Statistical evidence for miscoding lesions in ancient DNA templates. Mol Biol Evol. 2001, 18: 262-265.
- Gilbert MTP: Post mortem damage of mitochondrial DNA. Human Mitochondrial DNA and the Evolution of Homo sapiens. Edited by: Bandelt HJ, Macaulay V, Richards M. 2006, Heidelberg: Springer-Verlag.