Validation of phylogenetic signals in amplified fragment length data: testing the utility and reliability in closely related taxa
BMC Research Notes volume 2, Article number: 26 (2009)
Discriminating taxa with the nuclear marker amplified fragment length polymorphism (AFLP) has been accomplished for various organisms in economic, ecological, and evolutionary studies. The protocol available for AFLP generation does not require prior knowledge of the genome; however, it is often extensively modified to fit the needs of the researcher. Modifying this protocol is intimidating and time-consuming for new labs, particularly for taxa in which AFLP have not been previously developed. Furthermore, what constitutes quality output during the different stages of fragment generation is not well defined, and this may further hinder the use of AFLP by new researchers.
We present a step-by-step AFLP protocol, using fluorophore-labeled primers for use with automated sequencers, including examples of both successful and unsuccessful results. We normalized peak intensity and standardized allele calling across all samples for each primer combination. Repeatability was assessed with a phylogenetic tree, built using the minimum evolution procedure, in which replicate samples clustered together. We found that differences greater than 10% in allele position among replicated samples caused replicates to no longer cluster. To minimize mismatched allele positions across replicates, we suggest that researchers analyze different primer combinations at the same time using multiple dyes with the automated sequencer.
For researchers wanting to use AFLP, this molecular technique is difficult and time-consuming to develop. Clarifying what constitutes quality output for each step in AFLP generation will help to reduce redundant trials in protocol development and, in turn, advance the discipline of population genetics.
Amplified fragment length polymorphism (AFLP) has been extensively used to investigate population genetics [1, 2], genome mapping [3, 4], and genetic structure of intra- and interspecific taxa [5–7], especially in plants, microbes, and fungi, but less often for animal taxa. This method has many benefits over other genetic techniques for addressing questions in population genetics, including low start-up cost, high repeatability, the ability to assay a large number of polymorphic loci in many individuals in a relatively short period of time, and no need for prior knowledge of the genome or sequence data [8–14]. Using the original AFLP protocol [9, 15], an investigator generates between 50 – 100 restriction fragments per primer combination, generally less than 600 base pairs (bp) long, when fragments are amplified and detected on denaturing polyacrylamide gels. Technological advancements (e.g., automated capillary sequencers and fluorophore-labeled primers) have reduced scorer bias, increased the overall number of fragments that may be scored with confidence, and promoted analyses of larger sample sizes.
There is a standard protocol available for AFLP generation that has been modified by researchers for their specific study taxa. However, modifying this protocol is intimidating for new labs and can be time-consuming for taxa in which AFLP have not been previously established. Even after successful AFLP fragment generation, it is unclear what constitutes quality AFLP output (i.e., electropherograms). Although software is available for analyzing electropherograms generated from automated capillary sequencers (e.g., GeneMarker or GeneMapper), there is a lack of clearly refined protocols to assess the quality of generated AFLP fragments. Thus, understanding and interpreting successful results can be challenging for researchers first using AFLP, and redundant trials in protocol development can hinder the advancement of research. Because of these challenges, the objectives of this study were to: (1) provide guidelines for AFLP generation with automated analyses using a protocol amenable to disparate animal groups; (2) construct procedures to normalize and standardize AFLP electropherograms; and (3) test the repeatability of AFLP samples processed with an automated capillary sequencer and analyzed with applicable software.
Snails (Elimia) and salamanders (Desmognathus) were used to evaluate the standardized AFLP protocol. Two different DNA-extraction techniques were employed to achieve high-quality, whole genomic DNA. The DNA was considered high quality if there was an optical density (OD) of 260/230 between 1.8 – 2.1 and 260/280 between 1.8 – 1.9 using a spectrophotometer (NanoDrop ND – 1000). Each sample was visually inspected on 1% sodium borate agarose gels (Figure 1a).
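The DNA quality criterion above can be expressed as a simple filter. The following is a minimal sketch in Python, assuming the two absorbance ratios have already been exported from the spectrophotometer; the function name, sample identifiers, and readings are hypothetical and are not part of the original protocol.

```python
def is_high_quality(od_260_230: float, od_260_280: float) -> bool:
    """Return True when both absorbance ratios fall within the stated ranges:
    260/230 between 1.8 and 2.1, and 260/280 between 1.8 and 1.9."""
    return 1.8 <= od_260_230 <= 2.1 and 1.8 <= od_260_280 <= 1.9


# Hypothetical readings: (sample ID, 260/230 ratio, 260/280 ratio)
readings = [
    ("snail_01", 1.95, 1.85),       # both ratios in range
    ("snail_02", 1.40, 1.85),       # 260/230 too low
    ("salamander_01", 2.00, 2.05),  # 260/280 too high
]

passed = [sid for sid, r230, r280 in readings if is_high_quality(r230, r280)]
print(passed)
```

Samples that fail this numerical check would still need the visual gel inspection described above; the filter only covers the spectrophotometric part of the criterion.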
For snails, whole genomic DNA was extracted from head tissue using a modified protocol. Each snail head was placed in a 600 μl CTAB solution (2% CTAB, 1.4 M NaCl, 20 mM EDTA, 100 mM Tris – HCl pH 8, 0.2% β-mercaptoethanol) to which 15 μl of 10 mg/ml Proteinase K was added, followed by incubation at 55°C for 3 h. Each sample was washed twice with 600 μl of chloroform:isoamyl alcohol (24:1) and allowed to precipitate overnight in cold isopropyl alcohol. Precipitated DNA was purified with 95% ethanol and a final 70% ethanol wash. Samples contained a strong band of whole genomic DNA, but also contained degraded DNA and RNA that negatively affect the quality of downstream reactions. Thus, all samples were further purified using a QIAEX II Gel Extraction Kit (Qiagen, Valencia, CA). This procedure yielded between 5 – 75 ng of high-quality DNA.
For salamanders, whole genomic DNA was extracted from approximately 5 mm of tail tissue from each individual using a DNeasy Kit protocol for animal tissues (Qiagen, Valencia, CA). This protocol yielded between 5 – 100 ng of high-quality, whole genomic DNA.
Amplified Fragment Length Polymorphism and Primer Screening Protocol
All enzymes and restriction buffers were obtained from New England Biolabs, unless otherwise noted. Digestion reactions in 20 μl volumes were performed on whole genomic DNA with a concentration ranging from 10 – 70 ng/μl using the following cocktail per reaction: 2 μl Eco RI 10× restriction buffer, 1.0 μl Eco RI (20,000 U/ml), 0.2 μl Mse I (10,000 U/ml; in snails, 0.8 μl Mse I was used), and 12 μl H2O. Following a 5 h incubation period at 37°C (in snails, 3 h), we ensured adequate digestion using a 1% sodium borate agarose gel (Figure 1b). Before ligation, double-stranded adapter pairs (10 mM) were constructed from the complementary single-stranded oligonucleotides (Table 1). These adapter pairs were joined by combining 250 μl of each adapter, heating the solution at 95°C for 5 min, and then cooling it to 25°C (Table 2). Ligation proceeded by adding a 20 μl mixture consisting of 12 μl H2O, 4 μl 10× T4 DNA ligase buffer, 1.5 μl of the Eco RI and Mse I adapter pairs (75 pmols), and 1.0 μl of T4 DNA ligase to each digestion solution. These samples were incubated for 10 h (in snails, for 12 h) at 16°C. Following ligation, each sample was diluted with 160 μl H2O.
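When many samples are digested at once, the per-reaction digestion cocktail is typically scaled into a master mix. The sketch below, in Python, scales the per-reaction volumes listed above; the function name and the optional pipetting overage are assumptions made for illustration and are not stated in the protocol.

```python
# Per-reaction digestion cocktail (μl), taken from the volumes listed in the text.
SALAMANDER_COCKTAIL = {
    "EcoRI 10x buffer": 2.0,
    "EcoRI": 1.0,
    "MseI": 0.2,
    "H2O": 12.0,
}
# For snails the text specifies 0.8 μl Mse I; other volumes are unchanged.
SNAIL_COCKTAIL = {**SALAMANDER_COCKTAIL, "MseI": 0.8}


def master_mix(cocktail, n_reactions, overage=0.0):
    """Scale per-reaction volumes (μl) to n_reactions.

    `overage` is a hypothetical fractional allowance for pipetting loss
    (e.g. 0.10 for 10%); it is not part of the published protocol.
    """
    factor = n_reactions * (1.0 + overage)
    return {reagent: round(vol * factor, 2) for reagent, vol in cocktail.items()}


mix = master_mix(SNAIL_COCKTAIL, n_reactions=8)
for reagent, volume in mix.items():
    print(f"{reagent}: {volume} μl")
```

The DNA template is left out of the master mix because it differs per sample and its volume is not given explicitly in the text.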
We conducted a preselective primer screen to determine the efficacy of primer pairs. In the standardized protocol, only one base is added during preselective PCR. We tested Eco RI + NN, where N represents the number of additional base pairs attached to the core sequence (i.e., NN = two additional base pairs; Table 1), and Mse I + NN primer combinations (n = 256) by checking for high-quality bands on a 1% sodium borate agarose gel, since this was the most cost-effective means for determining primer efficacy. Thus, we used this two-base extension for preselective PCR. Using the chosen primer combinations, we completed preselective PCR using the cocktail in Table 2 and the cycling conditions in Table 3. Preselective solutions were diluted with 125 μl and 160 μl of H2O in salamanders and snails, respectively. A 1% sodium borate agarose gel was used to visually check each sample (Figure 1c).
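The figure of n = 256 follows from pairing every two-base extension on the Eco RI primer (4 × 4 = 16) with every two-base extension on the Mse I primer (16). A minimal Python sketch that enumerates the screen is shown below; the primer labels are illustrative names only, and the actual core sequences are those in Table 1.

```python
from itertools import product

BASES = "ACGT"


def two_base_extensions():
    """All 16 possible two-nucleotide selective extensions (AA, AC, ..., TT)."""
    return ["".join(pair) for pair in product(BASES, repeat=2)]


# Illustrative labels; the real primers carry the core sequences from Table 1.
eco_primers = [f"EcoRI+{ext}" for ext in two_base_extensions()]
mse_primers = [f"MseI+{ext}" for ext in two_base_extensions()]

# Every Eco RI + NN primer paired with every Mse I + NN primer.
combinations = list(product(eco_primers, mse_primers))
print(len(combinations))  # 256
```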
We ran a subsequent selective primer screen to determine which Eco RI + NNN and Mse I + NN combinations would generate the highest quality bands on a 1% sodium borate agarose gel. When labeling Eco RI with a fluorophore (i.e., 6-FAM) for selective PCR, the label should always be attached to the 5' end and to any nucleotide except guanine because of the effects of guanine quenching. Using the chosen primer combinations, we completed selective PCR using the cocktail in Table 4 and the cycling conditions in Table 5. A 1% sodium borate agarose gel was used to visually check each sample (Figure 1d). In general, if samples were visible on the 1% sodium borate agarose gel, they were too strong for the autosequencer; thus, all samples were diluted between 25 – 50% with doubly-distilled water.
Diluted selective amplification products were purified using fine G-50 Sephadex (Sigma-Aldrich Corp.). We loaded 1.5 μl of purified product per sample along with 0.5 μl GeneScan-500 ROX ladder (PerkinElmer, Inc.) into 96-well plates. Samples were analyzed using an automated ABI 3100 DNA sequencer (Applied Biosystems) and electropherograms were imported into GeneMarker v. 1.6 (SoftGenetics, LLC.) for analyses.
Using GeneMarker v. 1.6, the intensity of peaks in the raw data was normalized, without application of the size standard, by generating a template with the AFLP signal processed between 1200 and 11,500 relative fluorescence units (rfu). For the raw data analysis, the Local Southern size-call algorithm, peak saturation, baseline subtraction, pull-up correction, and spike removal correction were selected (Figure 2). Following normalization, allele calling was performed with application of the size standard.
For allele-called data, only electropherograms in which the size standard used in the analysis matched a theoretical standard by 90% or greater were included for further analysis. A peak was considered an allele if its intensity was between 100 – 8000 rfu and the fragment was longer than 60 bp, which was the shortest fragment length at which clearly defined peaks appeared (Figure 3). However, other researchers have suggested that fragments should only be considered alleles with minimum lengths of 75 bp and 125 bp due to an increased probability of homoplasy. Allele-called data were standardized across individuals for each primer combination by creating a unique standardizing panel (i.e., panel editor in GeneMarker v. 1.6). To generate a standardizing panel, we chose 10 individuals from discrete populations that exhibited the greatest polymorphism (i.e., greatest number and largest range of peaks). Allele positions within templates were further standardized by setting the range around an allele as ± 0.4 bp (i.e., bin size in GeneMarker v. 1.6). Thus, two fragments that fell within this range would be considered one fragment. Creation of a standardizing template for each primer combination ensured that peak position (a peak is equivalent to an allele) was precise for all electropherograms.
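The allele criteria and the ± 0.4 bp bin can be illustrated outside GeneMarker. The following Python sketch is not GeneMarker's algorithm; it assumes peaks are available as (size in bp, height in rfu) pairs and that a standardizing panel is a list of bin centres. The panel positions and peak values are hypothetical, and treating the 60 bp limit as inclusive is also an assumption.

```python
def call_alleles(peaks, panel, min_rfu=100, max_rfu=8000, min_bp=60, tol=0.4):
    """Return the sorted panel positions at which a qualifying peak was found.

    peaks: iterable of (size_bp, height_rfu) tuples for one electropherogram.
    panel: list of bin-centre positions (bp) from the standardizing panel.
    A peak qualifies if its height lies within [min_rfu, max_rfu], its size is
    at least min_bp, and it falls within +/- tol bp of a panel position.
    """
    called = set()
    for size_bp, height in peaks:
        if not (min_rfu <= height <= max_rfu) or size_bp < min_bp:
            continue
        nearest = min(panel, key=lambda pos: abs(pos - size_bp))
        if abs(nearest - size_bp) <= tol:
            called.add(nearest)  # two peaks in the same bin count as one allele
    return sorted(called)


# Hypothetical panel and peak list for a single sample.
panel = [75.0, 102.5, 148.0]
peaks = [
    (75.3, 950),    # within 0.4 bp of 75.0
    (74.8, 400),    # same bin as above, so still one allele
    (102.6, 1200),  # within 0.4 bp of 102.5
    (148.2, 9000),  # above 8000 rfu (saturated), rejected
    (55.0, 500),    # shorter than 60 bp, rejected
    (148.9, 3000),  # 0.9 bp from the nearest bin, rejected
]
print(call_alleles(peaks, panel))
```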
After normalization, a binary matrix was generated for each primer combination. An allele was denoted as "1" if the peak intensity was greater than 100 rfu and the fragment occurred between 60 and 350 bp. An allele was considered as "0" if these conditions were not met. Any questionable alleles were manually checked and scored accordingly. All matrices were combined to form one binary matrix for further analyses.
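The scoring rule above maps directly onto a presence/absence matrix. Below is a minimal Python sketch, assuming each sample's binned peaks are held in a dictionary keyed by bin position (bp) with peak height (rfu) as the value; this data layout, the function names, and the example values are hypothetical, and manual checking of questionable alleles is not modelled.

```python
def score_sample(peaks_by_bin, bins, min_rfu=100, min_bp=60, max_bp=350):
    """Score one sample: 1 if the bin holds a peak above min_rfu and the bin
    lies between min_bp and max_bp, otherwise 0."""
    row = []
    for position in bins:
        height = peaks_by_bin.get(position, 0)
        present = height > min_rfu and min_bp <= position <= max_bp
        row.append(1 if present else 0)
    return row


def build_matrix(samples, bins):
    """Binary matrix for one primer combination: {sample ID: [0/1, ...]}."""
    return {sid: score_sample(peaks, bins) for sid, peaks in samples.items()}


def combine(matrices):
    """Concatenate per-primer-combination matrices column-wise.
    Assumes every matrix contains the same sample IDs."""
    combined = {}
    for matrix in matrices:
        for sid, row in matrix.items():
            combined.setdefault(sid, []).extend(row)
    return combined


# Hypothetical data for two primer combinations.
bins_a = [75.0, 102.5, 360.0]
samples_a = {"s1": {75.0: 950, 102.5: 80, 360.0: 2000}, "s2": {102.5: 1500}}
bins_b = [88.0, 200.0]
samples_b = {"s1": {200.0: 500}, "s2": {88.0: 300, 200.0: 100}}

full = combine([build_matrix(samples_a, bins_a), build_matrix(samples_b, bins_b)])
print(full)
```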
To test repeatability of replicated samples within and among 96-well plates, PAUP* v. 4.0b10 was used to generate phylogenetic hypotheses using the minimum-evolution procedure (total character and Nei-Li distance options were selected). For each primer combination, replicated samples were placed within a binary matrix with an equal number of non-replicated samples. Repeatability was achieved if replicate samples were most closely related to each other.
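The idea behind this repeatability test can be sketched without PAUP*. The Python example below is a simplification, not a reproduction of the analysis: it uses a band-sharing (Dice-type) distance, 1 − 2n_ab/(n_a + n_b), as a stand-in for the Nei-Li option, and replaces the minimum-evolution tree with a nearest-neighbour check. Both substitutions, the function names, and the example profiles are assumptions made for illustration.

```python
def band_sharing_distance(a, b):
    """1 - 2*shared/(bands_a + bands_b) for two binary (0/1) profiles."""
    shared = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    total = sum(a) + sum(b)
    if total == 0:
        return 0.0  # two empty profiles are treated as identical
    return 1.0 - 2.0 * shared / total


def nearest_neighbor(sample, profiles):
    """ID of the sample with the smallest distance to `sample`."""
    others = [sid for sid in profiles if sid != sample]
    return min(
        others,
        key=lambda sid: band_sharing_distance(profiles[sample], profiles[sid]),
    )


def replicates_pair_up(profiles, replicate_pairs):
    """True if each replicate is the other's closest sample, for every pair."""
    return all(
        nearest_neighbor(a, profiles) == b and nearest_neighbor(b, profiles) == a
        for a, b in replicate_pairs
    )


# Hypothetical binary profiles: one replicated sample plus two other samples.
profiles = {
    "A_rep1": [1, 1, 0, 1, 0, 1],
    "A_rep2": [1, 1, 0, 1, 0, 0],
    "B": [0, 0, 1, 0, 1, 1],
    "C": [1, 0, 1, 0, 1, 0],
}
print(replicates_pair_up(profiles, [("A_rep1", "A_rep2")]))
```

A nearest-neighbour check is weaker than a tree-based test, since it ignores how the remaining samples are arranged, but it captures the criterion that replicates should be most closely related to each other.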
Results and discussion
We generated AFLP fragments for two unrelated taxa using a modified AFLP protocol of Vos et al. and Berres et al. Our data support findings in the literature that the quality of DNA is more important than the initial concentration, as we successfully recovered alleles from samples that varied in DNA concentration from 5 to 70 ng/μl. It is necessary to visually check (i.e., on a gel) the quality of the DNA to ensure that minimal degradation has occurred if samples are stored at -20°C for more than a few weeks in a 0.5 ml PCR tube.
During various stages of AFLP generation, DNA concentration may be too high and, if left undiluted, the automated sequencer will become saturated. This will result in peaks that are squared-off at the apex and do not reach the baseline (Figure 4d). In order to prevent saturation, gel images are essential to determine the level of dilution, if any, for each sample; we found that the dilution amount varied between our taxa. After preselective PCR, salamander samples were diluted at one part PCR product to three parts water, and snail samples were diluted at one part PCR product to five parts water. For other unrelated taxa, researchers may need to further modify concentrations to achieve optimal datasets as shown in Figures 2 and 3.
Some samples meeting all methodological requirements still yielded poor quality electropherograms and were excluded from analyses (Figures 4 and 5). Generally, we found that normalization of raw data using techniques such as pull-up correction, baseline subtraction, peak saturation, smoothing, spike removal, and Local Southern size calling was sufficient to normalize peak intensity across all electropherograms. If normalization is not successful, peaks will extend either below the size standard (Figure 4a) or above the saturation point of 8000 rfu (Figure 4d). If this occurs, even after implementation of the normalization procedure, the researcher must repeat PCR stages for any failed samples.
After raw data were successfully normalized (Figure 2), electropherograms were standardized using the panel editor. This allowed for precise calling of alleles across all samples for each primer combination (Figure 5). Generally, most samples conformed to successful allele calling; however, it is essential that each sample is manually checked to ensure alleles are properly called. Regardless of the quality of the electropherogram, there will be peaks that cannot be confidently called by the software (denoted by a '?' in GeneMarker v. 1.6). In this case, each questionable peak must be manually checked and subsequently scored.
Repeatability is essential for the construction of phylogenetic hypotheses using AFLP fragments. The use of a few replicated samples from a 96-well plate allows for cost-effective and reliable tests of repeatability. We were able to generate lineages that contain replicate samples using the minimum evolution procedure. Nodes containing only individual replicate samples were formed in every case for all primer combinations (results not shown). In addition, we calculated a threshold value of 10%, defined as the percentage of allelic differences between replicate samples beyond which replicates no longer formed distinct lineages. This threshold value is markedly greater than expected. Although not used in this study, researchers may want to consider the use of the multiple dyes feature available with automated capillary systems. Because this methodology allows for the simultaneous analysis of several primer combinations, it is a cost-effective, efficient means to generate a large number of AFLP fragments while reducing scoring errors that may be associated with batch effects.
Nuclear markers generated by the cost-effective AFLP technique are broadly used in genetic studies across many taxa regardless of the size or complexity of the genome. We presented a modified protocol that generates AFLP fragments in snails and salamanders, with options for refining this technique to suit the needs of most researchers. Refinement of existing AFLP protocols was essential to facilitate and encourage the broader use of AFLP fragments in genetic studies of animal taxa. Our protocol provides a starting point for researchers to use AFLP in studies of natural animal populations regardless of their taxonomic status and genomic complexity.
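The percentage of allelic differences between two replicates can be computed directly from their binary profiles. The Python sketch below is one possible reading of that definition, assuming the denominator is the total number of scored allele positions; that choice, the function name, and the example profiles are assumptions, since the text does not spell out the calculation.

```python
THRESHOLD_PERCENT = 10.0  # threshold value reported in the text


def percent_mismatch(rep1, rep2):
    """Percentage of scored positions at which two binary profiles differ."""
    if len(rep1) != len(rep2):
        raise ValueError("replicate profiles must have the same length")
    if not rep1:
        raise ValueError("profiles must contain at least one scored position")
    differences = sum(1 for x, y in zip(rep1, rep2) if x != y)
    return 100.0 * differences / len(rep1)


# Hypothetical replicate profiles scored at 20 allele positions.
rep_a = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 1, 0, 1]
rep_b = list(rep_a)
rep_b[3] = 0                 # one mismatched position
rep_c = list(rep_a)
for i in (0, 5, 12):         # three mismatched positions
    rep_c[i] = 1 - rep_c[i]

for label, rep in (("rep_b", rep_b), ("rep_c", rep_c)):
    pct = percent_mismatch(rep_a, rep)
    flag = "exceeds" if pct > THRESHOLD_PERCENT else "within"
    print(f"{label}: {pct:.1f}% mismatch ({flag} the 10% threshold)")
```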
Hardy OJ: Estimation of pairwise relatedness between individuals and characterization of isolation-by-distance processes using dominant genetic markers. Mol Ecol. 2003, 12: 1577-1588. 10.1046/j.1365-294X.2003.01835.x.
Mendelson TC, Simons JN: AFLPs resolve cytonuclear discordance and increase resolution among barcheek darters (Percidae: Etheostoma: Catonotus). Mol Phylogenet Evol. 2006, 41: 445-453. 10.1016/j.ympev.2006.05.010.
Knorr C, Cheng HH, Dodgson JB: Application of AFLP markers to genome mapping in poultry. Anim Genet. 1999, 30: 28-35. 10.1046/j.1365-2052.1999.00411.x.
Hoarau JY, Offmann B, D'Hont A, Risterucci AM, Roques D, Glaszmann JC, Grivet L: Genetic dissection of modern sugarcane cultivar (Saccharum spp.). I. Genome mapping with AFLP markers. Theor Appl Genet. 2001, 103: 84-97. 10.1007/s001220000390.
Zhivotovsky LA: Estimating population structure in diploids with multilocus dominant DNA markers. Mol Ecol. 1999, 8: 907-913. 10.1046/j.1365-294x.1999.00620.x.
Carisio L, Cervella P, Palestrini C, DelPero M, Rolando A: Biogeographical patterns of genetic differentiation in dung beetles of the genus Trypocopris (Coleoptera, Geotrupidae) inferred from mtDNA and AFLP analysis. J Biogeogr. 2004, 31: 1149-1162. 10.1111/j.1365-2699.2004.01074.x.
Albach DC, Schönswetter P, Tribsch A: Comparative phylogeography of the Veronica alpine complex in Europe and North America. Mol Ecol. 2006, 15: 3269-3286. 10.1111/j.1365-294X.2006.02980.x.
Bensch S, Åkesson M: Ten years of AFLP in ecology and evolution: why so few animals?. Mol Ecol. 2005, 14: 2899-2914. 10.1111/j.1365-294X.2005.02655.x.
Vos P, Hogers R, Bleeker M, Reijans M, Lee van de T, Hornes M, Frijters A, Pot J, Peleman J, Kuiper M, Zabeau M: AFLP: a new technique for DNA fingerprinting. Nucleic Acids Res. 1995, 23: 4407-4414. 10.1093/nar/23.21.4407.
Ogden R, Thorpe RS: The usefulness of amplified fragment length polymorphism markers for taxon discrimination across graduated fine evolutionary levels in Caribbean Anolis lizards. Mol Ecol. 2002, 11: 437-445. 10.1046/j.0962-1083.2001.01442.x.
Blears MJ, De Grandis SA, Lee H, Trevors JT: Amplified fragment length polymorphism (AFLP): a review of the procedure and its applications. J of Industrial Microbiology and Biotechnology. 1998, 21: 99-114. 10.1038/sj.jim.2900537.
Mueller UG, Wolfenbarger LL: AFLP genotyping and fingerprinting. Trends in Ecology and Evolution. 1999, 14: 389-394. 10.1016/S0169-5347(99)01659-6.
Meudt HM, Clarke AC: Almost forgotten or latest practice? AFLP applications, analyses and advances. Trends in Plant Science. 2007, 12: 106-117. 10.1016/j.tplants.2007.02.001.
Falush D, Stephens M, Pritchard JK: Inference of population structure using multilocus genotype data: linked loci and correlated allele frequencies. Genetics. 2003, 164: 1567-1587.
Berres ME, Engels WR, Kirsch JAW: A method for genotyping ostensibly dominant markers in AFLP fingerprints. Genetics. in review
Bonin A, Bellemain E, Eidesen Bronken P, Pompanon F, Brockmann C, Taberlet P: How to track and assess genotyping errors in population genetics studies. Mol Ecol. 2004, 13: 3261-3273. 10.1111/j.1365-294X.2004.02346.x.
Winnepenninckx B, Backeljau T, Dewachter R: Extraction of high-molecular-weight DNA from mollusks. Trends Genet. 1993, 9: 407. 10.1016/0168-9525(93)90102-N.
Behlke MA, Huang L, Bogh L, Rose S, Devor EJ: Fluorescence and Fluorescence Applications. Integrated DNA Technologies. 2005, 1-13. [http://www.idtdna.com]
Vekemans X, Beauwens T, Lemaire M, Roldan-Ruiz I: Data from amplified fragment length polymorphism (AFLP) markers show indication of size homoplasy and of a relationship between degree of homoplasy and fragment size. Mol Ecol. 2002, 11: 139-151. 10.1046/j.0962-1083.2001.01415.x.
Althoff DM, Gitzendanner MA, Segraves KA: The utility of amplified fragment length polymorphisms in phylogenetics: A comparison of homology within and between genomes. Syst Biol. 56: 477-484. 10.1080/10635150701427077.
Swofford D: PAUP*: Phylogenetic Analysis Using Parsimony (*and Other Methods). Version 4.0b10. Sinauer Associates, Sunderland, Massachusetts, [http://www.sinauer.com/detail.php?id=8060]
Nei M, Li WH: Mathematical model for studying genetic variation in terms of restriction endonucleases. Proc Natl Acad Sci USA. 1979, 76: 5269-5273. 10.1073/pnas.76.10.5269.
We dedicate this work and article to Wally Holznagel, retired director of the Johnson Molecular Systematics Laboratory at The University of Alabama, Tuscaloosa, Alabama. His continuous support and many helpful discussions and ideas were used in the formulation of the protocol and construction of this manuscript. We extend our sincere appreciation to the three anonymous reviewers who provided very helpful comments that greatly improved this manuscript. We thank Leslie Rissler for funding and for many helpful discussions and support during this research. Kevin LeVan, Jonathan Liu, and David Hulce from GeneMarker® were instrumental in the analysis by providing technical advice and support. All electropherograms were generated at the DNA sequencing facility at the University of Pennsylvania, and we give special thanks to Erik Toorens for his timely attention to our samples and questions. Additional thanks to Phil Harris, Carlos Camp, and Robert Makowsky for their support, advice, and discussions about previous versions of the manuscript. All salamander research was approved by the Institutional Animal Care and Use Committee (IACUC) protocol number 05-242-3 to LJR at The University of Alabama. This research was funded by: American Museum of Natural History grant awarded to JAW, Conservation Grant from the North American Benthological Society awarded to LTJ, NSF DEB 0414033 awarded to LJR, NSF IGERT (DGE-9972810) awarded to Amelia Ward, and The University of Alabama.
The authors declare that they have no competing interests.
JAW and LTJ contributed equally to all parts of methodological development and manuscript preparation. Both authors read and approved the final manuscript.
Wooten, J.A., Tolley-Jordan, L.R. Validation of phylogenetic signals in amplified fragment length data: testing the utility and reliability in closely related taxa. BMC Res Notes 2, 26 (2009). https://doi.org/10.1186/1756-0500-2-26