
Validation of phylogenetic signals in amplified fragment length data: testing the utility and reliability in closely related taxa

Abstract

Background

Amplified fragment length polymorphism (AFLP), a nuclear marker, has been used to discriminate taxa in economic, ecological, and evolutionary studies of various organisms. The protocol available for AFLP generation does not require prior knowledge of the genome; however, it is often extensively modified to fit the needs of the researcher. Modifying this protocol is intimidating and time-consuming for new labs, particularly for taxa in which AFLP have not been previously developed. Furthermore, what constitutes quality output during the different stages of fragment generation is not well defined, which may further hinder the use of AFLP by new researchers.

Findings

We present a step-by-step AFLP protocol, using fluorophore-labeled primers for use with automated sequencers, including examples of both successful and unsuccessful results. We normalized peak intensity and standardized allele calling across all samples for each primer combination. Repeatability was assessed with a phylogenetic tree in which replicate samples clustered together using the minimum evolution procedure. We found that differences greater than 10% in allele position among replicated samples caused replicates to no longer cluster. To minimize offset allele positions and mismatched alleles across replicates, we suggest that researchers analyze different primer combinations at the same time using multiple dyes with the automated sequencer.

Conclusion

For researchers wanting to use AFLP, this molecular technique is difficult and time-consuming to develop. Clarifying what constitutes quality output for each step in AFLP generation will help to reduce redundant trials in protocol development and, in turn, advance the discipline of population genetics.

Background

Amplified fragment length polymorphism (AFLP) has been extensively used to investigate population genetics [1, 2], genome mapping [3, 4], and genetic structure of intra- and interspecific taxa [5–7], especially in plants, microbes, and fungi, but less often for animal taxa [8]. This method has many benefits over other genetic techniques for addressing questions in population genetics, including low start-up cost, high repeatability, the ability to assay a large number of polymorphic loci in many individuals in a relatively short period of time, and no need for prior knowledge of the genome or sequence data [8–14]. Using the original AFLP protocol [9, 15], an investigator generates between 50 and 100 restriction fragments per primer combination, generally shorter than 600 base pairs (bp), when fragments are amplified and detected on denaturing polyacrylamide gels. Technological advancements (e.g., automated capillary sequencers and fluorophore-labeled primers) have led to reduced scorer bias, increased the overall number of fragments that may be scored with confidence, and promoted analyses of larger sample sizes [16].

There is a standard protocol available for AFLP generation [9] that has been modified by researchers for their specific study taxa [8]. However, modification of this protocol [9] for new labs is intimidating and can be time-consuming to develop for taxa in which AFLP have not been previously established. Even after successful AFLP fragment generation, determining what constitutes quality AFLP output (i.e., electropherograms) is unclear. Although software is available for analyzing electropherograms generated from automated capillary sequencers (e.g., GeneMapper or GeneMarker), there is a lack of clearly refined protocols to assess the quality of generated AFLP fragments. Thus, understanding and interpretation of successful results can be challenging to researchers first using AFLP and redundant trials in protocol development can hinder advancement of research. Because of these challenges, the objectives of this study were to: (1) provide guidelines for AFLP generation with automated analyses using a protocol amenable for disparate animal groups; (2) construct procedures to normalize and standardize AFLP electropherograms; and (3) test the repeatability of AFLP samples processed with an automated capillary sequencer and analyzed with applicable software.

Study Animals

Snails (Elimia) and salamanders (Desmognathus) were used to evaluate the standardized AFLP protocol. Two different DNA-extraction techniques were employed to obtain high-quality, whole genomic DNA. The DNA was considered high quality if the optical density (OD) ratio 260/230 was between 1.8 and 2.1 and the 260/280 ratio was between 1.8 and 1.9, as measured with a spectrophotometer (NanoDrop ND – 1000). Each sample was also visually inspected on 1% sodium borate agarose gels (Figure 1a).
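The two absorbance criteria above can be expressed as a simple screening function. This is a minimal sketch, assuming the stated ranges are inclusive at both ends; the function name and the sample readings are hypothetical and are not data from this study.

```python
def is_high_quality(od_260_230: float, od_260_280: float) -> bool:
    """Return True when both OD ratios fall in the ranges given in the text.

    Assumption: both ranges are treated as inclusive.
    """
    return 1.8 <= od_260_230 <= 2.1 and 1.8 <= od_260_280 <= 1.9


# Hypothetical spectrophotometer readings: (260/230, 260/280).
readings = {
    "sample_A": (2.0, 1.85),  # both ratios in range
    "sample_B": (1.5, 1.85),  # 260/230 too low
    "sample_C": (1.9, 2.0),   # 260/280 too high
}

passed = {name: is_high_quality(*ratios) for name, ratios in readings.items()}
print(passed)
```

A sample that passes this check would still need the visual gel inspection described above, since the ratios alone do not reveal degradation.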

Figure 1

Step-by-step protocol for AFLP generation. Schematic representation of each step in the AFLP protocol represented on 1% sodium borate agarose gels. All gel images were generated from undiluted DNA solutions. Each gel image contains lanes that represent acceptable products as indicated by a unique symbol. The 100 bp ladder is denoted by # in all gel images.

For snails, whole genomic DNA was extracted from head tissue using a modified protocol [17]. Each snail head was placed in a 600 μl CTAB solution (2% CTAB, 1.4 M NaCl, 20 mM EDTA, 100 mM Tris – HCl pH 8, 0.2% β-mercaptoethanol) to which 15 μl of 10 mg/ml Proteinase K was added, followed by incubation at 55°C for 3 h. Each sample was washed twice with 600 μl of chloroform:isoamyl alcohol (24:1) and allowed to precipitate overnight in cold isopropyl alcohol. Precipitated DNA was purified with 95% ethanol and a final 70% ethanol wash. Samples contained a strong band of whole genomic DNA, but also contained degraded DNA and RNA that negatively affect the quality of downstream reactions [15]. Thus, all samples were further purified using a QIAEX II Gel Extraction Kit (Qiagen, Valencia, CA). This procedure yielded between 5 – 75 ng of high-quality DNA.

For salamanders, whole genomic DNA was extracted from approximately 5 mm of tail tissue from each individual using a DNeasy Kit protocol for animal tissues (Qiagen, Valencia, CA). This protocol yielded between 5 – 100 ng of high-quality, whole genomic DNA.

Amplified Fragment Length Polymorphism and Primer Screening Protocol

All enzymes and restriction buffers were obtained from New England Biolabs, unless otherwise noted. Digestion reactions in 20 μl volumes were performed on whole genomic DNA with a concentration ranging from 10 – 70 ng/μl using the following cocktail per reaction: 2 μl Eco RI 10× restriction buffer, 1.0 μl Eco RI (20,000 U/ml), 0.2 μl Mse I (10,000 U/ml; in snails, 0.8 μl Mse I was used), and 12 μl H2O. Following a 5 h incubation period at 37°C (in snails, 3 h), we ensured adequate digestion using a 1% sodium borate agarose gel (Figure 1b). Before ligation, double-stranded adapter pairs (10 mM) were constructed from the complementary single-stranded oligonucleotides (Table 1). These adapter pairs were joined by combining 250 μl of each adapter, heating the solution at 95°C for 5 min, and then cooling it to 25°C (Table 2). Ligation proceeded by adding a 20 μl mixture consisting of 12 μl H2O, 4 μl 10× T4 DNA ligase buffer, 1.5 μl of the Eco RI and Mse I adapter pairs (75 pmols), and 1.0 μl of T4 DNA ligase to each digestion solution. These samples were incubated for 10 h (in snails, for 12 h) at 16°C. Following ligation, each sample was diluted with 160 μl H2O.

Table 1 Adapter and primer sequences used in AFLP.
Table 2 Cocktails used for preselective PCR in AFLP.

We conducted a preselective primer screen to determine the efficacy of primer pairs. In the standardized protocol [9], only one base is added during preselective PCR. We tested Eco RI + NN, where each N represents an additional base attached to the core sequence (i.e., NN = two additional bases; Table 1), and Mse I + NN primer combinations (n = 256) by checking for high-quality bands on a 1% sodium borate agarose gel, since this was the most cost-effective means for determining primer efficacy. Thus, we used this two-base extension for preselective PCR. Using the chosen primer combinations, we completed preselective PCR using the cocktail in Table 2 and the cycling conditions in Table 3. Preselective solutions were diluted with 125 μl and 160 μl of H2O in salamanders and snails, respectively. A 1% sodium borate agarose gel was used to visually check each sample (Figure 1c).
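The figure of 256 combinations is what results from pairing every possible two-base extension on one primer with every possible two-base extension on the other (16 × 16). The short sketch below enumerates them under that assumption; the labels such as "EcoRI+AC" are illustrative only and do not reproduce the primer sequences in Table 1.

```python
from itertools import product

BASES = "ACGT"

# All two-base selective extensions: AA, AC, ..., TT (4 * 4 = 16).
extensions = ["".join(pair) for pair in product(BASES, repeat=2)]

# Pair every Eco RI + NN primer with every Mse I + NN primer.
# The string labels are hypothetical names used for illustration.
combinations = [
    (f"EcoRI+{eco}", f"MseI+{mse}")
    for eco, mse in product(extensions, repeat=2)
]

print(len(extensions))    # number of extensions per primer
print(len(combinations))  # number of primer pairs to screen
```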

Table 3 Thermocycler conditions for preselective PCR.

We ran a subsequent selective primer screen to determine which Eco RI + NNN and Mse I + NN combinations would generate the highest-quality bands on a 1% sodium borate agarose gel. When labeling Eco RI with a fluorophore (i.e., 6-FAM) for selective PCR, the label should always be attached to the 5' end and to any nucleotide except guanine because of the effects of guanine quenching [18]. Using the chosen primer combinations, we completed selective PCR using the cocktail in Table 4 and the cycling conditions in Table 5. A 1% sodium borate agarose gel was used to visually check each sample (Figure 1d). In general, if samples were visible on the 1% sodium borate agarose gel, they were too strong for the autosequencer; thus, all samples were diluted between 25 – 50% with doubly-distilled water.

Table 4 Cocktails used for selective PCR in AFLP.
Table 5 Thermocycler conditions for selective PCR.

Diluted selective amplification products were purified using fine G-50 Sephadex (Sigma-Aldrich Corp.). We loaded 1.5 μl of purified product per sample along with 0.5 μl GeneScan-500 ROX ladder (PerkinElmer, Inc.) into 96-well plates. Samples were analyzed using an automated ABI 3100 DNA sequencer (Applied Biosystems) and electropherograms were imported into GeneMarker v. 1.6 (SoftGenetics, LLC.) for analyses.

Fragment Analysis

Using GeneMarker v. 1.6, the intensity of peaks in the raw data was normalized, without application of the size standard, by generating a template with the AFLP signal processed between 1200 and 11,500 relative fluorescence units (rfu). For the raw data analysis, the Local Southern size-call algorithm, peak saturation, baseline subtraction, pull-up correction, and spike removal correction were selected (Figure 2). Following normalization, allele calling was performed with application of the size standard.

Figure 2

An example of a raw AFLP data electropherogram. The y-axis is intensity of the peak measured in relative fluorescence units (rfu) and the x-axis is clicks of the ABI 3100 detector in frames. Blue peaks are the AFLP fragment data and the red peaks are the ROX size standard.

For allele-called data, only electropherograms in which the size standard used in the analysis matched a theoretical standard by 90% or greater were included for further analysis. A peak was considered an allele if its intensity was between 100 – 8000 rfu and the fragment was longer than 60 bp, which was the shortest fragment length at which clearly defined peaks appeared (Figure 3). However, other researchers have suggested that fragments should only be considered alleles with minimum lengths of 75 bp [19] or 125 bp [20] due to an increased probability of homoplasy. Allele-called data were standardized across individuals for each primer combination by creating a unique standardizing panel (i.e., panel editor in GeneMarker v. 1.6). To generate a standardizing panel, we chose 10 individuals from discrete populations that exhibited the greatest polymorphism (i.e., greatest number and largest range of peaks). Allele positions within templates were further standardized by setting the range around an allele as ± 0.4 bp (i.e., bin size in GeneMarker v. 1.6). Thus, two fragments that fell within this range would be considered one fragment. Creation of a standardizing template for each primer combination ensured that peak position (a peak is equivalent to an allele) was precise for all electropherograms.
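The effect of the ± 0.4 bp bin can be illustrated with a small sketch that assigns a called fragment size to the nearest panel position if it lies within the bin. This is not GeneMarker's algorithm, which is not described here; the function name, the panel positions, and the fragment sizes are all hypothetical.

```python
from typing import Optional, Sequence

BIN_HALF_WIDTH_BP = 0.4  # range around an allele position, from the text


def assign_bin(
    size_bp: float,
    panel: Sequence[float],
    half_width: float = BIN_HALF_WIDTH_BP,
) -> Optional[float]:
    """Return the nearest panel position within half_width of size_bp.

    Returns None when the panel is empty or no position is close enough.
    """
    if not panel:
        return None
    nearest = min(panel, key=lambda center: abs(center - size_bp))
    return nearest if abs(nearest - size_bp) <= half_width else None


# Hypothetical panel positions (bp) for one primer combination.
panel = [100.0, 102.0, 150.0]

# Two fragments inside the same bin are treated as the same allele.
print(assign_bin(100.3, panel), assign_bin(99.8, panel))
# A fragment outside every bin is not assigned to any allele.
print(assign_bin(100.9, panel))
```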

Figure 3

An example of allele-call data. The y-axis is intensity of the peak measured in relative fluorescence units (rfu) and the x-axis is the number of base pairs (bp). The gray lines indicate which peaks were scored as present (1) in GeneMarker v. 1.6. Blue peaks are the AFLP data and the red peaks are the ROX size standard.

After normalization, a binary matrix was generated for each primer combination. An allele was denoted as "1" if the peak intensity was greater than 100 rfu and the fragment occurred between 60 and 350 bp. An allele was considered as "0" if these conditions were not met. Any questionable alleles were manually checked and scored accordingly. All matrices were combined to form one binary matrix for further analyses.
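The scoring rule above can be sketched as follows: a panel position is scored 1 when a sample has a peak above 100 rfu, between 60 and 350 bp, at that position, and 0 otherwise. The sketch assumes the ± 0.4 bp bin described earlier in this section and inclusive size limits; the function name and all peak values are hypothetical, and manual checking of questionable alleles is not modeled.

```python
from typing import Dict, List, Sequence, Tuple

Peak = Tuple[float, float]  # (fragment size in bp, peak intensity in rfu)


def score_sample(
    peaks: Sequence[Peak],
    panel: Sequence[float],
    min_rfu: float = 100.0,
    min_bp: float = 60.0,
    max_bp: float = 350.0,
    half_width: float = 0.4,
) -> List[int]:
    """Score each panel position as 1 (allele present) or 0 (absent)."""
    row = []
    for center in panel:
        present = any(
            rfu > min_rfu
            and min_bp <= size <= max_bp
            and abs(size - center) <= half_width
            for size, rfu in peaks
        )
        row.append(1 if present else 0)
    return row


# Hypothetical panel and peak lists for two samples.
panel = [75.0, 120.0, 300.0]
samples: Dict[str, List[Peak]] = {
    "ind_1": [(75.1, 500.0), (120.2, 90.0), (299.8, 2500.0), (400.0, 3000.0)],
    "ind_2": [(120.1, 800.0)],
}

matrix = {name: score_sample(peaks, panel) for name, peaks in samples.items()}
print(matrix)
```

Matrices built this way for each primer combination could then be concatenated column-wise per individual to form the single combined matrix.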

Repeatability

To test repeatability of replicated samples within and among 96-well plates, PAUP* v. 4.0b10 [21] was used to generate phylogenetic hypotheses using the minimum-evolution procedure (total character and Nei-Li [22] distance options were selected). For each primer combination, replicated samples were placed within a binary matrix with an equal number of non-replicated samples. Repeatability was achieved if replicate samples were most closely related to each other.
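The repeatability criterion can be illustrated with a simplified check on pairwise distances. The sketch uses the band-sharing coefficient of Nei and Li [22], F = 2n_xy / (n_x + n_y), and takes 1 − F as a dissimilarity; it then asks whether each replicate's nearest neighbor is its partner. This is an assumption-laden stand-in: it does not build a minimum-evolution tree, and the distance computed by PAUP* may involve a further transformation that is not reproduced here. All profiles are hypothetical.

```python
from typing import Dict, Sequence


def nei_li_dissimilarity(x: Sequence[int], y: Sequence[int]) -> float:
    """Return 1 - F, where F = 2 * shared bands / (bands in x + bands in y)."""
    shared = sum(1 for a, b in zip(x, y) if a == 1 and b == 1)
    total = sum(x) + sum(y)
    if total == 0:
        return 0.0  # two empty profiles are treated as identical
    return 1.0 - 2.0 * shared / total


def nearest_neighbor(name: str, profiles: Dict[str, Sequence[int]]) -> str:
    """Return the name of the profile closest to `name`."""
    others = [other for other in profiles if other != name]
    return min(
        others,
        key=lambda other: nei_li_dissimilarity(profiles[name], profiles[other]),
    )


# Hypothetical binary profiles: two replicates of one individual plus
# two non-replicated individuals.
profiles = {
    "rep1_a": [1, 1, 0, 1, 0, 1],
    "rep1_b": [1, 1, 0, 1, 0, 0],
    "ind_2": [0, 1, 1, 0, 1, 1],
    "ind_3": [1, 0, 1, 0, 1, 0],
}

print(nearest_neighbor("rep1_a", profiles))
print(nearest_neighbor("rep1_b", profiles))
```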

Results and discussion

We generated AFLP fragments for two unrelated taxa using a modified AFLP protocol of Vos et al. [9] and Berres et al. [15]. Our data support findings in the literature that the quality of DNA is more important than the initial concentration [11], as we successfully recovered alleles from samples that varied in DNA concentration from 5 to 70 ng/μl. It is necessary to visually check (i.e., on a gel) the quality of the DNA to ensure that minimal degradation has occurred if samples are stored at -20°C for more than a few weeks in a 0.5 ml PCR tube.

During various stages of AFLP generation, DNA concentration may be too high and, if left undiluted, the automated sequencer will become saturated. This will result in peaks that are squared-off at the apex and do not reach the baseline (Figure 4d). To prevent saturation, gel images are essential for determining the level of dilution, if any, for each sample; we found that the dilution amount varied between our taxa. After preselective PCR, salamander samples required dilution to one part PCR product to three parts water, whereas snail samples required one part PCR product to five parts water. For other unrelated taxa, researchers may need to further modify concentrations to achieve optimal datasets as shown in Figures 2 and 3.

Figure 4

Examples of poor quality raw AFLP electropherograms. In most cases, the only solution is to run PCR again. A. An example of the AFLP electropherogram in which the AFLP data (as indicated by the blue line) is too low and is located below the size standard (the red line). The peak intensity is too low and does not permit confident scoring. B. In this example, the peak intensity forms a hill and should not be scored. C. During this run, as can be seen by the electropherogram, the analysis stopped working around 6000 frames. D. The peak intensity is too high and has saturated the ABI 3100. The saturation point for the ABI 3100 is 8000 relative fluorescence units (rfu). The peaks that are squared off at the 8000 rfu point cannot be confidently scored.

Some samples meeting all methodological requirements still yielded poor quality electropherograms and were excluded from analyses (Figures 4 and 5). Generally, we found that normalization of raw data using techniques such as pull-up correction, baseline subtraction, peak saturation, smoothing, spike removal, and Local Southern size call was sufficient to normalize peak intensity across all electropherograms. If normalization is not successful, peaks will extend either below the size standard (Figure 4a) or above the saturation point of 8000 rfu (Figure 4d). If this occurs, even after implementation of the normalization procedure, the researcher must repeat PCR stages for any failed samples.

Figure 5

Examples of poor quality allele call AFLP electropherograms. In most cases, the only solution is to run PCR again. A. Some alleles identified as peaks can be called in this example. However, many alleles will go undetected because of the structure of the electropherogram. B. The ABI 3100 was saturated in the beginning of the run and the remaining portion of the run is not complete. C. The reaction stopped working and many alleles will not be automatically called by the allele-calling software. D. The peak intensity is too low for many alleles to be called as present.

After raw data were successfully normalized (Figure 2), electropherograms were standardized using the panel editor. This allowed for precise calling of alleles across all samples for each primer combination (Figure 5). Generally, most samples conformed to successful allele calling; however, it is essential that each sample is manually checked to ensure alleles are properly called. Regardless of the quality of the electropherogram, there will be peaks that cannot be confidently called by the software (denoted by a '?' in GeneMarker v. 1.6). In this case, each questionable peak must be manually checked and subsequently scored.

Repeatability is essential for the construction of phylogenetic hypotheses using AFLP fragments [8]. The use of a few replicated samples from a 96-well plate allows for cost-effective and reliable tests of repeatability. We were able to generate lineages that contain replicate samples using the minimum evolution procedure. Nodes containing only individual replicate samples were formed in every case for all primer combinations (results not shown). In addition, we calculated a threshold value of 10%, defined as the percentage of allelic differences between replicate samples above which replicates no longer formed distinct lineages. This threshold value is markedly greater than expected. Although not used in this study, researchers may want to consider the use of the multiple dyes feature available with automated capillary systems. Because this methodology allows for the simultaneous analysis of several primer combinations, it is a cost-effective, efficient means to generate a large number of AFLP fragments while reducing scoring errors that may be associated with batch effects.
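The threshold can be applied to a pair of replicate profiles as sketched below. The sketch assumes that the percentage is computed over all scored positions in the binary matrix, which is one plausible reading of the definition above; the function name and the replicate profiles are hypothetical.

```python
from typing import Sequence

THRESHOLD_PERCENT = 10.0  # threshold value reported in the text


def percent_allelic_difference(rep_a: Sequence[int], rep_b: Sequence[int]) -> float:
    """Percentage of scored positions at which two replicate profiles differ.

    Assumption: the denominator is the total number of scored positions.
    """
    if len(rep_a) != len(rep_b):
        raise ValueError("replicate profiles must have the same length")
    if not rep_a:
        raise ValueError("replicate profiles must not be empty")
    differences = sum(1 for a, b in zip(rep_a, rep_b) if a != b)
    return 100.0 * differences / len(rep_a)


# Hypothetical replicate profiles scored at 20 positions.
original = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 1, 0, 1]
good_rep = original.copy()
good_rep[3] = 0  # one mismatch
poor_rep = original.copy()
for position in (0, 5, 12):  # three mismatches
    poor_rep[position] = 1 - poor_rep[position]

for label, replicate in (("good_rep", good_rep), ("poor_rep", poor_rep)):
    pct = percent_allelic_difference(original, replicate)
    print(label, pct, "exceeds threshold" if pct > THRESHOLD_PERCENT else "within threshold")
```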

Conclusion

Nuclear markers generated by the cost-effective AFLP technique are broadly used in genetic studies across many taxa regardless of the size or complexity of the genome. We presented a modified protocol that generates AFLP fragments in snails and salamanders, with options for refining this technique to suit the needs of most researchers. Refinement of existing AFLP protocols was essential to facilitate and encourage the broader use of AFLP fragments in genetic studies of animal taxa. Our protocol provides a starting point for researchers to use AFLP in studies of natural animal populations, regardless of their taxonomic status and genomic complexity.

References

  1. Hardy OJ: Estimation of pairwise relatedness between individuals and characterization of isolation-by-distance processes using dominant genetic markers. Mol Ecol. 2003, 12: 1577-1588. 10.1046/j.1365-294X.2003.01835.x.


  2. Mendelson TC, Simons JN: AFLPs resolve cytonuclear discordance and increase resolution among barcheek darters (Percidae: Etheostoma: Catonotus). Mol Phylogenet Evol. 2006, 41: 445-453. 10.1016/j.ympev.2006.05.010.


  3. Knorr C, Cheng HH, Dodgson JB: Application of AFLP markers to genome mapping in poultry. Anim Genet. 1999, 30: 28-35. 10.1046/j.1365-2052.1999.00411.x.


  4. Hoarau JY, Offmann B, D'Hont A, Risterucci AM, Roques D, Glaszmann JC, Grivet L: Genetic dissection of modern sugarcane cultivar (Saccharum spp.). I. Genome mapping with AFLP markers. Theor Appl Genet. 2001, 103: 84-97. 10.1007/s001220000390.

  5. Zhivotovsky LA: Estimating population structure in diploids with multilocus dominant DNA markers. Mol Ecol. 1999, 8: 907-913. 10.1046/j.1365-294x.1999.00620.x.


  6. Carisio L, Cervella P, Palestrini C, DelPero M, Rolando A: Biogeographical patterns of genetic differentiation in dung beetles of the genus Trypocopris (Coleoptera, Geotrupidae) inferred from mtDNA and AFLP analysis. J Biogeogr. 2004, 31: 1149-1162. 10.1111/j.1365-2699.2004.01074.x.


  7. Albach DC, Schönswetter P, Tribsch A: Comparative phylogeography of the Veronica alpine complex in Europe and North America. Mol Ecol. 2006, 15: 3269-3286. 10.1111/j.1365-294X.2006.02980.x.


  8. Bensch S, Åkesson M: Ten years of AFLP in ecology and evolution: why so few animals?. Mol Ecol. 2005, 14: 2899-2914. 10.1111/j.1365-294X.2005.02655.x.

  9. Vos P, Hogers R, Bleeker M, Reijans M, Lee van de T, Hornes M, Frijters A, Pot J, Peleman J, Kuiper M, Zabeau M: AFLP: a new technique for DNA fingerprinting. Nucleic Acids Res. 1995, 23: 4407-4414. 10.1093/nar/23.21.4407.


  10. Ogden R, Thorpe RS: The usefulness of amplified fragment length polymorphism markers for taxon discrimination across graduated fine evolutionary levels in Caribbean Anolis lizards. Mol Ecol. 2002, 11: 437-445. 10.1046/j.0962-1083.2001.01442.x.


  11. Blears MJ, De Grandis SA, Lee H, Trevors JT: Amplified fragment length polymorphism (AFLP): a review of the procedure and its applications. J of Industrial Microbiology and Biotechnology. 1998, 21: 99-114. 10.1038/sj.jim.2900537.


  12. Mueller UG, Wolfenbarger LL: AFLP genotyping and fingerprinting. Trends in Ecology and Evolution. 1999, 14: 389-394. 10.1016/S0169-5347(99)01659-6.


  13. Meudt HM, Clarke AC: Almost forgotten or latest practice? AFLP applications, analyses and advances. Trends in Plant Science. 2007, 12: 106-117. 10.1016/j.tplants.2007.02.001.

  14. Falush D, Stephens M, Pritchard JK: Inference of population structure using multilocus genotype data: linked loci and correlated allele frequencies. Genetics. 2003, 164: 1567-1587.


  15. Berres ME, Engels WR, Kirsch JAW: A method for genotyping ostensibly dominant markers in AFLP fingerprints. Genetics. in review

  16. Bonin A, Bellemain E, Eidesen Bronken P, Pompanon F, Brockmann C, Taberlet P: How to track and assess genotyping errors in population genetics studies. Mol Ecol. 2004, 13: 3261-3273. 10.1111/j.1365-294X.2004.02346.x.


  17. Winnepenninckx B, Backeljau T, Dewachter R: Extraction of high-molecular-weight DNA from mollusks. Trends Genet. 1993, 9: 407. 10.1016/0168-9525(93)90102-N.

  18. Behlke MA, Huang L, Bogh L, Rose S, Devor EJ: Fluorescence and Fluorescence Applications. Integrated DNA Technologies. 2005, 1-13. [http://www.idtdna.com]


  19. Vekemans X, Beauwens T, Lemaire M, Roldan-Ruiz I: Data from amplified fragment length polymorphism (AFLP) markers show indication of size homoplasy and of a relationship between degree of homoplasy and fragment size. Mol Ecol. 2002, 11: 139-151. 10.1046/j.0962-1083.2001.01415.x.


  20. Althoff DM, Gitzendanner MA, Segraves KA: The utility of amplified fragment length polymorphisms in phylogenetics: A comparison of homology within and between genomes. Syst Biol. 56: 477-484. 10.1080/10635150701427077.

  21. Swofford D: PAUP*: Phylogenetic Analysis Using Parsimony (*and Other Methods). Version 4.0b10. Sinauer Associates, Sunderland, Massachusetts, [http://www.sinauer.com/detail.php?id=8060]

  22. Nei M, Li WH: Mathematical model for studying genetic variation in terms of restriction endonucleases. Proc Natl Acad Sci USA. 1979, 76: 5269-5273. 10.1073/pnas.76.10.5269.



Acknowledgements

We dedicate this work and article to Wally Holznagel, retired director of the Johnson Molecular Systematics Laboratory at The University of Alabama, Tuscaloosa, Alabama. His continuous support and many helpful discussions and ideas were used in the formulation of the protocol and construction of this manuscript. We extend our sincere appreciation to the three anonymous reviewers who provided very helpful comments that greatly improved this manuscript. We thank Leslie Rissler for funding and for many helpful discussions and support during this research. Kevin LeVan, Jonathan Liu, and David Hulce from GeneMarker® were instrumental in the analysis by providing technical advice and support. All electropherograms were generated at the DNA sequencing facility at the University of Pennsylvania; special thanks go to Erik Toorens for his timely attention to our samples and questions. Additional thanks go to Phil Harris, Carlos Camp, and Robert Makowsky for their support, advice, and discussions about previous versions of the manuscript. All salamander research was approved by the Institutional Animal Care and Use Committee (IACUC) protocol number 05-242-3 to LJR at The University of Alabama. This research was funded by: American Museum of Natural History grant awarded to JAW, Conservation Grant from the North American Benthological Society awarded to LTJ, NSF DEB 0414033 awarded to LJR, NSF IGERT (DGE-9972810) awarded to Amelia Ward, and The University of Alabama.

Author information


Corresponding author

Correspondence to Jessica A Wooten.

Additional information

Competing interests

The authors declare that they have no competing interests.

Authors' contributions

JAW and LTJ contributed equally to all parts of methodological development and manuscript preparation. Both authors read and approved the final manuscript.


Rights and permissions

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


About this article

Cite this article

Wooten, J.A., Tolley-Jordan, L.R. Validation of phylogenetic signals in amplified fragment length data: testing the utility and reliability in closely related taxa. BMC Res Notes 2, 26 (2009). https://doi.org/10.1186/1756-0500-2-26
