An enhanced method for sequence walking and paralog mining: TOPO® Vector-Ligation PCR
© Davis et al; licensee BioMed Central Ltd. 2010
Received: 10 November 2009
Accepted: 4 March 2010
Published: 4 March 2010
Although technological advances allow for the economical acquisition of whole genome sequences, many organisms' genomes remain unsequenced, and fully sequenced genomes may contain gaps. Researchers reliant upon partial genomic or heterologous sequence information require methods for obtaining unknown sequences from loci of interest. Various PCR-based techniques are available for sequence walking, i.e., the acquisition of unknown DNA sequence adjacent to known sequence. Many such methods require rigid, elaborate protocols and/or narrowly restrict the choice of restriction enzymes for the necessary genomic digests. We describe a new method, TOPO® Vector-Ligation PCR (TVL-PCR), that integrates available tools and familiar concepts to offer advantages as a means of both targeted sequence walking and paralog mining.
TVL-PCR exploits the ligation efficiency of the pCR®4-TOPO® (Invitrogen, Carlsbad, California) vector system to capture fragments of unknown sequence by creating chimeric molecules containing defined priming sites at both ends. Initially, restriction enzyme-digested genomic DNA is end-repaired to create 3' adenosine overhangs and is then ligated to pCR4-TOPO vectors. The ligation product pool is used directly as a template for nested PCR, using specific primers to target orthologous sequences, or degenerate primers to enable capture of paralogous gene family members. We demonstrated the efficacy of this method by capturing entire coding and partial promoter sequences of several strawberry Superman-like genes.
TVL-PCR is a convenient and efficient method for DNA sequence walking and paralog mining that is applicable to any organism for which relevant DNA sequence is available as a basis for primer design.
Efforts to obtain desired gene and promoter sequences often rely on exploitation of fragmentary genomic or cDNA sequence information available from homologous or heterologous sources as a basis for PCR primer design. Various PCR-based sequence walking techniques have been developed for acquiring previously unknown genomic sequence flanking a known site [1–5]. All such techniques share a common strategy: creation of a distal priming site or sites for use in conjunction with priming sites in known genomic sequence. However, these techniques vary in how the distal priming site is created.
We describe TVL-PCR and explain how we used it to obtain full-length sequence and partial promoter sequence of several strawberry Superman-like genes. We also successfully used TVL-PCR for paralog mining (results not shown), i.e., the amplification of multiple members of a gene family using degenerate primers targeted to conserved sequences.
DNA was isolated from unexpanded leaf tissue of Fragaria virginiana accession L2 (CFRA 1995) as described, except that no chloroform:octanol solution was included in the microfuge tube to which CTAB slurry was transferred. Per reaction, 400 ng of genomic DNA was digested with 20 U of EcoRI, BamHI, or HindIII (New England Biolabs, Ipswich, Massachusetts) in a 40 μl reaction that was incubated overnight at 37°C. The restriction enzymes used must produce either recessed 3' ends (i.e., 5' overhangs) or blunt ends. Digestion was verified by electrophoresis of 100 ng digested genomic DNA and an undigested comparator on a 1% agarose TBE gel.
End-repair employed 20 μl (200 ng) of each digested DNA sample with 1 μl of 10 mM dNTP mix, 1.3 μl sterile water, 2.5 μl of EconoTaq® buffer (Lucigen, Middleton, Wisconsin) and 0.2 μl (1 U) EconoTaq DNA polymerase. Reactions were incubated at 72°C for 30 minutes to fill in recessed 3' ends of cut sites, add a 3' adenosine overhang, and heat-inactivate the restriction enzyme.
End-repaired DNA was ligated to the pCR4-TOPO vector (Invitrogen) using 4.5 μl (36 ng) end-repaired DNA solution, with 0.5 μl (5 ng) of TOPO vector and 1 μl of supplied salt solution. The reaction was gently mixed and incubated at room temperature for one hour.
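The DNA masses quoted in the digestion, end-repair, and ligation steps follow from one another by simple dilution arithmetic. The sketch below retraces that bookkeeping using only the volumes and masses stated in the protocol; the function and variable names are illustrative and not part of the published method.

```python
# Bookkeeping sketch: all volumes and masses are taken from the protocol text;
# the helper and variable names are illustrative only.

def concentration(mass_ng: float, volume_ul: float) -> float:
    """Return a DNA concentration in ng/ul."""
    return mass_ng / volume_ul

# Digestion: 400 ng genomic DNA in a 40 ul reaction.
digest_conc = concentration(400.0, 40.0)

# End-repair: 20 ul of digest plus dNTP mix, water, buffer, and polymerase.
end_repair_volume = 20.0 + 1.0 + 1.3 + 2.5 + 0.2
end_repair_mass = digest_conc * 20.0
end_repair_conc = concentration(end_repair_mass, end_repair_volume)

# Ligation: 4.5 ul end-repaired DNA, 0.5 ul vector, 1 ul salt solution.
ligation_dna_mass = end_repair_conc * 4.5
ligation_volume = 4.5 + 0.5 + 1.0

print(f"digest concentration: {digest_conc:.1f} ng/ul")
print(f"end-repair input:     {end_repair_mass:.0f} ng in {end_repair_volume:.1f} ul")
print(f"ligation input:       {ligation_dna_mass:.0f} ng in {ligation_volume:.1f} ul")
```

Under these figures, the 20 μl aliquot carries 200 ng into a 25 μl end-repair reaction, and 4.5 μl of that reaction carries 36 ng into the ligation, matching the amounts given in the text.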
The design of the gene-specific primers relied on alignments of Superman-like sequences from four heterologous sources (Petunia, Nicotiana, Arabidopsis, and Malus) together with strawberry Superman-like sequences obtained via conventional PCR amplification using degenerate primers targeted to sites conserved among the four heterologous sequences (results not shown). Two strategies were employed. For paralog mining of new Superman-like genes, degenerate primers were targeted to genomic sites identified as conserved among the heterologous and strawberry sequences. For sequence-specific walking to extend initially obtained gene segments, sequence-specific primers were targeted to genomic sites that were not conserved among the aligned heterologous and strawberry sequences. Genomic primers were always sited at least 50 bases upstream of the transition point from known to unknown sequence, to provide a sufficient read of known sequence in the resulting TVL-PCR product to confirm sequence identity and continuity.
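The 50-base placement rule can be checked mechanically when choosing candidate primer sites. The following is a minimal sketch, assuming that the distance is measured from the 3' end of the primer to the known/unknown boundary (the text does not say which end of the primer is the reference point) and that coordinates are 0-based on the strand being extended. The function names are hypothetical.

```python
MIN_KNOWN_READ = 50  # bases of known sequence required, per the text


def known_bases_after_primer(known_len: int, primer_start: int, primer_len: int) -> int:
    """Count known bases between the primer's 3' end and the boundary.

    Assumption: the known sequence occupies positions 0..known_len-1 on the
    strand being extended, and unknown sequence begins at position known_len.
    """
    primer_end = primer_start + primer_len
    if primer_start < 0 or primer_len <= 0 or primer_end > known_len:
        raise ValueError("primer must lie entirely within the known sequence")
    return known_len - primer_end


def primer_site_ok(known_len: int, primer_start: int, primer_len: int) -> bool:
    """True if the site leaves at least MIN_KNOWN_READ known bases to read."""
    return known_bases_after_primer(known_len, primer_start, primer_len) >= MIN_KNOWN_READ


# Illustrative values only (not from the paper): a 300-base known segment.
print(primer_site_ok(300, 200, 22))  # 78 known bases remain
print(primer_site_ok(300, 250, 22))  # only 28 known bases remain
```

For walking in the opposite direction, the same check applies to the reverse complement of the known segment.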
The 25 μl first-round TVL-PCR reactions contained 3 μl ligation reaction template, 0.5 μl of 20 μM G1 genomic primer, 0.5 μl vector primer, 0.1 μl (0.5 U) AccuPrime™ Taq DNA Polymerase High Fidelity (Invitrogen), and 2.5 μl 10× AccuPrime PCR buffer II. The first-round product (10 μl) was visualized on a 2% agarose 1% TBE gel run at 4.1 V/cm for 80 minutes. The second round of TVL-PCR was performed with 1 μl of first-round product as the template, and 0.5 μl of 20 μM stock of both the nested G2 genomic primer and the appropriate vector primer. If the first-round vector primer was M13F or T3, then T3 was used as the second-round vector primer. If the first-round vector primer was M13R or T7, then T7 was used as the second-round vector primer.
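The working primer concentration implied by these volumes can be derived with a one-line dilution calculation. The sketch below uses the stock concentration and volumes stated for the first-round reaction. The helper name is illustrative, and the total volume of the second-round reaction is not stated in the text, so it is left as a parameter instead of being assumed.

```python
def final_concentration_um(stock_um: float, added_ul: float, reaction_ul: float) -> float:
    """Final concentration (uM) after diluting a stock into a reaction."""
    if reaction_ul <= 0 or added_ul < 0 or added_ul > reaction_ul:
        raise ValueError("volumes must satisfy 0 <= added_ul <= reaction_ul")
    return stock_um * added_ul / reaction_ul


# First round: 0.5 ul of 20 uM G1 genomic primer in a 25 ul reaction.
g1_conc = final_concentration_um(20.0, 0.5, 25.0)
print(f"G1 primer: {g1_conc:.2f} uM")
```

With the stated volumes, the G1 genomic primer works out to 0.4 μM in the first-round reaction.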
The amplification products from each second round TVL-PCR reaction were cloned using the Invitrogen TOPO TA Cloning® kit for sequencing, per the manufacturer's instructions. Transformed cells were plated on LB agar plates containing 50 μg/ml ampicillin and were grown at 37°C overnight. Colony PCR was performed using M13 primers provided with the TOPO cloning kit to confirm insert size. Plasmids were isolated from clones of interest using the Promega Wizard®Plus SV Minipreps DNA Purification System (Promega, Madison, WI). The plasmid inserts were then sequenced bidirectionally on an Applied Biosystems 3130 Genetic Analyzer (Applied Biosystems, Foster City, CA), using M13 forward and M13 reverse sequencing primers. The T3 and T7 sequencing primers cannot be used for this purpose, because their priming sites may be represented twice in the final TOPO clone: once in the cloning vector itself and once in the cloned product of second round TVL-PCR (Figure 2).
Use of degenerate primers targeted to genomic sites conserved among heterologous Superman sequences resulted in acquisition of multiple candidate clones from strawberry accession L2 (results not shown), including clone 143-2-1. The sequence of clone 143-2-1 provided the basis for design of site-specific primers used for sequence walking in accession L2, as described below.
Primers listed in the original table (sequence and Tm (°C) * values not recoverable): 14321 5' F1; 14321 5' F2; 14321 3' F1; 14321 3' F2.
For walking in the 3' direction, the first round of TVL-PCR employed G1 primer 14321 3'F1 paired only with the T3 vector primer. The product from this reaction (Figure 3B - top) was used as template for the second round of TVL-PCR using the nested G2 primer 14321 3'F2 in conjunction with the T3 vector primer. Shotgun cloning of the second round TVL-PCR products (Figure 3B - bottom), followed by colony PCR and sequencing of two appropriately sized products (> 500 bp), yielded one product that contained the targeted sequence. The respective PCR product size corresponded to the boxed gel band (Figure 3B - bottom). This product was slightly less than 2 kb in length, and extended well into the 3' UTR of the targeted gene.
Overall, a total of five clones were sequenced from shotgun cloned and size-selected TVL-PCR products, of which one provided targeted sequence extension in the 5' direction and another in the 3' direction. The remaining three sequenced clones were not the targeted sequence, but displayed a vector primer sequence at one end and a respective genomic primer sequence at the other, indicating that each arose from non-specific priming by the genomic primer.
Using methods similar to those described above, TVL-PCR was also used successfully to extend the sequence of Superman-like clone 7266 upstream far enough to surpass the putative start codon (results not shown). In addition, the use of degenerate primers resulted in the acquisition of 5' and 3' sequences from three additional Superman family sequences in strawberry. All strawberry Superman-like gene sequences obtained by chromosome walking via TVL-PCR have been deposited in GenBank under accession numbers GU830924 (7266 5' walk), GU830926 (14321 5' walk), and GU830925 (14321 3' walk).
TVL-PCR adds a useful new option to the toolkit of methodological choices for sequence walking. By using the pCR4-TOPO vector as a linker and thereby exploiting T/A ligation, TVL-PCR expands the spectrum of restriction enzymes employable for digestion of genomic DNA, because it eliminates the dependence on corresponding restriction sites in the vector linker that is encountered in Single Specific Primer PCR and Rapid Amplification of Genomic DNA Ends.
Beneficially, methods that employ T/A ligation, including TVL-PCR and T-Linker PCR, preclude the possibility of genomic and/or vector fragment re-ligation to each other, or self-ligation (i.e., circularization). Because genomic DNA fragments with 3' adenosine overhangs can only ligate to a molecule having a 3' thymidine overhang, chimeric ligation constructs composed of multiple genomic fragments are also precluded.
With ligation-dependent methods, duplicate linker ligation at both ends of a genomic fragment would create opportunity for "single primer amplification" via linker priming at both ends. The TVL-PCR procedure minimizes this problem by maintaining a high genomic DNA to vector ratio, which limits the number of genomic fragments that acquire ligated vector molecules at both ends. We used 36 ng of genomic DNA in the ligation reaction, which after complete digestion with a six-base restriction enzyme should yield approximately 8.2 billion fragments. At a concentration of 10 ng/μl, 0.5 μl of TOPO vector yields approximately 1.2 billion molecules, giving a ratio of roughly seven genomic DNA fragments to one vector molecule. In various applications of the TVL-PCR technique, we have yet to encounter a product primed by a vector primer at both ends.
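The fragment-to-vector ratio can be reproduced with a short mass-to-molecule conversion. The sketch below is an approximation under stated assumptions that are not given in the text: an average mass of 650 g/mol per base pair of double-stranded DNA, an average fragment length of 4^6 = 4096 bp for a six-base cutter, and a pCR4-TOPO vector length of 3956 bp.

```python
AVOGADRO = 6.022e23          # molecules per mole
BP_MASS_G_PER_MOL = 650.0    # assumed average mass of one base pair of dsDNA
VECTOR_LENGTH_BP = 3956      # assumed length of pCR4-TOPO; check the vector manual


def molecule_count(mass_ng: float, length_bp: float) -> float:
    """Approximate number of dsDNA molecules of a given length in a given mass."""
    moles = (mass_ng * 1e-9) / (length_bp * BP_MASS_G_PER_MOL)
    return moles * AVOGADRO


# A six-base recognition site is expected once every 4**6 bp in random sequence.
mean_fragment_bp = 4 ** 6

genomic_fragments = molecule_count(36.0, mean_fragment_bp)            # 36 ng genomic DNA
vector_molecules = molecule_count(10.0 * 0.5, VECTOR_LENGTH_BP)       # 0.5 ul at 10 ng/ul
ratio = genomic_fragments / vector_molecules

print(f"genomic fragments: {genomic_fragments:.2e}")
print(f"vector molecules:  {vector_molecules:.2e}")
print(f"fragments per vector molecule: {ratio:.1f}")
```

Under these assumptions the estimate comes out at roughly eight billion genomic fragments and slightly over one billion vector molecules, or about seven fragments per vector molecule, in line with the figures quoted above. Real genomes deviate from the random-sequence expectation, so the fragment count is an order-of-magnitude guide only.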
TVL-PCR is an efficient and effective method for genomic sequence walking. It offers significant advantages, allowing choice among a large selection of restriction enzymes, and requiring only small amounts of template DNA. Using TVL-PCR, we have isolated entire coding and partial promoter sequences of several Superman-like genes from strawberry.
This work was supported in part by New Hampshire Agricultural Experiment Station Project NH00433. This is Scientific Contribution 2387 from the New Hampshire Agricultural Experiment Station (NHAES). We gratefully acknowledge the editorial assistance provided by Melanie E. Shields.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.