Structure-based design and construction of a synthetic phage display nanobody library
BMC Research Notes volume 15, Article number: 124 (2022)
To design and construct a new synthetic nanobody library using a structure-based approach that seeks to maintain high protein stability and increase the number of functional variants within the combinatorial space of mutations.
Synthetic nanobody (Nb) libraries are emerging as an attractive alternative to animal immunization for the selection of stable, high affinity Nbs. Two key features define a synthetic Nb library: framework selection and CDR design. We selected the universal VHH framework from the cAbBCII10 Nb. CDR1 and CDR2 were designed with the same fixed length as in cAbBCII10, while for CDR3 we chose a 14 amino acid-long loop, which creates a convex binding site topology. Based on the analysis of the cAbBCII10 crystal structure, we carefully selected the positions to be randomized and tailored the codon usage at each position, keeping at particular positions amino acids that guarantee stability, favoring properties like polarity at solvent-exposed positions and avoiding destabilizing amino acids. Gene synthesis and library construction were carried out by GenScript, using our own phagemid vector. The constructed library has an estimated size of 1.75 × 10^8. NGS showed that the amino acid diversity and frequency at each randomized position are those expected from the codon usage.
Nanobodies (Nbs) are single domain antibody fragments derived from heavy-chain antibodies found in members of the Camelidae family. They have several important advantages, such as their small size, high solubility and stability. Remarkably, although their monomeric binding region is composed of only three hypervariable loops or CDRs (complementarity determining regions), Nbs can achieve affinities in the nanomolar and subnanomolar range, just as classical antibodies do. Their modularity allows the easy generation of multivalent constructs for many different applications.
Antigen-specific Nbs are obtained mostly from immune libraries, whose construction requires animal immunization. In recent years, however, synthetic libraries have emerged as an attractive alternative for the selection of stable, high affinity Nbs, offering important advantages in terms of cost and speed, as recently reviewed by our group. Two key features define a synthetic Nb library: framework selection and CDR design. A few recent works have relied on both sequence and structural data to define the CDR positions to be randomized, as well as the sets of amino acids (aa) to be introduced at those positions [3,4,5,6,7,8,9].
Here we describe the design and construction of a new synthetic nanobody library with a 14 amino acid-long CDR3, in which the three hypervariable loops were subjected to position-specific randomization schemes. The design follows a structure-based approach that seeks to maintain the high stability shown by the original framework-donor nanobody and to increase the number of functional variants within the combinatorial space of mutations. To this end, we analyzed the structural role played by individual residues in defining CDR conformation or exposing their side chains for antigen binding.
Library design and construction
The synthetic library should be built on a scaffold shown to be highly stable, in particular under reducing conditions (as in the cytoplasm). The framework should also be capable of accepting many different CDRs while keeping high stability. Since current algorithms to predict protein stability still have important limitations, the safest option is to choose a camelid heavy-chain variable domain (VHH) framework already proven to possess the desired characteristics. We selected the “universal VHH framework” from the cAbBCII10 nanobody, which has been used previously as a scaffold for the construction of Nb libraries [12,13,14]. Not only is its sequence available, but so are a few crystal structures containing the CDR-grafted, original and humanized versions of its framework (PDB entries 1ZMY, 3DWT and 3EAK, respectively).
CDR1 and CDR2 design
A main concern in our design was to keep the correct immunoglobulin domain folding and guarantee its stability, for which we considered it important to keep the original cAbBCII10 internal packing as well as the CDR1/CDR2 conformations. Consequently, both CDR1 and CDR2 were designed with a fixed length, the same as in cAbBCII10. Analysis of the available crystal structures allowed us to assess the structural role played by different CDR residues and so guide the design. Buried CDR amino acids were kept as in the original nanobody, since they are most likely important in maintaining the high framework stability. Therefore, only selected residues with surface-exposed side chains were subjected to randomization. Sequence variability was included by means of degenerate codons, represented in the sequence using the IUPAC degenerate base symbols. Codon usage was tailored to limit the presence of hydrophobic amino acids at these positions. Cysteines were not allowed at any position. A few CDR aa found to be highly or relatively conserved in nanobody sequences, or thought to be important in holding CDR conformation in cAbBCII10, were also kept unchanged. It should be noted that our CDR definition follows that of Chothia and coworkers, which for CDR1 includes positions outside the classical Kabat CDR definition.
Figure 1A and B show the sequence positions selected for randomization in each of the CDRs, marked on the sequence of the original cAbBCII10 and on its crystal structure. The rationale followed to define whether or not a CDR position is randomized, as well as the implemented codon variability for each position, is presented in Table 1. It is worth noting that, as for other libraries described in the literature, our design is based on premises involving a degree of subjectivity.
For this library we chose a 14 aa-long CDR3, which creates a “convex” binding site topology by bending over the flank that in classical antibodies is buried at the VH-VL interface [19, 20]. This represents an important difference from a previously constructed library carrying a 10 aa-long CDR3, as analyzed in the section on CDR3 conformations.
Three degenerate codons alternate along most of CDR3, starting from its N-terminus (Table 1): VRN (coding for 8 polar or charged aa, and Gly), WMY (coding for Asn, Ser, Thr and Tyr in equal proportions) and the highly variable VNN (coding for 16 aa, excluding Cys, Phe, Trp and Tyr). The presence of five WMY codons places small aa (Asn, Ser, Thr) and Tyr every two positions, conferring conformational variability, while the high representation of Arg + Lys within the VRN and VNN codons increases their frequency in the alternating positions.
For the three C-terminal CDR3 residues (“n-2”, “n-1” and “n”) we chose particular degenerate codons to account for the aa frequency observed at these positions in nanobodies with long CDR3 loops, as well as for the structural role played by the particular amino acids found in the original cAbBCII10 Nb: Phe, Arg and Tyr (positions 115, 116 and 117, respectively).
For position “n” (117) we chose TMY, coding in equal proportions for Ser and Tyr, the two aa most commonly found at this position, with Tyr being the most frequent. In cAbBCII10, Tyr remains mostly packed, with its OH group exposed. The side chain at the “n-1” position (116) is exposed to the solvent, so we chose the VRN codon. This residue also holds a backbone turn that bends CDR3 down on its C-terminal side. Finally, at the “n-2” position (115) in cAbBCII10, Phe is buried against a hydrophobic patch at the Nb flank, most likely playing an important structural role in the folding of CDR3. For this position we limited the repertoire to only two aa, Phe and Tyr (codon TWY). In fact, Tyr is the most frequent aa at this position. The final nucleotide sequence for the whole nanobody is shown in Fig. 1C.
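The amino acid sets quoted above for the five degenerate codons can be checked by expanding each codon against the standard genetic code. The following is a minimal sketch, not the tooling used in this work; the function name `codon_frequencies` is our own illustrative choice, and the full per-position scheme of Table 1 is not reproduced here.

```python
from collections import Counter
from itertools import product

# IUPAC degenerate base symbols
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Standard genetic code, with bases ordered T, C, A, G at each codon position
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codon_frequencies(degenerate_codon):
    """Return {amino acid: fraction} encoded by an IUPAC degenerate codon."""
    pools = [IUPAC[base] for base in degenerate_codon.upper()]
    # Enumerate every concrete codon and translate it
    counts = Counter(CODON_TABLE["".join(c)] for c in product(*pools))
    total = sum(counts.values())
    return {aa: n / total for aa, n in sorted(counts.items())}


# The five degenerate codons discussed in the text for CDR3
for codon in ("VRN", "WMY", "VNN", "TMY", "TWY"):
    freqs = codon_frequencies(codon)
    summary = ", ".join(f"{aa}:{p:.3f}" for aa, p in freqs.items())
    print(f"{codon}: {len(freqs)} aa -> {summary}")
```

Under the standard genetic code, this expansion gives Arg plus Lys a combined share of one third of the VRN codons, and none of the five codons encodes Cys or a stop codon.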
Library construction and verification
Nanobody genes were cloned into our ad hoc designed pMAC phagemid vector, whose map is shown in Additional file 1. This vector includes a pelB leader whose coding sequence contains an NcoI restriction site at the 3′ end. Three other unique restriction sites (EcoRI, BamHI and NotI) were added in tandem, followed by a sequence coding for a short linker (SGGGG) and a 6xHis tag, an amber stop codon and, finally, the M13 PIII protein. The synthesis of nanobody genes, library construction and next-generation sequencing (NGS) verification were carried out by GenScript (NJ, USA). Nb genes were cloned into the pMAC vector using the NcoI and NotI restriction sites, followed by electroporation into SS320 E. coli cells. The estimated library size was 1.75 × 10^8.
Figure 2A shows the aa frequencies obtained by NGS for all the randomized positions, together with the theoretically designed variability. At every position, the aa diversity is that expected from the codon usage, and the experimental aa frequencies are close to the theoretical ones.
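A comparison of this kind, between observed and designed frequencies at one randomized position, can be sketched as follows. This is an illustrative outline only: the function names are hypothetical, the reads are made-up toy data rather than NGS data from this library, and the expected frequencies are those of a WMY codon (Asn, Ser, Thr and Tyr in equal proportions).

```python
from collections import Counter


def position_frequencies(sequences, position):
    """Observed aa frequencies at one 0-based position of aligned sequences."""
    counts = Counter(seq[position] for seq in sequences)
    total = sum(counts.values())
    return {aa: n / total for aa, n in counts.items()}


def compare_to_design(observed, expected):
    """Return (unexpected amino acids, largest absolute frequency deviation)."""
    unexpected = sorted(set(observed) - set(expected))
    deviation = max(
        abs(observed.get(aa, 0.0) - expected.get(aa, 0.0))
        for aa in set(observed) | set(expected)
    )
    return unexpected, deviation


# Hypothetical toy reads (NOT data from the study) for a single WMY position
toy_reads = ["N", "S", "T", "Y", "S", "T", "N", "Y", "T", "S"]
expected_wmy = {"N": 0.25, "S": 0.25, "T": 0.25, "Y": 0.25}

observed = position_frequencies(toy_reads, 0)
unexpected, deviation = compare_to_design(observed, expected_wmy)
print("unexpected amino acids:", unexpected)
print("largest deviation:", round(deviation, 3))
```

An empty list of unexpected amino acids corresponds to the diversity matching the codon design, and a small deviation corresponds to frequencies close to the theoretical ones.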
Structural analysis of CDR3 conformations
In previous work we constructed a synthetic Nb library, also based on the cAbBCII10 framework, and tested it against a panel of antigens (manuscript in preparation). That library has an important structural difference from the one reported here: a shorter, 10 aa-long CDR3. It has been shown that different CDR3 lengths create different binding site topologies. Here we assessed these structural differences for the two implemented CDR3 lengths (10 and 14 aa), using the nanobody structural data available in the Protein Data Bank (PDB) [21, 22], which has been growing rapidly in the last few years.
Currently, the PDB contains > 600 entries including a nanobody structure, either alone or in complex with an antigen. Figure 2 shows the superposition of the 33 and 37 nanobody crystal structures we found with 10 aa-long (CDR3-10) and 14 aa-long (CDR3-14) CDR3 loops, respectively. CDR3-10 loops display a variety of conformations, going from the “upright” geometry adopted by most CDR H3 loops in classical antibodies to slightly or entirely bent loops (Fig. 2B). CDR3-14 loops, on the other hand, adopt mostly bent conformations, with relatively few exceptions (Fig. 2C). Additionally, we used the NanoNet program to construct structural models of 10,000 sequences of the CDR3-14 library, randomly generated based on the aa frequencies yielded by the degenerate codons used at each randomized position. Figure 2D shows the superposition of 100 randomly selected models, all of them displaying a bent CDR3 conformation. Overall, these analyses indicate that the two libraries display different binding site topologies, which may account for different “preferences” regarding antigen shape.
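Random generation of library sequences from degenerate codons can be sketched as below. This is a minimal illustration under stated assumptions, not the procedure actually used: the 14-codon layout in `CDR3_CODONS` is HYPOTHETICAL (it only mimics the description of alternating VRN, WMY and VNN codons followed by TWY, VRN and TMY; the real scheme is given in Table 1), only the CDR3 loop is sampled, and no structure prediction step is shown.

```python
import random
from itertools import product

# IUPAC symbols needed for the codons used below
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "W": "AT", "M": "AC", "V": "ACG", "N": "ACGT",
}

# Standard genetic code, with bases ordered T, C, A, G at each codon position
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# HYPOTHETICAL layout for illustration only; see Table 1 for the real design
CDR3_CODONS = [
    "VRN", "WMY", "VNN", "WMY", "VRN", "WMY", "VNN",
    "WMY", "VRN", "WMY", "VNN", "TWY", "VRN", "TMY",
]


def sample_loop(codons, rng):
    """Draw one loop sequence; every concrete codon is equally likely."""
    residues = []
    for degenerate in codons:
        # Pick one base per codon position, then translate the concrete codon
        concrete = "".join(rng.choice(IUPAC[base]) for base in degenerate)
        residues.append(CODON_TABLE[concrete])
    return "".join(residues)


rng = random.Random(0)  # fixed seed so the sample is reproducible
loops = [sample_loop(CDR3_CODONS, rng) for _ in range(10_000)]
print(loops[:5])
```

Sampling bases uniformly at each degenerate position reproduces the aa frequencies implied by each codon, so the resulting sequences follow the designed distribution without computing the frequencies explicitly.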
The huge amino acid variability introduced in the library also creates a huge variability in interaction networks between CDR amino acids, as well as between CDR and framework amino acids. For the longer CDR3-14 loops, the alternation of different hydrophobic and hydrophilic amino acids may define their particular bent conformation and also exert different effects on protein stability, because of the interactions of this loop with the framework flank that in conventional antibodies interacts with the light chain VL domain.
By carefully selecting the positions to be randomized and tailoring the codon usage for each position, we sought to increase the number of functional combinations within the huge theoretical combinatorial space, keeping at particular positions amino acids that guarantee stability, favoring properties like polarity at solvent-exposed positions and avoiding destabilizing aa at certain positions. The design strategy presented here may be used on other stable frameworks with different CDR1-3 lengths.
We have yet to demonstrate that the constructed library can provide stable, high affinity nanobody binders for different antigens. This experimental work, though, has advanced for the CDR3-10 library, with very encouraging results (manuscript in preparation).
Availability of data and materials
The data generated or analyzed during this study are included in this article and its additional file.
CDR: Complementarity determining region
VHH: Camelid heavy-chain variable domain
1. Muyldermans S. A guide to: generation and design of nanobodies. FEBS J. 2021;288(7):2084.
2. Valdés-Tresanco MS, Molina-Zapata A, González Pose A, Moreno E. Structural insights into the design of synthetic nanobody libraries. Molecules. 2022 (manuscript accepted).
3. Moutel S, Bery N, Bernard V, Keller L, Lemesre E, de Marco A, et al. NaLi-H1: a universal synthetic library of humanized nanobodies providing highly functional antibodies and intrabodies. eLife. 2016;5(JULY):1–31.
4. McMahon C, Baier AS, Pascolutti R, Wegrecki M, Zheng S, Ong JX, et al. Yeast surface display platform for rapid discovery of conformationally selective nanobodies. Nat Struct Mol Biol. 2018. https://doi.org/10.1038/s41594-018-0028-6.
5. Zimmermann I, Egloff P, Hutter CAJ, Arnold FM, Stohler P, Bocquet N, et al. Synthetic single domain antibodies for the conformational trapping of membrane proteins. eLife. 2018. https://doi.org/10.7554/eLife.34317.
6. Sevy AM, Chen MT, Castor M, Sylvia T, Krishnamurthy H, Ishchenko A, Hsieh CM. Structure- and sequence-based design of synthetic single-domain antibody libraries. Protein Eng Des Sel. 2020;33:1–13. https://doi.org/10.1093/PROTEIN/GZAA028.
7. Zimmermann I, Egloff P, Hutter CAJ, Kuhn BT, Bräuer P, Newstead S, et al. Generation of synthetic nanobodies against delicate proteins. Nat Protoc. 2020;15(5):1707–41.
8. Zhao Y, Wang Y, Su W, Li S. Construction of synthetic nanobody library in mammalian cells by DsDNA-based strategies. ChemBioChem. 2021;22:2957–65. https://doi.org/10.1002/CBIC.202100286.
9. Chen X, Gentili M, Hacohen N, Regev A. A cell-free nanobody engineering platform rapidly generates SARS-CoV-2 neutralizing nanobodies. Nat Commun. 2021;12(1):1–14.
10. Sanavia T, Birolo G, Montanucci L, Turina P, Capriotti E, Fariselli P. Limitations and challenges in protein stability prediction upon genome variations: towards future applications in precision medicine. Comput Struct Biotechnol J. 2020;18:1968–79.
11. Conrath KE, Lauwereys M, Galleni M, Matagne A, Frère J-M, Kinne J, et al. β-Lactamase inhibitors derived from single-domain antibody fragments elicited in the Camelidae. Antimicrob Agents Chemother. 2001;45(10):2807.
12. Wei G, Meng W, Guo H, Pan W, Liu J, Peng T, et al. Potent neutralization of influenza A virus by a single-domain antibody blocking M2 ion channel protein. PLoS ONE. 2011;6(12). https://pubmed.ncbi.nlm.nih.gov/22164266/. Accessed 27 Nov 2021.
13. Yan J, Li G, Hu Y, Ou W, Wan Y. Construction of a synthetic phage-displayed Nanobody library with CDR3 regions randomized by trinucleotide cassettes for diagnostic applications. J Transl Med. 2014;12(1):1–12. https://doi.org/10.1186/s12967-014-0343-6.
14. Chi X, Liu X, Wang C, Zhang X, Li X, Hou J, et al. Humanized single domain antibodies neutralize SARS-CoV-2 by targeting the spike receptor binding domain. Nat Commun. 2020. https://doi.org/10.1038/s41467-020-18387-8.
15. Saerens D, Pellis M, Loris R, Pardon E, Dumoulin M, Matagne A, et al. Identification of a universal VHH framework to graft non-canonical antigen-binding loops of camel single-domain antibodies. J Mol Biol. 2005;352(3):597–607.
16. Vincke C, Loris R, Saerens D, Martinez-Rodriguez S, Muyldermans S, Conrath K. General strategy to humanize a camelid single-domain antibody and identification of a universal humanized nanobody scaffold. J Biol Chem. 2009;284(5):3273–84.
17. Cornish-Bowden A. Nomenclature for incompletely specified bases in nucleic acid sequences: recommendations 1984. Nucleic Acids Res. 1985;13(9):3021.
18. Al-Lazikani B, Lesk AM, Chothia C. Standard conformations for the canonical structures of immunoglobulins. J Mol Biol. 1997;273(4):927–48.
19. Mitchell LS, Colwell LJ. Comparative analysis of nanobody sequence and structure data. Proteins. 2018;86(7):697–706.
20. Mitchell LS, Colwell LJ. Analysis of nanobody paratopes reveals greater diversity than classical antibodies. Protein Eng Des Sel. 2018;31(7–8):267–75.
21. Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, et al. The protein data bank. Nucleic Acids Res. 2000. http://www.rcsb.org/pdb/status.html. Accessed 11 May 2020.
22. Burley SK, Berman HM, Bhikadiya C, Bi C, Chen L, di Costanzo L, et al. RCSB protein data bank: biological macromolecular structures enabling research and education in fundamental biology, biomedicine, biotechnology and energy. Nucleic Acids Res. 2019;47(D1):D464–74.
23. Dunbar J, Krawczyk K, Leem J, Baker T, Fuchs A, Georges G, et al. SAbDab: the structural antibody database. Nucleic Acids Res. 2014;42(D1):D1140–6.
24. Cohen T, Halfon M, Schneidman-Duhovny D, Rachel T, Benin S. NanoNet: rapid end-to-end nanobody modeling by deep learning at sub angstrom resolution. bioRxiv. 2021. https://doi.org/10.1101/2021.08.03.454917v1.
Library construction and work by AMZ was supported by Minciencias (Colombia) (Grant No. 849-2017).
The authors declare that they have no competing interests.
Cite this article
Moreno, E., Valdés-Tresanco, M.S., Molina-Zapata, A. et al. Structure-based design and construction of a synthetic phage display nanobody library. BMC Res Notes 15, 124 (2022). https://doi.org/10.1186/s13104-022-06001-7
- Synthetic nanobody library
- Phage display
- Structure-based design