In-silico molecular designs to treat neurologic and ophthalmologic diseases caused by sorbitol excess: engineering the Agrobacterium vitis protein

This work presents the design of a new protein based on the adenosine triphosphate-binding cassette (ABC) transporter solute binding protein (SBP) derived from Agrobacterium vitis, a gram-negative plant pathogen. The Protein Data Bank in Europe’s dictionary of chemical components was utilized to identify sorbitol and D-allitol. Allitol bound to an ABC transporter SBP was identified in the Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB). Wizard Pair Fitting and Sculpting tools in PyMOL were used to replace bound allitol with sorbitol. PackMover Python code was used to induce mutations in the ABC transporter SBP’s binding pocket, and changes in free energy for each protein-sorbitol complex were identified. The results indicate that adding charged side chains forms polar bonds with sorbitol in the binding pocket, thus increasing its stabilization. In theory, the novel protein can be used as a molecular sponge to remove sorbitol from tissue and therefore treat conditions affected by sorbitol dehydrogenase deficiency.
Supplementary Information: The online version contains supplementary material available at 10.1186/s13104-023-06367-2.


Introduction
The polyol pathway is a biological mechanism that tissues use to convert glucose to fructose by way of an intermediary compound, sorbitol [1]. Sorbitol is the alcohol counterpart of glucose [2]. Its structure effectively traps glucose within a cell, and sorbitol is then converted to fructose via sorbitol dehydrogenase (SORD), as seen in Fig. 1 [3,4]. Tissues with insufficient sorbitol dehydrogenase can suffer from intracellular sorbitol accumulation and, subsequently, osmotic damage. Excess sorbitol may deposit in the eyes, leading to cataracts or retinopathy, or damage nerves, as in peripheral neuropathy, similar to the effect of chronic hyperglycemia in diabetes [5][6][7].
Sorbitol dehydrogenase deficiencies currently affect one in 100,000 individuals and primarily manifest as the most common form of autosomal recessive peripheral neuropathy [8]. In patients with diabetes, excess glucose in the blood significantly increases the production of sorbitol and can lead to cell lesions. In the lens of the eye, diabetic patients can develop cataracts up to 20 years earlier [9] because of osmotic imbalance from excess sorbitol and oxidative damage [10]. These conditions can have drastic effects on quality of life and will require lifestyle changes and long-term management if not addressed in a timely manner.
Nature's proteins provide an array of engineering possibilities. Through in-silico modeling, proteins can be enhanced (e.g. via mutations) and evolved to perform novel functions [11][12][13]. In-silico molecular design continues to advance, offering improved optimization strategies through machine learning algorithms, peptide modeling, and binding energy analysis [14].
To address the ophthalmic and neurological pathologies associated with sorbitol dehydrogenase deficiency, we propose a protein capable of binding sorbitol with high affinity and specificity. This protein, the adenosine triphosphate-binding cassette (ABC) transporter solute binding protein (SBP), is derived from the microbe Agrobacterium vitis (also known as Allorhizobium vitis), a Gram-negative, motile plant pathogen that infects grapevines [11]. The Agrobacterium family has recently been shown to cause certain pathologies such as endophthalmitis and pneumonia, delusions from Morgellons disease, and cancer [15]. However, there have been limited findings on A. vitis for ophthalmological or neurological pathologies.
The ABC transporter SBP itself has a 'Venus flytrap'-like mechanism in which large conformational changes in its binding domains allow for the encapsulation of ligands [16]. The structure of the ABC transporter SBP is specific for organic compounds such as allitol or the amino sugars glucosamine and galactosamine [16,17].
The small molecule allitol (i.e. D-allitol) has a molecular formula of C6H14O6. It is a natural ligand for the ABC transporter SBP. Sorbitol, a structurally similar small molecule, has the same molecular formula as allitol. Both molecules have a six-carbon backbone with each carbon attached to a hydroxyl (-OH) group. The difference between allitol and sorbitol is simply the stereochemistry of a single hydroxyl group at the third carbon position, as seen in Fig. 2. The similarity in these two structures suggests that, with minor modifications to its binding pocket, the ABC transporter SBP can be used as a molecular sponge that preferentially binds sorbitol over allitol. This molecular sponge could decrease excess intracellular sorbitol and limit disease progression in affected individuals. Such a protein could serve as a starting framework for potential candidate medications for the treatment of cataracts, retinopathy, or peripheral neuropathy in individuals suffering from sorbitol dehydrogenase deficiency.

Generating files for use in PyMOL
Using the Protein Data Bank in Europe's Chemical Components in the PDB, we identified that sorbitol (code name SOR [18]) is a related compound of D-allitol (code name X9X [19]). We then searched the RCSB PDB (Research Collaboratory for Structural Bioinformatics Protein Data Bank 2021; 'X9X' 2021; 'SOR' 2021) for a protein that contained allitol as a ligand in its structure and found protein code name 4WT7 ('Crystal structure of an ABC transporter solute binding protein (IPR025997) from Agrobacterium Vitis (Avi_5165, Target EFI-511223) with bound allitol') [17]. We were able to download a '.pdb' file directly for 4WT7. Because this protein was a dimer of two identical monomers, we deleted one monomer and saved the protein as a new '.pdb' file. As X9X was contained within the structure of 4WT7, we isolated the ligand in the file and saved it as a separate '.pdb' file. Finally, we acquired a '.sdf' file for SOR and ran it through an online web service integrated with OpenBabel to convert it to the '.mdl' file format [20]. This file was converted to a '.pdb' file using the 'molfile-2params' code.
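The monomer and ligand separation described above can also be pictured as plain-text filtering of PDB records. The following is a minimal sketch rather than the procedure actually used: the chain identifier "A", the function names, and the toy coordinates are assumptions made for illustration and are not taken from 4WT7.

```python
def format_atom_line(record, serial, name, resname, chain, resseq, x, y, z):
    """Build a fixed-column PDB ATOM/HETATM line (hypothetical demo helper)."""
    return (f"{record:<6}{serial:>5} {name:<4} {resname:>3} {chain}"
            f"{resseq:>4}    {x:8.3f}{y:8.3f}{z:8.3f}")


def split_monomer_and_ligand(pdb_lines, keep_chain="A", ligand_resname="X9X"):
    """Return (protein_lines, ligand_lines) for a single chain.

    ATOM records of the kept chain go to the protein; HETATM records of the
    kept chain whose residue name matches the ligand go to the ligand.
    Everything else (other chains, waters, header records) is dropped.
    """
    protein, ligand = [], []
    for line in pdb_lines:
        record = line[0:6].strip()
        if record not in ("ATOM", "HETATM"):
            continue
        if line[21:22] != keep_chain:        # chain ID sits in column 22
            continue
        resname = line[17:20].strip()        # residue name, columns 18-20
        if record == "ATOM":
            protein.append(line)
        elif resname == ligand_resname:
            ligand.append(line)
    return protein, ligand


# Toy dimer: one protein atom and one allitol atom per chain, plus a water.
demo = [
    format_atom_line("ATOM", 1, "CA", "PHE", "A", 10, 0.0, 0.0, 0.0),
    format_atom_line("ATOM", 2, "CA", "PHE", "B", 10, 9.0, 0.0, 0.0),
    format_atom_line("HETATM", 3, "C1", "X9X", "A", 401, 1.0, 1.0, 1.0),
    format_atom_line("HETATM", 4, "C1", "X9X", "B", 401, 8.0, 1.0, 1.0),
    format_atom_line("HETATM", 5, "O", "HOH", "A", 501, 5.0, 5.0, 5.0),
]
protein_lines, ligand_lines = split_monomer_and_ligand(demo)
```

Each returned list could then be written to its own '.pdb' file. Whether waters and other hetero groups should be kept is a design decision this sketch does not address.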
The above-mentioned '.pdb' files were used in conjunction with PyMOL to calculate energy scores, run experiments, and generate images. A summary of these '.pdb' files is contained in Table 1.

Docking ligand to protein in PyMOL
For docking, we first opened protein.pdb in PyMOL and then imported the sorbitol.pdb file into the open PyMOL window [21]. We utilized a combination of the Wizard Pair Fitting and Wizard Sculpting tools in PyMOL to align the carbon backbone of the sorbitol molecule from the sorbitol.pdb file to the carbon backbone of the allitol molecule from the .pdb file (Fig. 3). The Wizard Pair Fitting tool allowed for movement and orientation of the sorbitol molecule, while the Wizard Sculpting tool allowed for rotation of bonds to align the hydroxyl groups of sorbitol to those of allitol. This was an iterative process: using the Wizard Sculpting tool would sometimes cause unwanted movement of the sorbitol molecule, so the Wizard Pair Fitting tool was used again to re-align the carbon backbone, and once the backbone was realigned, further use of the Wizard Sculpting tool was sometimes needed. Once the alignment of both the carbon backbone and hydroxyl groups (all but the hydroxyl group of the third carbon, as that differs in stereochemistry) was accomplished, the atoms of the allitol molecule were deleted from the complex. The resulting protein-sorbitol complex was saved as a new molecule and '.pdb' file.
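One simple way to judge whether the iterative fitting described above has converged is the root-mean-square deviation (RMSD) between the six paired backbone carbons of sorbitol and allitol. The sketch below is an illustrative check only, not part of the PyMOL workflow; the function name and the coordinates are made up.

```python
import math


def paired_rmsd(coords_a, coords_b):
    """RMSD between two equally long lists of (x, y, z) tuples, paired by index."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared / len(coords_a))


# Hypothetical C1..C6 positions for allitol, and a sorbitol copy that is
# still offset by 1 Angstrom along x (i.e. not yet well fitted).
allitol_carbons = [(float(i), 0.0, 0.0) for i in range(6)]
sorbitol_carbons = [(x + 1.0, y, z) for (x, y, z) in allitol_carbons]

offset_rmsd = paired_rmsd(allitol_carbons, sorbitol_carbons)
```

A value near zero would indicate that the carbon backbones overlap closely. The hydroxyl oxygens could be paired in the same way, leaving out the third-carbon hydroxyl that differs in stereochemistry.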

Minimizing protein-ligand complex
For minimization, we used the MinMover code to find a local energy minimum of the new protein and the FastRelax code to find a structure with lower energy than the new protein. The results of each of these minimization methods are discussed in the Results section.

Mutating protein-ligand complex
Given that sorbitol fits in the binding pocket of the protein where allitol once was, we hypothesize that the amino acid side chains stabilizing bound allitol would similarly stabilize much of sorbitol; however, the difference in stereochemistry of the third carbon hydroxyl group between allitol and sorbitol suggests that modifying the protein residues in the binding pocket could make it preferentially bind one ligand over the other. To determine which residues to change, we performed an iterative experiment. We first used PyMOL to identify all residues of the protein within 4 Angstroms of the bound ligand sorbitol. We then used PackMover Python code to change these residues to ones that enhance residue-residue or residue-ligand interactions and so further stabilize sorbitol in the binding pocket. To maintain control over design, we generated a protein in which only specific amino acids in the binding pocket were mutated. We did this by running code to change the 'resfiles' values of these amino acids from NATRO (no mutation, no repacking) to ALLAA (all amino acids, all repacking) to subsequently change the amino acids. This method generated a protein with all residues in the binding pocket mutated and a protein with selectively mutated residues in the binding pocket. We further analyzed the selectively mutated protein.
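The selection and resfile steps above can be sketched in a few lines. This is a hedged illustration, not the code used in the study: the function names, residue numbers, chain identifier, and coordinates are hypothetical, and the resfile layout (a default command, a `start` line, then one `<residue> <chain> <command>` line per position) follows the usual Rosetta resfile convention as we understand it.

```python
import math


def residues_near_ligand(protein_atoms, ligand_atoms, cutoff=4.0):
    """Return sorted (chain, resseq, resname) tuples with any atom within cutoff.

    protein_atoms: list of (chain, resseq, resname, (x, y, z))
    ligand_atoms:  list of (x, y, z)
    """
    near = set()
    for chain, resseq, resname, xyz in protein_atoms:
        if any(math.dist(xyz, lig) <= cutoff for lig in ligand_atoms):
            near.add((chain, resseq, resname))
    return sorted(near)


def build_resfile(design_positions, default="NATRO", design="ALLAA"):
    """Build resfile text: everything stays at `default` except listed positions."""
    lines = [default, "start"]
    for chain, resseq, _resname in design_positions:
        lines.append(f"{resseq} {chain} {design}")
    return "\n".join(lines) + "\n"


# Made-up atoms: two residues close to the ligand, one far away.
protein_atoms = [
    ("A", 10, "PHE", (0.0, 0.0, 0.0)),
    ("A", 10, "PHE", (1.0, 0.0, 0.0)),
    ("A", 12, "ASP", (3.0, 3.0, 0.0)),
    ("A", 50, "VAL", (20.0, 0.0, 0.0)),
]
ligand_atoms = [(2.0, 0.0, 0.0)]

pocket = residues_near_ligand(protein_atoms, ligand_atoms)
resfile_text = build_resfile(pocket)
```

Listing only a hand-picked subset of `pocket` in the resfile corresponds to the selective design, while listing every pocket residue corresponds to the fully mutated variant.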

Selective mutation experiment
While in the binding pocket of the ABC transporter SBP, sorbitol has direct polar interactions with five residues in the binding pocket. A total of seven polar bonds are formed between the ligand and these five amino acid side chains. These interactions are detailed in Table 2. The amino acids in the binding pocket of the ABC transporter SBP also include five side chains that appear to have the potential to participate in polar bonds, which were the target of the selective mutation experiment. The polar bonds that were created among these residues following the experiment are listed in Table 3.
For the residues with polar bonds before mutation, the number of polar bonds does not change once the mutation occurs. Although the number of these polar bonds decreases after FastRelax, other polar bonds form among the residues identified as having the potential for polar bonds, so the overall number of polar bonds increases (Fig. 4).

Free energy changes associated with each complex
Different blocks of Python code were run at each step of the experiment to calculate energy scores for the new protein and subsequently generated structures. The changes in free energy are detailed in Table 4.

Thermodynamic energy cycle
Given the thermodynamic equation below, which estimates binding free energy, we derive the following equation, which results in a unitless number that describes the thermodynamic cycle of energy for our protein. Plugging in numbers for our new protein and protein-sorbitol complex versus our selectively mutated protein and protein-sorbitol complex with five residues mutated, we find that ∆∆G_bind < 0. The mutated version of the complex is therefore thermodynamically favorable, and we can conclude that the selective mutations allowed for superior protein-ligand binding compared to the starting protein. It is important to note, however, that stabilizing the protein without the ligand in place, such as with MinMover prior to mutation, would not optimize the protein for when it is bound to the ligand. Thus, it is important that in the above calculation we used the energy score of the new protein prior to any minimization code (MinMover, FastRelax, etc.) being run (Fig. 5).
Table 2 Residues with polar bonds before mutation. The five residues listed create a total of seven polar bonds with the ligand sorbitol when sorbitol is in the binding pocket of the protein prior to mutating any residues in the binding pocket of the protein
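A thermodynamic cycle of this kind is commonly written as shown below. This is offered as an assumption about the general form, not as the exact expressions used in this work; the superscripts "orig" and "mut" and the symbol G for an energy score are notation introduced here for illustration.

```latex
\begin{align}
\Delta G_{\mathrm{bind}} &= G_{\mathrm{complex}} - \left( G_{\mathrm{protein}} + G_{\mathrm{ligand}} \right) \\
\Delta\Delta G_{\mathrm{bind}} &= \Delta G_{\mathrm{bind}}^{\mathrm{mut}} - \Delta G_{\mathrm{bind}}^{\mathrm{orig}} \\
 &= \left( G_{\mathrm{complex}}^{\mathrm{mut}} - G_{\mathrm{protein}}^{\mathrm{mut}} \right)
  - \left( G_{\mathrm{complex}}^{\mathrm{orig}} - G_{\mathrm{protein}}^{\mathrm{orig}} \right)
\end{align}
```

Because the ligand is sorbitol in both the original and the mutated complex, the ligand term cancels in the difference, so under this form only the four protein and complex scores are needed. A negative value would indicate that the mutated protein binds sorbitol more favorably than the starting protein.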

Conclusion
By applying PyRosetta's suite of energy minimization algorithms to our protein of interest, we identified a protein with improved stability for binding sorbitol as compared to its original form. These algorithms stochastically induce amino acid residue mutations and conformational changes to identify proteins that are more energetically favorable. One reason our engineered protein is energetically favorable is the increased number of polar bonds. While other combinatorial changes may contribute to the improved binding energy, these factors cannot be directly identified as they are a result of the algorithms in the PyRosetta suite.
Aldohexoses such as D-glucose are often found in a ringed form instead of a linear form. Given this information, there is only a small number of aldohexoses that are naturally occurring in their linear form that could compete with sorbitol for the binding spot of an engineered protein tuned for sorbitol selectivity [22][23][24]. Furthermore, as a therapy targeting diseases with sorbitol excess, we can ensure that the predominant substrate for binding the engineered protein will be sorbitol, rather than an alternative aldohexose that would possibly be found in the same region in relatively smaller amounts [22]. We can therefore ensure high selectivity for sorbitol.
Based on the structure of sorbitol, we hypothesized that by changing non-charged side chains to charged side chains within the binding pocket, the resulting conformational changes could allow the binding pocket to form new polar bonds with the hydroxyl groups in sorbitol and thus achieve greater stabilization [25]. As hydroxyl groups contain an oxygen with a partial negative charge and a hydrogen with a partial positive charge, we can predict when replacing a particular binding pocket amino acid side chain with a different one may be more energetically favorable. Interestingly, no mutative changes in the binding pocket itself were found to be more energetically favorable (Table 2).
Of the five selective changes within 4 Angstroms of the binding pocket (Table 3), the one residue that changed from hydrophobic to positively charged (PHE to LYS) may align with the hypothesis that polar bonds are more energetically favorable. The two residues that changed from negative to neutral (ASP to PRO, ASP to TYR) and the remaining mutation (CYS to ILE) may have occurred to further stabilize the structure and accommodate other changes, though indirectly. The last case (VAL to VAL) signified no mutation, which suggests that this residue, regardless of location, does not participate in the stabilization. Overall, the selective mutations were consistent with our predictions that adding more polar bonds to the binding pocket improves binding with sorbitol and improves stability of the protein. After PackMover mutations and FastRelax minimization, there were 10 polar bonds present amongst the 10 residues versus 7 polar bonds at the starting point. The lack of new polar bonds in the mutated protein and the gain of polar bonds in the FastRelax protein suggest that, of the two strategies to decrease free energy through structural minimization of the protein-ligand complex, mutating the protein is less effective than changing rotational conformation. The free energy changes in Table 4 corroborated the assumption that each strategy yielded protein-sorbitol complexes that were more energetically favorable than the original.
The mutations and conformations discussed above may be beneficial as they stabilize the binding of the ABC transporter SBP with sorbitol. This selectively mutated protein derived from Agrobacterium vitis is a potential tool for sorbitol sequestration that could be used as a molecular sponge to preferentially bind and remove sorbitol from tissue, and thus potentially be utilized as a treatment for sorbitol dehydrogenase deficiency. Future development of this biologic and subsequent in vitro and in vivo experimentation can be utilized to study its efficacy.

Limitations
The theoretical mutations in this work were developed using a Python modeling interface based on an underlying stochastic algorithm. It can be postulated that the various components of the code could be rerun and return further novel potential mutations with superior or inferior binding outcomes. Furthermore, while we chose to selectively administer mutations to the binding pocket, an automatic mode for determining which residues should be changed could provide different outcomes. Future studies that incorporate in vitro and subsequently in vivo experimentation are necessary to confirm the effect. General limitations regarding biologics, including side effects, should be considered and further explored in future studies.