A split and rearranged nuclear gene encoding the iron-sulfur subunit of mitochondrial succinate dehydrogenase in Euglenozoa
© Gray et al; licensee BioMed Central Ltd. 2009
Received: 18 December 2008
Accepted: 03 February 2009
Published: 03 February 2009
Analyses based on phylogenetic and ultrastructural data have suggested that euglenids (such as Euglena gracilis), trypanosomatids and diplonemids are members of a monophyletic lineage termed Euglenozoa. However, many uncertainties are associated with phylogenetic reconstructions for ancient and rapidly evolving groups; thus, rare genomic characters become increasingly important in reinforcing inferred phylogenetic relationships.
We discovered that the iron-sulfur subunit (SdhB) of mitochondrial succinate dehydrogenase is encoded by a split and rearranged nuclear gene in Euglena gracilis and trypanosomatids, an example of a rare genomic character. The two subgenic modules are transcribed independently and the resulting mRNAs appear to be independently translated, with the two protein products imported into mitochondria, based on the presence of predicted mitochondrial targeting peptides. Although the inferred protein sequences are in general very divergent from those of other organisms, all of the required iron-sulfur cluster-coordinating residues are present. Moreover, the discontinuity in the euglenozoan SdhB sequence occurs between the two domains of a typical, covalently continuous SdhB, consistent with the inference that the euglenozoan 'half' proteins are functional.
The discovery of this unique molecular marker provides evidence for the monophyly of Euglenozoa that is independent of evolutionary models. Our results pose questions about the origin and timing of this novel gene arrangement and the structure and function of euglenozoan SdhB.
Succinate dehydrogenase (SDH, Complex II) is a membrane-anchored protein complex of the mitochondrial and bacterial electron transport chain that catalyzes the oxidation of succinate to fumarate, reducing FAD to FADH2 in the process (although it is capable of the reverse reaction under favorable conditions). High-resolution crystal structures of Complex II from bacterial (E. coli; [1]), avian (chicken; [2]) and mammalian (pig; [3]) sources demonstrate that it is a heterotetramer consisting of the succinate-oxidizing, matrix-associated, flavoprotein subunit (SdhA), an electron transfer iron-sulfur subunit (SdhB) and two hydrophobic membrane anchors (SdhC and SdhD) that provide the binding site for ubiquinone and are required for integration of the complex into the inner mitochondrial membrane. SdhA-D are nucleus-encoded in a wide variety of eukaryotes, including mammals, whereas SdhB-D are specified by the gene-rich mitochondrial genomes of certain protists such as red algae and jakobid flagellates [4]. SdhA invariably appears to be nucleus-encoded.
Euglena gracilis is a free-living, flagellated eukaryotic microbe that contains a plastid likely acquired through the engulfment of a green alga . A monophyletic 'Euglenozoa' clade comprising Euglena (and related euglenids) along with two aplastidic lineages, the kinetoplastids (encompassing trypanosomatids and bodonids) and the predominantly free-living diplonemids, has been postulated principally on the basis of shared ultrastructural features, including disc-shaped mitochondrial cristae and flagellar paraxonemal rods . Phylogenetic reconstructions based on small subunit ribosomal RNA (SSU rRNA; ) and protein  sequences established that these physiologically and ecologically disparate taxa likely comprise a (potentially early-branching) monophyletic group. However, the well-documented effects of rapid rates of sequence change, along with the acquisition of a secondary endosymbiont  (evidenced by the presence in Euglena of a plastid with three surrounding membranes) and possible ephemeral, cryptic endosymbioses , have complicated reconstructions of Euglena's evolutionary history. In particular, the transfer of endosymbiont-derived genes to the nucleus has, in effect, yielded a mosaic nuclear genome displaying characteristics of all constituent sources . Moreover, the internal branching patterns within Euglenozoa are still not completely resolved , although phylogenies based on conserved protein genes seem to be consistent in placing euglenids at the base of Euglenozoa, with diplonemids and kinetoplastids forming a later diverging sister group .
Here we report that in E. gracilis, the nucleus-encoded sdhB gene is split into two independently transcribed (and presumably independently translated) subgenic modules whose products correspond to the N-terminal and C-terminal halves (referred to here as SdhB-n and SdhB-c, respectively) of a typical SdhB protein. Moreover, in various trypanosome species, we have identified separate genes encoding predicted proteins corresponding to SdhB-n and SdhB-c. The splitting of sdhB in Euglena and trypanosomatids is an example of a unique molecular character that specifically unites these two phylogenetic groups and raises interesting questions about the evolution and function of euglenozoan SdhB.
Results and discussion
Relatively few genomic data are available for Euglena. Neither nuclear nor mitochondrial genome sequencing projects are currently being undertaken, and only three mitochondrion-encoded protein-coding genes (cox1, cox2 and nad6) have been identified thus far [12, 13; GenBank:AF156178]. Nevertheless, the construction and sequencing of EST libraries generated from mature mRNAs are being exploited to better understand the biochemistry and evolution of this organism. The conserved 24-nucleotide 5' spliced leader (SL) sequence characteristic of Euglena nucleus-encoded mRNAs [14] confers a specific advantage in that its presence in an EST confirms that the translated sequence encompasses the complete N-terminus of the corresponding protein. This information is important in predicting the subcellular localization of a given protein product, as the signals required for targeting proteins to various subcellular compartments, including mitochondria, are frequently located at protein N-termini.
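The reasoning about the spliced leader can be made concrete with a minimal sketch. It assumes that an EST counts as 5'-complete when it begins with at least part of the 3' end of the SL, and that the protein starts at the first ATG downstream. The leader string `SL_PLACEHOLDER`, the function name `n_terminus_from_est` and the `min_sl` cutoff are all hypothetical; the placeholder is not the real Euglena SL sequence, and translation uses the standard genetic code.

```python
# Sketch: use a 5' spliced-leader (SL) match to decide whether an EST
# covers the complete N-terminus of the encoded protein.
# SL_PLACEHOLDER is a made-up 24-nt stand-in, NOT the real Euglena SL.
SL_PLACEHOLDER = "ACGTTTCTGAGTGTCTATTTTTTG"

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def n_terminus_from_est(est, sl=SL_PLACEHOLDER, min_sl=10):
    """Translate from the first ATG downstream of the SL.

    Returns None when the EST does not start with at least `min_sl`
    nucleotides of the SL 3' end (assumption: no vector or adaptor
    sequence precedes the leader), or when no ATG follows the leader.
    """
    est = est.upper()
    # ESTs are often truncated at the 5' end, so accept a partial leader:
    # try the longest SL suffix first.
    for keep in range(len(sl), min_sl - 1, -1):
        if est.startswith(sl[-keep:]):
            start = est.find("ATG", keep)
            if start == -1:
                return None
            protein = []
            for i in range(start, len(est) - 2, 3):
                aa = CODON_TABLE.get(est[i:i + 3], "X")  # X = ambiguous codon
                if aa == "*":
                    break
                protein.append(aa)
            return "".join(protein)
    return None


if __name__ == "__main__":
    toy_est = SL_PLACEHOLDER + "GGCAAC" + "ATGCTTCGTTAA"
    print(n_terminus_from_est(toy_est))
    print(n_terminus_from_est("GGCAACATGCTTCGTTAA"))  # no leader present
```

A sequence returned by such a check could then be passed to a targeting predictor with more confidence that the true N-terminus is present.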
The discovery of split genes encoding proteins that function within mitochondria is not without precedent. For instance, cytochrome oxidase subunit 2 (Cox2) in the green algae Chlamydomonas reinhardtii and Polytomella sp.  and in several apicomplexan parasites  and dinoflagellates  is a nucleus-encoded heterodimer specified by two separate subgenic modules. In Chlamydomonas, the N-terminal portion of Cox2 has been shown to contain a cleavable N-terminal mTP, whereas the C-terminal unit does not . This situation parallels that reported here for trypanosome SdhB-c, which does not appear to contain a canonical cleavable mTP. In the case of chlamydomonad algae, it has been proposed that a 20-amino acid C-terminal extension in Cox2a (the N-terminal unit) and a 42-amino acid N-terminal extension in Cox2b might facilitate the functional interaction of these two subunits . In the absence of biochemical evidence confirming the length of the mitochondrial targeting peptide, it is not possible to determine unequivocally whether or not trypanosome SdhB-c has an N-terminal extension. On the other hand, SdhB-n from Euglena does possess a C-terminal extension of ~35 amino acids, whereas the corresponding trypanosome SdhB-n C-terminal extension is ~105 residues long. Sequence alignments do not indicate any significant similarity between the Euglena and trypanosome extensions. As proposed for Cox2 in chlamydomonads, these extensions might allow the dimerization of SdhB-n and SdhB-c in euglenozoans, although bioinformatic analysis does not suggest the presence of obvious protein-protein interaction domains.
From a structural perspective, the split in the SdhB sequence in Euglena and trypanosomatids occurs in a region that might be particularly tolerant of such disruption (Figure 3). SdhB contains three iron-sulfur (Fe-S) centers, arranged in a linear chain, that function to transport electrons from SdhA to the membrane-integrated subunits . SdhB from E. coli is organized into two domains: an N-terminal domain containing a [2Fe-2S] cluster that forms a fold similar to plant-type ferredoxins and a C-terminal domain that houses the [3Fe-4S] and [4Fe-4S] clusters with a fold similar to bacterial ferredoxins . SdhB-n from Euglena contains a predicted Fer2 domain whereas SdhB-c is predicted to contain two Fer4 domains, indicating that the break between Euglena SdhB-n and SdhB-c occurs in a region corresponding to the junction between the two E. coli domains. Moreover, protein alignments demonstrate that all of the Cys residues required for co-ordination of the three Fe-S clusters in E. coli SdhB are accounted for when both SdhB-n and SdhB-c from Euglenozoa are considered. These observations lend further support to the notion that these separate protein halves are functional, as rearrangement occurring within protein domains and/or loss of Fe-S cluster ligands would likely not be tolerated.
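The residue accounting described above can be illustrated with a small sketch. It assumes a toy alignment and hypothetical ligand column indices; none of the sequences or positions below come from the actual E. coli or euglenozoan proteins, and the function name `ligand_coverage` is invented for illustration.

```python
# Toy check that every Fe-S ligand position of a continuous reference SdhB
# is matched by a Cys in one of the two half-proteins.
# Sequences and column indices are invented, not real SdhB data.

def ligand_coverage(ligand_columns, halves, ligand="C"):
    """Map each ligand column to the name of the half-protein that carries
    `ligand` there, or to None if neither half does."""
    coverage = {}
    for col in ligand_columns:
        coverage[col] = next(
            (name for name, seq in halves.items() if seq[col] == ligand),
            None,
        )
    return coverage


reference = "MCAACGGCHHCAAC"          # hypothetical continuous protein
halves = {
    "SdhB-n": "MCSACGGC------",      # aligned N-terminal half
    "SdhB-c": "--------HKCTSC",      # aligned C-terminal half
}
# Columns where the toy reference uses Cys as a cluster ligand.
ligand_columns = [i for i in (1, 4, 7, 10, 13) if reference[i] == "C"]

coverage = ligand_coverage(ligand_columns, halves)
missing = [col for col, owner in coverage.items() if owner is None]
print(coverage)
print("all ligands accounted for:", not missing)
```

Passing explicit ligand columns, rather than scanning for every Cys in the reference, reflects the fact that not every conserved Cys is a cluster ligand.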
Notably, the amino acid sequences of SdhB-n and SdhB-c from Euglenozoa are exceptionally divergent in comparison with SdhB characterized to date in any other organism. In fact, many of the otherwise universally (or nearly universally) conserved residues have been substituted with different ones in Euglenozoa. For instance, a universally conserved Arg (R56 in E. coli) is Cys in SdhB-n of both Euglena and trypanosomatids (Figure 2A, a). Conversely, the conserved Cys corresponding to C154 in E. coli is Ser in Euglena and trypanosome SdhB-c (Figure 2B, b), as well as in SdhB from the unrelated malaria parasite, Plasmodium falciparum. The nearby Ser-Thr-Ser motif present in all other SdhB sequences examined here (corresponding to E. coli residues 156–158; Figure 2B, c) is Thr-Ala-Ala in Euglena. Although E. coli C154 is not directly responsible for coordinating Fe-S clusters in SdhB, the crystal structure suggests that it contributes a hydrogen bond to the thiol group, important in stabilizing the [4Fe-4S] cluster ligand C152 . It is thought that this H-bond maintains a higher midpoint potential in the [4Fe-4S] cluster. Interestingly, Cheng et al. [23] found a direct relationship between the midpoint potential of the [4Fe-4S] cluster and the turnover rates of succinate dehydrogenase, whereas Hudson et al. [24] found the inverse for the Fe-S subunit of E. coli fumarate reductase (Frd; a homologous enzyme that catalyzes the reduction of fumarate to succinate). Thus, the presence of C154 may favor the in vivo oxidation of succinate to fumarate, as opposed to the reverse reaction . E. coli FrdB, which has a lower [4Fe-4S] cluster midpoint potential than does E. coli SdhB, has a Leu residue instead of the E. coli C154 equivalent (Figure 2B, b) and a Tyr-Ala-Ala motif (Thr-Ala-Ala in Euglena) instead of Ser-Thr-Ser (Figure 2B, c). Thus, there exist some interesting parallels between the euglenozoan SdhB and E. coli FrdB sequences, although phylogenetic analyses (see additional file 3: SdhB phylogenetic tree) clearly demonstrate that SdhB-n and SdhB-c are SdhB (and not FrdB) homologs. Moreover, it is quite possible that the Ser in euglenozoan SdhB-c contributes a stabilizing hydrogen bond to the [4Fe-4S] cluster (equivalent to the function of C154 of E. coli) whereas Leu in FrdB could not. Taken together, the euglenozoan SdhB structure and sequence are intriguing, and emphasize the need for biochemical investigations to fully understand the function and structure of these split proteins.
Expressed sequence tags (ESTs) from E. gracilis strain Z were prepared as described in [25]. ESTs encoding Euglena SdhB were identified by a tBLASTn [26] search of the taxonomically broad EST database (TBestDB; [27]) and GenBank, using SdhB from Reclinomonas americana (gi:11466549) as query. Consensus EST sequences specifying SdhB-n and SdhB-c were translated and the inferred protein sequences were subsequently used to query the non-redundant protein sequence database at NCBI (using BLASTp) along with the non-human, non-mouse EST database (est_others) and TBestDB (using tBLASTn). Database accession numbers are given in additional file 4. The programs TargetP [16] and MitoProt II [17] were used to assess the probability of mitochondrial localization for Euglena and trypanosome SdhB-n and SdhB-c. When using TargetP for Euglena proteins, we selected the 'Plant' organism group in order to include the possibility of plastid targeting, whereas we selected the 'Animal' organism group for trypanosomatids, as the latter do not contain plastids. MitoProt II contains no option for assessing plastid localization. The consensus E. gracilis mTP profile was generated using LogoBar-0.9.12 [28] from a de-gapped alignment of the 30 N-terminal-most residues from 107 predicted E. gracilis mitochondrion-targeted proteins.
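The input to such a profile can be sketched as follows. This is not LogoBar itself; it only computes the per-position residue frequencies that a logo-style display would summarize. It assumes that "de-gapped" means gap characters are stripped from each sequence before the first 30 residues are taken, which is one reading of the description. The function name `nterm_profile` and the toy sequences are hypothetical.

```python
from collections import Counter


def nterm_profile(aligned_seqs, length=30, gap_chars="-."):
    """Per-position residue frequencies over the first `length` residues
    of each sequence, taken after its gap characters have been removed."""
    degapped = [
        "".join(c for c in seq.upper() if c not in gap_chars)
        for seq in aligned_seqs
    ]
    nterm = [seq[:length] for seq in degapped]
    profile = []
    for pos in range(length):
        # Sequences shorter than `pos + 1` simply do not contribute here.
        column = [seq[pos] for seq in nterm if len(seq) > pos]
        total = len(column)
        counts = Counter(column)
        profile.append({aa: n / total for aa, n in counts.items()} if total else {})
    return profile


if __name__ == "__main__":
    toy = ["M-RL", "MK-L", "MRRA"]  # invented sequences, not Euglena data
    for pos, freqs in enumerate(nterm_profile(toy, length=3), start=1):
        print(pos, freqs)
```

In the study the same kind of calculation would run over 107 sequences with `length=30`; the three-sequence input here only keeps the example readable.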
Conserved domains were identified by searching the Pfam and SMART databases at the SMART server [29], using E. gracilis SdhB-n and SdhB-c as queries. Protein alignments were constructed using MUSCLE v3.6 [30] with default parameters and edited with the BioEdit Sequence Alignment Editor. The editing function was used to remove gaps from the non-homologous euglenozoan protein extensions. However, regions corresponding to likely mTPs were left unedited. In the alignment, shading of a given column reflects a minimum of 60% identity.
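The 60% shading criterion can be expressed as a short sketch. How the alignment editor treats gaps is not stated in the text, so the handling here is an assumption: gaps count toward the number of rows but are never the shaded residue. The function name `shaded_columns` and the toy alignment are hypothetical.

```python
from collections import Counter


def shaded_columns(alignment, threshold=0.60, gap="-"):
    """Return indices of columns in which the most common residue is
    present in at least `threshold` of all rows."""
    if not alignment:
        return []
    width = len(alignment[0])
    if any(len(row) != width for row in alignment):
        raise ValueError("all aligned sequences must have the same length")
    shaded = []
    for col in range(width):
        residues = [row[col].upper() for row in alignment if row[col] != gap]
        if not residues:
            continue  # all-gap column: nothing to shade
        top_count = Counter(residues).most_common(1)[0][1]
        if top_count / len(alignment) >= threshold:
            shaded.append(col)
    return shaded


if __name__ == "__main__":
    toy = ["CPSYW", "CPAYW", "CGT-W", "CPTFW", "SPSYW"]  # invented rows
    print(shaded_columns(toy))
```

Dividing by the total number of rows rather than by the non-gap rows is the stricter choice; switching the denominator to `len(residues)` would shade gap-rich columns more readily.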
List of abbreviations
FAD: flavin adenine dinucleotide (oxidized form)
FADH2: flavin adenine dinucleotide (reduced form)
mTP: mitochondrial targeting peptide
SDH: succinate dehydrogenase (succinate-ubiquinone oxidoreductase)
RMRG was supported by a CGSD from the Natural Sciences and Engineering Research Council (NSERC) and a Predoctoral Scholarship from the Killam Trusts. MWG gratefully acknowledges salary support from the Canada Research Chairs Program as well as operating support from the Canadian Institutes of Health Research (MOP-4124).
1. Yankovskaya V, Horsefield R, Tornroth S, Luna-Chavez C, Miyoshi H, Leger C, Byrne B, Cecchini G, Iwata S: Architecture of succinate dehydrogenase and reactive oxygen species generation. Science. 2003, 299: 700-704. 10.1126/science.1079605.
2. Huang LS, Sun G, Cobessi D, Wang AC, Shen JT, Tung EY, Anderson VE, Berry EA: 3-nitropropionic acid is a suicide inhibitor of mitochondrial respiration that, upon oxidation by complex II, forms a covalent adduct with a catalytic base arginine in the active site of the enzyme. J Biol Chem. 2006, 281: 5965-5972. 10.1074/jbc.M511270200.
3. Sun F, Huo X, Zhai Y, Wang A, Xu J, Su D, Bartlam M, Rao Z: Crystal structure of mitochondrial respiratory membrane protein complex II. Cell. 2005, 121: 1043-1057. 10.1016/j.cell.2005.05.025.
4. Gray MW: Evolution of organellar genomes. Curr Opin Genet Dev. 1999, 9: 678-687. 10.1016/S0959-437X(99)00030-1.
5. Gibbs SL: The chloroplast of Euglena may have evolved from symbiotic green algae. Can J Bot. 1978, 56: 2883-2889. 10.1139/b78-345.
6. Simpson AGB: The identity and composition of the Euglenozoa. Arch Protistenkd. 1997, 148: 318-328.
7. Moreira D, López-García P, Rodríguez-Valera F: New insights into the phylogenetic position of diplonemids: G+C content bias, differences of evolutionary rate and a new environmental sequence. Int J Syst Evol Microbiol. 2001, 51: 2211-2219.
8. Simpson AGB, Roger AJ: Protein phylogenies robustly resolve the deep-level relationships within Euglenozoa. Mol Phylogenet Evol. 2004, 30: 201-212. 10.1016/S1055-7903(03)00177-5.
9. Henze K, Badr A, Wettern M, Cerff R, Martin W: A nuclear gene of eubacterial origin in Euglena gracilis reflects cryptic endosymbioses during protist evolution. Proc Natl Acad Sci USA. 1995, 92: 9122-9126. 10.1073/pnas.92.20.9122.
10. Ahmadinejad N, Dagan T, Martin W: Genome history in the symbiotic hybrid Euglena gracilis. Gene. 2007, 402: 35-39. 10.1016/j.gene.2007.07.023.
11. Simpson AGB, Lukeš J, Roger AJ: The evolutionary history of kinetoplastids and their kinetoplasts. Mol Biol Evol. 2002, 19: 2071-2083.
12. Yasuhira S, Simpson L: Phylogenetic affinity of mitochondria of Euglena gracilis and kinetoplastids using cytochrome oxidase I and hsp60. J Mol Evol. 1997, 44: 341-347. 10.1007/PL00006152.
13. Tessier LH, van der Speck H, Gualberto JM, Grienenberger JM: The cox1 gene from Euglena gracilis: a protist mitochondrial gene without introns and genetic code modifications. Curr Genet. 1997, 31: 208-213. 10.1007/s002940050197.
14. Tessier LH, Keller M, Chan RL, Fournier R, Weil JH, Imbault P: Short leader sequences may be transferred from small RNAs to pre-mature mRNAs by trans-splicing in Euglena. EMBO J. 1991, 10: 2621-2625.
15. Marande W, Burger G: Mitochondrial DNA as a genomic jigsaw puzzle. Science. 2007, 318: 415. 10.1126/science.1148033.
16. Emanuelsson O, Nielsen H, Brunak S, von Heijne G: Predicting subcellular localization of proteins based on their N-terminal amino acid sequence. J Mol Biol. 2000, 300: 1005-1016. 10.1006/jmbi.2000.3903.
17. Claros MG, Vincens P: Computational method to predict mitochondrially imported proteins and their targeting sequences. Eur J Biochem. 1996, 241: 779-786. 10.1111/j.1432-1033.1996.00779.x.
18. Tasker M, Timms M, Hendriks E, Matthews K: Cytochrome oxidase subunit VI of Trypanosoma brucei is imported without a cleaved presequence and is developmentally regulated at both RNA and protein levels. Mol Microbiol. 2001, 39: 272-285. 10.1046/j.1365-2958.2001.02252.x.
19. Pérez-Martínez X, Antaramian A, Vázquez-Acevedo M, Funes S, Tolkunova E, d'Alayer J, Claros MG, Davidson E, King MP, González-Halphen D: Subunit II of cytochrome c oxidase in Chlamydomonad algae is a heterodimer encoded by two independent nuclear genes. J Biol Chem. 2001, 276: 11302-11309. 10.1074/jbc.M010244200.
20. Funes S, Davidson E, Reyes-Prieto A, Magallón S, Herion P, King MP, González-Halphen D: A green algal apicoplast ancestor. Science. 2002, 298: 2155. 10.1126/science.1076003.
21. Waller RF, Keeling PJ: Alveolate and chlorophycean mitochondrial cox2 genes split twice independently. Gene. 2006, 383: 33-37. 10.1016/j.gene.2006.07.003.
22. Cecchini G, Schröder I, Gunsalus RP, Maklashina E: Succinate dehydrogenase and fumarate reductase from Escherichia coli. Biochim Biophys Acta. 2002, 1553: 140-157. 10.1016/S0005-2728(01)00238-9.
23. Cheng VWT, Ma E, Zhao Z, Rothery RA, Weiner JH: The iron-sulfur clusters in Escherichia coli succinate dehydrogenase direct electron flow. J Biol Chem. 2006, 281: 27662-27668. 10.1074/jbc.M604900200.
24. Hudson JM, Heffron K, Kotlyar V, Sher Y, Maklashina E, Cecchini G, Armstrong FA: Electron transfer and catalytic control by the iron-sulfur clusters in a respiratory enzyme, E. coli fumarate reductase. J Am Chem Soc. 2005, 127: 6977-6989. 10.1021/ja043404q.
25. Durnford DG, Gray MW: Analysis of Euglena gracilis plastid-targeted proteins reveals different classes of transit sequences. Eukaryot Cell. 2006, 5: 2079-2091. 10.1128/EC.00222-06.
26. Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ: Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997, 25: 3389-3402. 10.1093/nar/25.17.3389.
27. O'Brien EA, Koski LB, Zhang Y, Yang L, Wang E, Gray MW, Burger G, Lang BF: TBestDB: a taxonomically broad database of expressed sequence tags (ESTs). Nucleic Acids Res. 2007, 35: D445-D451. 10.1093/nar/gkl770.
28. Perez-Bercoff A, Koch J, Burglin TR: LogoBar: bar graph visualization of protein logos with gaps. Bioinformatics. 2006, 22: 112-114. 10.1093/bioinformatics/bti761.
29. Schultz J, Milpetz F, Bork P, Ponting CP: SMART, a simple modular architecture research tool: identification of signaling domains. Proc Natl Acad Sci USA. 1998, 95: 5857-5864. 10.1073/pnas.95.11.5857.
30. Edgar RC: MUSCLE: a multiple sequence alignment method with reduced time and space complexity. BMC Bioinformatics. 2004, 5: 113. 10.1186/1471-2105-5-113.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.