Expressed sequence tags from Atta laevigata and identification of candidate genes for the control of pest leaf-cutting ants

Background Leafcutters are the most derived Neotropical ants in the tribe Attini and are model systems for studying caste formation, division of labor and symbiosis with microorganisms. Some leafcutter species are agricultural pests controlled by chemicals that affect other animals and accumulate in the environment. To provide a genetic basis for the study of leafcutters and for the development of more specific and environmentally friendly methods for controlling pest leafcutters, we generated expressed sequence tag data from Atta laevigata, one of the pest ants with a broad geographic distribution in South America. Results Analysis of the expressed sequence tags allowed us to characterize 2,006 unique sequences in Atta laevigata. Sixteen of these genes had a high number of transcripts and are likely positively selected for a high level of gene expression; they are responsible for three basic biological functions: energy conservation through redox reactions in mitochondria; cytoskeleton and muscle structuring; and regulation of gene expression and metabolism. Based on the leafcutters' lifestyle and on reports of genes involved in key processes in other social insects, we identified 146 sequences that are potential targets for controlling pest leafcutters. The targets are responsible for antixenobiosis, development and longevity, immunity, resistance to pathogens, pheromone function, cell signaling, behavior, polysaccharide metabolism and arginine kinase activity. Conclusion The generation and analysis of expressed sequence tags from Atta laevigata have provided an important genetic basis for future studies on the biology of leaf-cutting ants and may contribute to the development of a more specific and environmentally friendly method for controlling agricultural pest leafcutters.


Background
The tribe Attini comprises over 200 ant species [1] which culture mutualistic fungi as food [2]. The most evolutionarily derived attines are the leaf-cutting ants in the genera Atta and Acromyrmex, which are considered major herbivores in the tropics [3].
However, despite these ecological roles, many leafcutter species are considered agricultural pests which cause severe economic damage to agriculture [15,16]. Some of the characteristics contributing to the pest status of leafcutters are their ability to exploit a great variety of plant species [17], their high population densities [15] and their long-lived queens, which lay eggs constantly for up to 15 years [18].
Atta laevigata is a pest leafcutter distinguished by a very large and shiny head in soldiers, a characteristic which has earned the species the popular name "cabeça de vidro" (meaning glass head). It can be found in Venezuela, Colombia, Guyana, Bolivia, Paraguay and, in Brazil, from the Amazonian Rain Forest in the North to the Paraná state in the South [19]. It cuts leaves from many plantations, such as pine [20], cocoa [21] and eucalyptus [22], as well as from a wide variety of native plants from different biomes such as the Cerrado or the Rain Forest, where its intense herbivory challenges the reforestation of degraded areas [23,24].
The control of pest leafcutters in small properties can be done by biological methods [25] or even utilizing the waste material generated by the ants [26], but in extensive monocultures this control utilizes massive amounts of broad spectrum insecticides which are toxic to other animals and persist in the environment [27]. Thus, the development of a more specific and environmentally friendly process for controlling the leafcutters is required [28].
Genomic studies can contribute to this goal by characterizing genes involved in key functions for the leafcutters, such as longevity, fertility and the plasticity to exploit different vegetation types, thereby revealing more specific targets for ant control. Genomics is also a valuable resource for ecological and evolutionary studies of leaf-cutting ants.
In the present investigation, we carried out a genomic study of the pest leafcutter Atta laevigata by generating 3,203 expressed sequence tags (ESTs), which characterized 2,006 unique sequences (US). We postulate important differences in expression level among the transcripts and identify 146 potential target sequences for the control of pest leaf-cutting ants.

EST generation
Two grams of soldiers and major workers of Atta laevigata were macerated under liquid nitrogen, total RNA was extracted with the TRIzol method (Invitrogen, UK) and mRNA was purified using the PolyATract System (Promega, USA). The CloneMiner cDNA Library Construction Kit (Invitrogen, UK) and 2 μg of mRNA were used to synthesize first and second cDNA strands, which were then size-fractionated in a 1.0 ml Sephacryl S-500 resin column, inserted into a pDONR222 plasmid (Invitrogen, UK) and transformed into DH10B Escherichia coli. Cells were plated onto solid Circle Grow medium (QBIO-GENE, Canada) containing 25 μg ml⁻¹ kanamycin and individually picked into a permanent culture plate with 96 wells. After 22 hours of growth in liquid Circle Grow medium (25 mg ml⁻¹ kanamycin), plasmid DNA was purified by alkaline lysis [29] and sequenced in reactions containing 300 ng of template DNA, 5 pmol of M13 forward primer and the DYEnamic ET Dye Terminator kit reagent (GE Healthcare, UK), according to the manufacturer's protocol. The amplified products were resolved in a MegaBACE 1000 automated DNA sequencer (GE Healthcare, UK).

EST analysis
The pipeline generation system EGene [30] was used to clean the ESTs and assemble them into contigs and singlets. Sequences were filtered by quality using phred values >20 and a minimum identity percent in window of 90%. Filtered sequences were then masked against vector and primer sequences, selected by size (>100 bp) and assembled using CAP3 [31] with an overlap percent identity cutoff (p) of 90 and a minimum overlap length cutoff (o) of 50.
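The filtering and assembly settings described above can be sketched in Python. This is only an illustration under stated assumptions, not EGene's implementation: the function names `passes_filters` and `cap3_command` and the file name are hypothetical, and the quality rule is simplified to a whole-read check (at least 90% of bases with phred >20) instead of a windowed one. The `-p` and `-o` flags are CAP3's overlap percent identity and overlap length cutoffs.

```python
def passes_filters(phred_scores, min_phred=20, min_fraction=0.90, min_length=100):
    """Return True if a read meets the simplified quality and size criteria.

    Assumption: the quality rule is applied to the whole read, not per window.
    """
    if len(phred_scores) <= min_length:      # size selection: >100 bp
        return False
    good = sum(1 for q in phred_scores if q > min_phred)
    return good / len(phred_scores) >= min_fraction


def cap3_command(fasta_path, identity=90, overlap=50):
    """Build (but do not run) a CAP3 command line with the reported cutoffs."""
    return ["cap3", fasta_path, "-p", str(identity), "-o", str(overlap)]


# Hypothetical reads: 150 high-quality bases vs. 150 bases with a poor tail.
good_read = [30] * 150
poor_read = [30] * 100 + [10] * 50
print(passes_filters(good_read))             # True
print(passes_filters(poor_read))             # False (only 67% of bases pass)
print(cap3_command("filtered_ests.fasta"))   # hypothetical file name
```

Reads that fail either test would be discarded before the masked sequences are passed to the assembler.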
The program Blast2GO (B2G) [32] was used to associate every Atta laevigata singlet and contig with blastx [33] results (nr protein database; E-value ≤ 10⁻⁵), Gene Ontology (GO) terms [34], InterProScan classification [35,36] including signal peptide [37] and transmembrane region predictions, Kyoto Encyclopedia of Genes and Genomes (KEGG) maps (http://www.genome.jp/kegg/), and Enzyme Commission (EC) numbers (IUBMB). The results generated by B2G and those obtained from the Conserved Domain Database (CDD) were manually inspected in order to group contigs and singlets into functional categories and to infer transcript abundance in Atta laevigata.
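The E-value threshold used for the blastx results can be illustrated with a short Python sketch. It assumes hits are available in the standard 12-column BLAST tabular layout (query, subject, percent identity, alignment length, mismatches, gap opens, query start, query end, subject start, subject end, E-value, bit score), which is not necessarily how Blast2GO stores them; the function name `best_hits` and the example hits are hypothetical.

```python
import csv
import io

EVALUE_CUTOFF = 1e-5  # threshold reported in the text


def best_hits(tabular_text, cutoff=EVALUE_CUTOFF):
    """Map each query to its lowest-E-value hit at or below the cutoff."""
    best = {}
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row:
            continue
        query, subject, evalue = row[0], row[1], float(row[10])
        if evalue > cutoff:
            continue                          # not a significant match
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)
    return best


# Hypothetical hits: two for contig1, one non-significant hit for singlet7.
example = "\n".join([
    "contig1\tprotA\t85.0\t120\t18\t0\t1\t360\t5\t124\t1e-40\t160",
    "contig1\tprotB\t60.0\t90\t36\t0\t10\t280\t3\t92\t3e-12\t70",
    "singlet7\tprotC\t35.0\t40\t26\t0\t1\t120\t8\t47\t0.002\t30",
])
print(best_hits(example))
```

In this toy input only `contig1` keeps a best hit; `singlet7` would be counted among the sequences without a significant match.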

EST generation and assembly
The 5' ends of 4,704 clones from the Atta laevigata cDNA library were sequenced, resulting in 4,482 reads. We were able to select 3,203 of these reads, which were of high quality and had an average length of 418 bp (Table 1; [GenBank:JG659458 to JG662660, dbEST ID:73713535 to 73716737, Genome Project ID:63563]).
The high-quality sequences were assembled in 340 contigs (619 bp average) and 1,666 singlets which we assume to represent 2,006 unique sequences (US). It is likely that some of the US came from the same gene due to non-overlapping ESTs from a single gene or products of alternative splicing [38].

Comparative analysis of Atta laevigata genes
Using blastx we found that 1,165 (58%) of the characterized Atta laevigata US matched significantly (E-value ≤ 10⁻⁵) with GenBank sequences in the non-redundant (nr) database (Figure 1A). Most of the best hits (Figure 1B) came from the genomes of the hymenopterans Apis mellifera [39] (677) and Nasonia vitripennis [40] (334), but only 10 hits came from the ants Solenopsis invicta, Lasius niger or Myrmica rubra because ant sequences are relatively poorly represented in the nr database. Using the B2G program, we assigned GO terms (Figure 2) to 865 Atta laevigata US, EC numbers to 250, predicted signal peptides to 229 and domain information to 66, and also retrieved KEGG information. This retrieved information and the data obtained from CDD were manually inspected to annotate the Atta laevigata US in 27 functional categories (Figure 3).
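The annotation counts reported above can be put on a common scale with a few lines of Python. The sketch below only re-derives proportions from the numbers given in the text; the dictionary labels and the helper name `coverage` are illustrative.

```python
TOTAL_US = 2006  # unique sequences (US) characterized in the library

# Counts of US with each kind of annotation, as reported in the text.
annotated = {
    "significant blastx hit": 1165,
    "GO terms": 865,
    "EC numbers": 250,
    "predicted signal peptide": 229,
    "domain information": 66,
}


def coverage(count, total=TOTAL_US):
    """Percentage of the unique sequences carrying a given annotation."""
    return 100 * count / total


for label, count in annotated.items():
    print(f"{label}: {count} US ({coverage(count):.1f}%)")

# Share of the significant best hits that came from the two hymenopteran genomes.
apis_hits, nasonia_hits = 677, 334
share = 100 * (apis_hits + nasonia_hits) / annotated["significant blastx hit"]
print(f"Apis mellifera + Nasonia vitripennis: {share:.1f}% of best hits")
```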
The number of US per category gives an idea of the diversity of genes involved in each cell function. This diversity was high among transcripts related to signaling pathways, membrane or regulation of gene expression, but very low among transcripts related to secondary metabolism, cuticular and peritrophic membranes or homeobox.

Variation of the number of reads per contig
The number of reads per contig varied from two to 123, with 73% of the contigs containing two or three reads and only 7% containing 10 or more reads (Figure 4). Thus a few contigs concentrated many reads: 1.1% of the US (23 out of 2,006) contained 18.8% of the reads (603 out of 3,203). Dividing the number of reads (3,203) by the number of US (2,006) gives an average of 1.6 reads per US. Some of the contigs exceeding this average are shown in Table 2. If the number of reads per contig reflects gene expression level, it can be assumed that Atta laevigata contains a set of 23 highly expressed genes. Sixteen of these genes are involved in three major cellular processes (Table 2): (i) ATP synthesis coupled to redox reactions in mitochondria (273 reads); (ii) muscle or cytoskeleton structure (135 reads); (iii) transcription regulatory processes through homeobox or signaling proteins (95 reads). Gene expression is energetically expensive, and the protein synthesis that follows it during translation is even more expensive. An increase in the number of transcripts of a given gene, even to a very small extent, is not a neutral process but is strongly constrained by evolution [41] and is expected to occur only if positively selected. Therefore, our results suggest that high expression levels have been positively selected in Atta laevigata for genes responsible for energy conservation, cell structure and regulation.
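The read-distribution figures in this paragraph can be re-derived from the assembly counts. The short Python sketch below does so using only numbers stated in the text; the variable names are illustrative. It also computes the mean number of reads per contig proper, which follows from the fact that each singlet holds exactly one read and is a derived value, not one reported by the authors.

```python
TOTAL_READS = 3203
CONTIGS = 340
SINGLETS = 1666

unique_sequences = CONTIGS + SINGLETS             # contigs plus singlets
reads_per_us = TOTAL_READS / unique_sequences     # average over all US

# The 23 most abundant sequences and the reads they contain.
top_us, top_reads = 23, 603
top_us_share = 100 * top_us / unique_sequences
top_read_share = 100 * top_reads / TOTAL_READS

# Reads in the three major functional groups of highly expressed genes.
category_reads = {
    "mitochondrial ATP synthesis": 273,
    "muscle or cytoskeleton": 135,
    "transcription regulation or signaling": 95,
}
reads_in_categories = sum(category_reads.values())

# Each singlet is one read, so the remaining reads belong to contigs.
reads_per_contig = (TOTAL_READS - SINGLETS) / CONTIGS

print(f"unique sequences: {unique_sequences}")
print(f"reads per US: {reads_per_us:.1f}")
print(f"top {top_us} US: {top_us_share:.1f}% of US, {top_read_share:.1f}% of reads")
print(f"reads in the three categories: {reads_in_categories}")
print(f"reads per contig (singlets excluded): {reads_per_contig:.1f}")
```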

Identification of candidate genes for the control of pest leafcutters
Inhibition of the translation of genes which play essential functions in insects by feeding these insects with dsRNA [42] or using transgenic plants [43] seems a promising procedure for the control of agricultural pests [44]. One of the advantages of this procedure is that it targets mRNA molecules which may be species-specific.
In order to control pest leafcutters by inhibiting gene translation, one needs to identify and sequence target candidate genes. Our library was found to contain 146 US which represent potential target genes for the control of leafcutters, because these US are likely playing essential functions in Atta laevigata (Table 3). These target genes are related to antixenobiosis (including insecticide detoxification), queen longevity, larval development, insect immunity or resistance to pathogens, communication necessary to social tasks, polysaccharide metabolism or insecticide action.
The function and potential utilization of these 146 US as targets for the control of pest leafcutters are considered below.
The enzyme glutathione S-transferase catalyzes the initial conjugation of insecticides with glutathione. Both the enzyme and glutathione are very abundant in cells and essential for the detoxification of electrophiles that cause cytotoxic or genotoxic damage [47]. The enzyme may play a role in insecticide resistance [48], herbicide resistance in plants [49], resistance of cancer cells to chemotherapeutic agents [50], and antibiotic resistance in bacteria [51]. In our study we found 25 US in the cytochrome P450 family and 12 US probably related to the detoxification of xenobiotics, including glutathione S-transferase, glutamate cysteine ligase and aldehyde oxidase (Table 3). All these genes may be important targets for the control of leafcutters.

Development and longevity genes
Of the 18 US we found (Table 3) involved in development, growth and differentiation, four are putatively related to nervous system development, two of which contained the immunoglobulin domain: one wrapper and one lachesin homolog. The protein lachesin has a role in early neuronal differentiation as well as in axon outgrowth, cell recognition events, cell adhesion or intercellular communication [52]. The other 14 US in this category (Table 3) may be involved in different phases of insect development, such as egg or larva, or in the development of tissues or organs such as mesoderm, spermathecae and antennae.
Queen and worker ants develop from identical eggs and are thus genetically identical, but the caste system produces a long-lived queen and a short-lived worker with up to ten-fold lifespan differences [3]. Harman [53] stated that lifespan is determined by the rate at which oxidative damage occurs due to the accumulation of by-products of oxidative energy metabolism. Harman's theory implies that long-lived organisms produce fewer reactive oxygen species or have increased antioxidant production [54], although the degree of lifespan extension can be sex- or genotype-specific [55] and sometimes poorly correlated with antioxidant levels [56]. We found 13 US likely involved in organism lifespan through protection from oxidative stress (Table 3), which are directly involved in the degradation of superoxide radicals and hydrogen peroxide or in the neutralization of reactive oxygen species, such as the putative Cu/Zn superoxide dismutase, catalase, Rpd3 histone deacetylase, peroxiredoxin 5, thioredoxin reductase and phospholipid hydroperoxide glutathione peroxidase.
Our library contained four US putatively coding for the juvenile hormone binding protein (JHBP) domain and two for putative proteins that participate in JHBP biosynthesis (Table 3). Juvenile hormones (JH) regulate a great number of physiological processes in insect development. Larvae require JH to maintain the larval state, and JH must be absent in the last larval instar for metamorphosis to start [57,58]. JH are also necessary for reproduction in adults [59]. The characterization of genes related to development and longevity in Atta laevigata allows future investigation of the effect of the expression of these genes on queen maturation and lifespan, which are key features associated with the pest status of leafcutters.

Genes associated with immunity and resistance to pathogens
Pathogens, parasites or injury trigger innate immune responses in insects that are in essence similar and comprise both cellular and humoral components. Cellular mechanisms include phagocytosis by special blood cells and encapsulation of large invaders [60]. Humoral responses involve proteolytic cascades leading to melanization [60] and the production of antimicrobial peptides initiated via two distinct signaling pathways, Toll and Immune Deficiency, which depend on pathogen recognition [61]. There are two types of recognition proteins: peptidoglycan recognition proteins and Gram-negative bacteria-binding proteins. We found 37 US that may be involved in immunity or pathogen resistance (Table 3), including the putative toll like interacting protein, prophenoloxidase subunit 3 and easter CG4920-PA, the last two with a role in melanin synthesis. We also found sequences putatively coding for the antimicrobial peptides hymenoptaecin and defensin 2, and for the peptidoglycan recognition protein precursor, as well as transferrin and transferrin 2, which participate in the response to microbial infection by sequestering iron, an essential nutrient for some pathogens [62].
Leaf-cutting ants and their mutualistic fungus are constantly challenged by pathogenic microorganisms [63] which ultimately regulate host population [64]. Therefore, the 37 US we found probably involved in resistance to microbial pathogens are important markers for understanding antimicrobial mechanisms in leafcutters and putative targets for controlling pest leafcutters.

Communication genes
Communication plays a central part in social insects: it is necessary for the division of labor and task partitioning, which are essential for harvesting food, nursing the brood and sexual reproduction [65]. Thus, targeting genes involved in communication seems a promising strategy for the control of leaf-cutting ants.
Our library contained 11 US probably related to communication, one of them putatively coding for the pheromone binding protein (PBP), which is important for chemical recognition of insect conspecifics because it transports odorant molecules from cuticular pores to receptors [66]. In Solenopsis invicta, the gene Gp-9, which is a PBP homolog, seems to have a role in the ability of workers to discriminate queens and regulate their numbers [67]. Another important communication gene we found putatively codes for a fatty acid binding protein involved in the transport of communication molecules in insects [68].
Four of the communication US we found were in the lipocalin family which is composed of secreted proteins binding small hydrophobic molecules or forming macromolecular complexes associated with cell surface receptors important for transport, pheromone signaling and olfaction [69]. These sequences putatively code for the odorant binding proteins, apolipophorin III or PP238.
We also found three homologs to the chemosensory protein from Nasonia vitripennis, chemosensory protein 2 from Apis mellifera and chemosensory protein 5 from Bombyx mori. Chemosensory proteins may be specifically expressed in sensory organs which are important in ant behavior [70] and participate in cellular processes that require lipophilic compounds [71].
The putative genes for gustatory receptor and dihydroorotate dehydrogenase, which are involved in odorant reception in insects, were also found.

Signaling genes
Tetraspanin is an important signaling membrane protein expressed in antennae of moths and honeybees [72], being a molecular facilitator of signal transduction and cell adhesion [73]. In our library, six US putatively coding for tetraspanin were present.
We also found two US corresponding to the nicotinic acetylcholine receptor, which plays a role in visual processing, learning and memory, olfactory signal processing, and mechanosensory antennal input in the honeybee [74]. These receptors are targets of the neonicotinoid insecticides used against piercing-sucking pests [75].

Behavior genes
Eleven Atta laevigata US in this category (Table 3) were homologous to genes involved in behavior, learning, memory and courtship in Apis mellifera, Drosophila melanogaster or Solenopsis invicta. Some of the genes controlling social behavior and complex tasks or abilities may be specific to Hymenoptera [38] and thus may be specific targets for the control of pest leafcutters.

Polysaccharide metabolism genes
Food sources for worker leafcutters rely mostly on the plant polysaccharides cellulose, xylan and starch, which are degraded by extracellular enzymes secreted by the mutualistic fungus [76], generating mono- and disaccharides readily assimilated by the ants [77]. Degradation of cellulose by the mutualistic fungus generates cellobiose [10] and degradation of starch generates maltose, both disaccharides being consumed by leafcutters [77] through the production of beta- and alpha-glucosidase, respectively. In addition, workers assimilate starch to a certain extent [77], which demands production of alpha-amylase.
Our library contained 59 US (Figure 3) corresponding to genes related to carbohydrate metabolism, including alpha-glucosidase-like, beta-glucosidase and alpha-amylase (Table 3), which are promising targets for leafcutter control.

Arginine kinase gene
Arginine kinase catalyses the reversible transfer of phosphate between ATP and guanidine substrates and acts in cells that need readily available energy sources [78]. This enzyme activity in cockroaches was found to be inhibited by nitrates and borates [79] which were then used as insecticides. Our library contained two US which are putative arginine kinase genes (Table 3) that may also be important for the control of leafcutters.

Future perspectives
The 146 US proposed here as targets for the control of leaf-cutting ants can be used for primer design in order to study gene expression by real-time PCR. For instance, over-expression of the sequences proposed here as related to immunity or antixenobiosis in A. laevigata challenged by pathogens or insecticides should validate the protective role of the respective gene products in leafcutters exposed to adverse conditions, helping us to understand the molecular basis of pest ant resistance to hazardous chemicals. A future scenario can be envisaged in which inhibition of gene expression, gene translation or the related protein activities would make pest leafcutters more susceptible to pathogens, insecticides or anti-herbivory chemicals produced by crops. In summary, inhibition of genes or gene products related to the processes described in Table 3 may specifically hamper the colonization of crop areas by pest leafcutters.

Conclusion
Leaf-cutting ants are the major Neotropical herbivores, and many of them are important agricultural pests. We characterized 2,006 unique sequences (US) in Atta laevigata, one of the most widely distributed pest leaf-cutting ants in South America, and found that 16 of the genes are likely under positive selection for high expression and are responsible for energy conservation, cell structuring or regulation. Another set of 146 US, which play an important part in antixenobiosis, longevity, immunity, development, communication, nutrition or insecticide action, were identified as putative targets for the control of pest leafcutters. Our findings provide a genetic background for basic and applied studies on these ants.