TIR-NBS-LRR genes are rare in monocots: evidence from diverse monocot orders

Background: Plant resistance (R) gene products recognize pathogen effector molecules. Many R genes code for proteins containing nucleotide binding site (NBS) and C-terminal leucine-rich repeat (LRR) domains. NBS-LRR proteins can be divided into two groups, TIR-NBS-LRR and non-TIR-NBS-LRR, based on the structure of the N-terminal domain. Although both classes are clearly present in gymnosperms and eudicots, only non-TIR sequences have been found consistently in monocots. Since most studies in monocots have been limited to agriculturally important grasses, it is difficult to draw conclusions. The purpose of our study was to look for evidence of these sequences in additional monocot orders.

Findings: Using degenerate PCR, we amplified NBS sequences from four monocot species (C. blanda, D. marginata, S. trifasciata, and Spathiphyllum sp.), a gymnosperm (C. revoluta) and a eudicot (C. canephora). We successfully amplified TIR-NBS-LRR sequences from dicot and gymnosperm DNA, but not from monocot DNA. Using databases, we obtained NBS sequences from additional monocots, magnoliids and basal angiosperms. TIR-type sequences were not present in monocot or magnoliid sequences, but were present in the basal angiosperms. Phylogenetic analysis supported a single TIR clade and multiple non-TIR clades.

Conclusion: We were unable to find monocot TIR-NBS-LRR sequences by PCR amplification or database searches. In contrast to previous studies, our results represent five monocot orders (Poales, Zingiberales, Arecales, Asparagales, and Alismatales). Our results establish the presence of TIR-NBS-LRR sequences in basal angiosperms and suggest that although these sequences were present in early land plants, they have been reduced significantly in monocots and magnoliids.


Background
Plants recognize pathogens using both non-specific and specific mechanisms. Pattern recognition receptors (PRRs) mediate non-specific recognition by interacting with microbe- or pathogen-associated molecular patterns (MAMPs or PAMPs), while the products of plant resistance (R) genes recognize specific pathogen molecules [1,2]. Disease resistance is the only known function for R genes, which appear to have a gene-for-gene relationship with pathogen avirulence (avr) genes [3].
Many R genes code for proteins containing nucleotide binding site (NBS) and C-terminal leucine-rich repeat (LRR) domains. The NBS domain of plant R genes (also called the NB-ARC domain) shares homology with human APAF-1 and C. elegans CED-4, proteins involved in regulating cell death [4]. NBS-LRR proteins can be divided into two groups, TIR-NBS-LRR and non-TIR-NBS-LRR, based on the structure of the N-terminal domain (Figure 1) [5,6].
The NBS domain from R genes is relatively conserved and contains type-specific motifs (Table 1). The final residue of the kinase-2 motif is especially useful for classifying a sequence as TIR or non-TIR [7]. TIR-type NBS sequences are relatively homogeneous and form a single clade, while non-TIR sequences form multiple clades that likely originated before the split between angiosperms and gymnosperms [8,9].
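As an illustration of how the kinase-2 residue can be used for classification, the following Python sketch labels a translated NBS fragment by the residue that follows a simplified kinase-2 core. The tryptophan (non-TIR) and aspartic acid (TIR) assignments follow the primer-design description in this paper. The regular expression `KINASE2_CORE` and the function name `classify_nbs` are hypothetical stand-ins chosen for illustration; they are not the published consensus motifs of Table 1.

```python
import re

# Hypothetical, simplified kinase-2 core: three hydrophobic residues, then "DDV",
# then the diagnostic residue. This is an assumption, not the Table 1 consensus.
KINASE2_CORE = re.compile(r"[LIVMF]{3}DDV(.)")


def classify_nbs(protein_seq: str) -> str:
    """Label a translated NBS fragment from the residue after the kinase-2 core."""
    match = KINASE2_CORE.search(protein_seq.upper())
    if match is None:
        return "unclassified"          # diagnostic region not found
    residue = match.group(1)
    if residue == "D":                 # aspartic acid: TIR-type
        return "TIR"
    if residue == "W":                 # tryptophan: non-TIR-type
        return "non-TIR"
    return f"ambiguous ({residue})"    # any other residue is left unresolved
```

A residue other than D or W, such as a glutamic acid in the diagnostic position, is reported as ambiguous rather than forced into either class, so such sequences can be checked against downstream motifs by hand.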
Studies of NBS-LRR sequences in monocots have been limited to agriculturally important species in the grass family (Poaceae). Recent studies from Zingiber and Musa species (order Zingiberales) reported only non-TIR type sequences [15][16][17][18]. Since there are ten orders of monocots [19], we are limited in our ability to make generalizations based on information from only two orders. To further investigate the presence of TIR-NBS-LRR sequences in monocots, we combined PCR and bioinformatics to obtain data from additional monocots as well as magnoliids and basal angiosperms (Figure 2).

Results
We amplified sequences from four monocot species representing three monocot orders (Figure 2): Dracaena marginata and Sansevieria trifasciata (Asparagales), Spathiphyllum sp. (Alismatales), and Carex blanda (Poales). For comparison, we included a gymnosperm (Cycas revoluta) and a dicot (Coffea canephora). We obtained sequences from a total of 60 PCR products that resulted in 24 unique NBS sequences (Table 2). We found non-TIR type sequences in all plants tested except the cycad, but only two unique TIR-type NBS sequences, one each from C. revoluta and C. canephora.
Using Pfam [20] and GenBank [21], we retrieved 17 monocot sequences (ten from Musa acuminata, four from Elaeis guineensis, and three from Zingiber species), all of which we classified as non-TIR-NBS-LRR sequences based on the kinase-2 motif. Although we did not find any new TIR-type sequences from monocots, the search confirmed the similarity of the Triticum-Thinopyrum sequences [14].
In addition to monocot sequences, we retrieved two sequences from Persea americana (magnoliid) and seven sequences from basal angiosperms (five from Nuphar advena and two from Amborella trichopoda). Based on the kinase-2 motif, both P. americana sequences were non-TIR and all five N. advena sequences were TIR-type sequences. The A. trichopoda sequences have a glutamic acid in the diagnostic position, but downstream motifs similar to TIR-type NBS sequences (data not shown). For comparison, we also retrieved 37 Pinus (gymnosperm) and six Physcomitrella patens (bryophyte) sequences.
We eliminated redundant sequences within a species (>70% identity), resulting in an analysis of 53 plant sequences (Table 3 and Additional Files 1, 2 & 3) and human APAF-1 as an outgroup sequence. As much of the NBS domain as was available for each sequence was used for phylogenetic analysis using parsimony criteria (Figure 3). Based on the scaffold tree (see Methods), we compared our clades with those previously reported [8]. All fifteen sequences that we identified as TIR-type NBS sequences based on consensus motifs formed a single clade that was well-supported by bootstrap analysis (85%). The non-TIR-type NBS sequences formed several clades, but many were not well-supported. The well-supported non-TIR clades correspond to non-TIR clades 3 and 4 in Cannon's analysis [8], while non-TIR clades 1 and 2 are more ambiguous.

Figure 1 Two types of plant NBS-LRR proteins. The two classes of NBS-LRR protein are differentiated by the N-terminal domain. TIR-NBS-LRR proteins have a Toll-interleukin-like receptor (TIR) domain, based on homology to the Drosophila Toll and mammalian Interleukin-1 (IL-1) receptors. The N-terminal region of non-TIR-NBS-LRR proteins is less defined, but often contains a coiled-coil (CC) domain. In R genes, the NBS domain plays a role in intramolecular interactions with the LRR and N-terminal domains [28]. The N-terminal domain influences the signaling pathway that will be activated upon effector recognition [29], and may also be involved in pathogen recognition and interactions with targets of pathogen effectors [30].
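The within-species redundancy filter described above (>70% identity) could be sketched in Python as follows. This is only an illustration under stated assumptions: the text does not say how identity was computed or in what order sequences were compared, so the gap handling in `percent_identity` and the greedy, first-come order in `drop_redundant` are assumptions, and both function names are hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Fraction of identical residues over aligned columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must come from the same alignment")
    columns = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not columns:
        return 0.0
    return sum(x == y for x, y in columns) / len(columns)


def drop_redundant(records, threshold=0.70):
    """Keep a sequence unless it exceeds the identity threshold with an
    already-kept sequence from the same species.

    records: list of (species, name, aligned_sequence) tuples.
    """
    kept = []
    for species, name, seq in records:
        redundant = any(
            kept_species == species and percent_identity(seq, kept_seq) > threshold
            for kept_species, _, kept_seq in kept
        )
        if not redundant:
            kept.append((species, name, seq))
    return kept
```

Because the comparison is restricted to sequences from the same species, near-identical sequences from different species are all retained.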

Discussion
Previous studies of plant NBS-LRR sequences have suggested that only non-TIR-NBS-LRR sequences are present in monocots [7][8][9][13], and the sequences from the Triticum-Thinopyrum addition line [14] have not been mentioned in later studies [17,18,22,23]. A study in Agrostis species [22] reported two TIR-NBS-LRR sequences (GenBank: EE284250, EE284257). However, when these cDNA sequences are translated, they do not contain an open reading frame consistent with an NBS domain (data not shown), so it is unclear whether these represent monocot TIR-NBS-LRR genes.
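The kind of check described above, translating a cDNA sequence and looking for an uninterrupted reading frame, can be sketched in Python as follows. This is a minimal illustration, not the procedure actually used for the Agrostis sequences: it assumes the standard genetic code, considers only the three forward frames (that is, it assumes the sequence is in the sense orientation), and the function names `translate` and `longest_open_stretch` are hypothetical.

```python
from itertools import product

# Standard genetic code, built in TCAG order for each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str, frame: int = 0) -> str:
    """Translate one forward frame; '*' marks a stop, 'X' an unrecognized codon."""
    dna = dna.upper().replace("U", "T")
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def longest_open_stretch(dna: str) -> str:
    """Longest stop-free peptide found in any of the three forward frames."""
    best = ""
    for frame in range(3):
        for piece in translate(dna, frame).split("*"):
            if len(piece) > len(best):
                best = piece
    return best
```

The longest stop-free peptide can then be compared against the NBS motifs; a sequence whose longest stretch is too short to span the domain would be flagged for closer inspection.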

Our PCR strategy amplified TIR-NBS-LRR sequences from dicot and gymnosperm DNA, but not from monocot DNA
In spite of attempts to bias amplification and cloning toward TIR-type NBS sequences, we did not find TIR-NBS-LRR sequences in any of the monocots we tested, although we easily cloned and sequenced TIR-type sequences from a gymnosperm, a eudicot, and the Arabidopsis control reactions. Although we expected to find the TIR class in dicots, a previous study in coffee did not report any TIR-type sequences [24].
Our results support the hypothesis that TIR-type NBS sequences are rare in monocots. We used diverse monocot taxa, including a species closely related to grasses (C. blanda) and a species from a basal monocot order (Spathiphyllum sp.). As with any PCR study, we cannot eliminate the possibility that TIR-NBS-LRR sequences in monocots are too divergent for our primers to amplify. Species-specific amplification has been reported [15] and more comparative work is needed to confirm that there are definitive consensus sequences for these motifs that are well-conserved across diverse taxa.

Table 1 Consensus motifs are those reported by Meyers [7]. The final position of the kinase-2 domain that is used for classification is bolded and underlined.
Figure 2 Taxa included in this study. The tree shows the ten orders and one family that form the monocots [19]. The broad relationships between the monocots and other land plants are shown. Groups marked with an asterisk (*) show where TIR-type NBS sequences have been confirmed. The status of TIR-type NBS sequences in Poales is unclear (*?) since these sequences are generally considered absent from Poales, but have been found in one study [14]. Monocot orders in green correspond to NBS sequences obtained in this study by degenerate PCR while those in blue show where sequences in this study were obtained from databases. A plus (+) or minus (-) shows whether TIR-type NBS sequences were found or not found in this study.

Database searches and phylogenetic analysis show that TIR-type NBS sequences are present in basal angiosperms, but are rare in monocots and magnoliids
Using Pfam and GenBank, we obtained NBS sequences from monocots, a magnoliid, basal angiosperms, gymnosperms, and a bryophyte. The pine and moss sequences had been classified previously [10][11][12] and provided diverse lineages for comparison to the predominantly monocot and dicot sequences in our study. Based on the kinase-2 motif, TIR-type sequences were absent from monocots and magnoliids (with the exception of the reported Triticum-Thinopyrum sequences), but were present in basal angiosperms, gymnosperms, and bryophytes (Figure 2).
Our phylogenetic analysis (Figure 3) was consistent with previous analyses that showed a single TIR clade and multiple non-TIR clades [8,9]. The full NBS domain was not available for some sequences used in the analysis. We expect that the phylogenetic relationships will be clarified as more sequence becomes available. Our clear non-TIR clades corresponded to N3 and N4 (Cannon), with N1 and N2 split into several poorly-supported clades. Cannon's N4 clade did not include monocots, while our analysis placed a Z. officinale sequence in this clade (Figure 3). Cannon reported that N1.2 might be monocot specific, but our corresponding clade was not well-supported. Based on the current analysis, N3.2 may be monocot specific.
We expected to find both dicots and gymnosperms represented across TIR and non-TIR clades, but gymnosperm sequences were only found in the TIR and N4 clades. Both magnoliid sequences were non-TIR type sequences, and all basal angiosperm sequences were in the TIR clade (Figure 3). As more basal angiosperm sequences become available, we expect to also find non-TIR-type NBS sequences. Our results suggest that TIR-type sequences are rare in both magnoliids and monocots.

Table 2 For each species tested, the table shows the number of fragments successfully cloned and sequenced for each type of primer set, the number of these fragments that produced TIR and non-TIR sequences, and the number of unique NBS sequences found. Based on previous work, we expected PCR products of approximately 700-900 base pairs [8,9]. In general, we cloned fragments of approximately 600-1000 bp, but we also cloned some fragments as small as 300 bp and as large as 1.5 kb to allow for the possibility that the NBS domain of TIR-type sequences in monocots differs significantly from those observed previously. At least five fragments smaller than expected and five larger than expected were cloned and sequenced, none of which contained identifiable NBS sequence. We cloned a total of 62 fragments and successfully obtained 105 sequences from 60 of those fragments. The BLASTP algorithm was used to compare the translations to the GenBank non-redundant database. A conserved domain search identified 30 sequences from 19 fragments that showed homology to an NB-ARC domain. We excluded six sequences that did not contain an open reading frame, were redundant, or were fragments identical to longer sequences, resulting in a total of 24 unique sequences.
* A. thaliana was used only to confirm that TIR-specific primers would amplify TIR-type NBS sequences. No non-TIR fragments were cloned from A. thaliana. The table only shows A. thaliana sequences obtained that included an open reading frame from 5' to 3' primer. Additional sequences obtained that included introns were not included in the table.

Table 3 Number of sequences from each plant species used in phylogenetic analysis. The number from each species used in the Pfam seed sequence is shown for comparison.
Amborella trichopoda 1
Zingiber officinale 2
Total 53 10

Figure 3 Phylogenetic tree. We performed a phylogenetic analysis of representative NBS sequences using parsimony criteria (heuristic searches, parsimony default parameters with 100 random sequence additions). The species of each sequence is shown with a letter designation (if more than one sequence from the species was used) and whether sequence analysis shows TIR (TIR+) or non-TIR (TIR-) sequence motifs. Monocot sequences are shown in red, eudicot sequences are shown in purple, magnoliid sequences are shown in blue, basal angiosperm sequences are shown in orange, gymnosperm sequences are shown in green, the bryophyte sequence is shown in brown, and the outgroup human sequence is shown in black. Bars on the right show a classification of NBS sequences modified from groups reported previously [8]. Numbers shown are from bootstrap analysis (1000 replicates) using parsimony criteria. Only values over 70 are shown.
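The parsimony criterion used for the phylogenetic tree scores a candidate tree by the minimum number of character changes it requires. A minimal Python sketch of that scoring for a fixed, rooted binary tree, using Fitch's algorithm, is given below. It is not the heuristic tree search performed in PAUP*, and it treats gaps as ordinary characters; the nested-tuple tree encoding and the names `fitch_changes` and `parsimony_length` are assumptions made for illustration.

```python
def fitch_changes(tree, states):
    """Minimum number of state changes for one alignment column (Fitch's algorithm).

    tree:   nested 2-tuples of leaf names, e.g. (("a", "b"), ("c", "d"))
    states: dict mapping each leaf name to its character in this column
    """
    changes = 0

    def visit(node):
        nonlocal changes
        if isinstance(node, str):          # leaf: its observed state
            return {states[node]}
        left, right = (visit(child) for child in node)
        shared = left & right
        if shared:                         # children agree: no change needed here
            return shared
        changes += 1                       # children disagree: one change on this node
        return left | right

    visit(tree)
    return changes


def parsimony_length(tree, alignment):
    """Total tree length: Fitch changes summed over all alignment columns."""
    n_columns = len(next(iter(alignment.values())))
    return sum(
        fitch_changes(tree, {name: seq[i] for name, seq in alignment.items()})
        for i in range(n_columns)
    )
```

Under this criterion the tree with the smaller total length is preferred, so comparing `parsimony_length` for two groupings of the same sequences shows which grouping the data favor.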

TIR-NBS-LRR vs. the TIR domain
The rarity of TIR-NBS-LRR sequences in monocots does not necessarily reflect on the abundance of the TIR domain itself. Two similar protein families that may act as adapters (TIR-NBS and TIR-X) are found in monocots, but in low numbers compared to dicots and gymnosperms [25]. Many sequences in the databases are fragments, and some predicted non-TIR-NBS-LRR proteins may be members of these new families [25].
Classification of NBS-LRR proteins into TIR and non-TIR (Figure 1) is based on consensus motifs within the NBS domain (Table 1). Although these are assumed to be diagnostic [8], the presence or absence of a TIR domain has not usually been confirmed. We cannot eliminate the possibility that these motifs are not diagnostic. Further sequencing of the N-terminal regions of these genes is needed to confirm that our categorization is correct.

Conclusion
We were unable to find monocot TIR-NBS-LRR sequences by PCR amplification or database searches. In contrast to previous studies, our results represent five monocot orders (Poales, Zingiberales, Arecales, Asparagales, and Alismatales) as well as basal angiosperms and magnoliids.
Establishing the presence of TIR-type NBS sequences in basal angiosperms fills a gap in our knowledge of these important genes. Our results suggest that although TIR-type NBS-LRR sequences were present in early land plants, they have been reduced significantly in monocots. The sequences from the Triticum-Thinopyrum line [14] remain the only reported monocot TIR-NBS-LRR sequences. We do not know when these sequences were lost, but the P. americana sequences suggest that TIR-NBS-LRR sequences are rare in magnoliids as well. It is not clear whether these sequences were lost independently in both lineages or prior to their divergence. Further sequencing from additional taxa and confirmation that the motifs in the NBS region are diagnostic will be helpful in clarifying the evolutionary history of plant R genes.

Primer table: Primers are based on previously reported consensus amino acid sequences [7], with the exception of the primer to the P-loop sequence GIGKTT, which was reported as the P-loop consensus sequence for the TIR-NBS-LRR group [9].

Methods

Plants
We vegetatively propagated Carex blanda individuals collected from Kansas [26] and grew Arabidopsis thaliana from seeds (Ruth Shaw, University of Minnesota) in the greenhouse. Dracaena marginata, Sansevieria trifasciata, Spathiphyllum sp., Cycas revoluta, and Coffea canephora came from the University of Kansas greenhouse.

Genomic DNA isolation
We isolated genomic DNA from frozen plant material by grinding with a mortar and pestle in extraction buffer (0.2 M Tris-HCl pH 7.5, 0.25 M NaCl, 25 mM EDTA pH 8.0, 0.5% SDS) and incubating at 65°C for 10 minutes. We performed three extractions with phenol:chloroform:isoamyl alcohol (25:24:1); the final extraction used a phaselock tube (Eppendorf). We precipitated the DNA and resuspended it in 100 μL of 10 mM Tris pH 8.0 with 10 μg RNaseA.
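For readers preparing the extraction buffer, the dilution arithmetic (stock volume = final volume × target concentration / stock concentration) can be sketched in Python as below. The target concentrations are those given above; the stock concentrations (1 M Tris-HCl, 5 M NaCl, 0.5 M EDTA, 10% SDS) and the 100 mL batch size are assumptions chosen for illustration and are not stated in the protocol.

```python
def stock_volumes_ml(final_volume_ml, recipe):
    """Volume of each stock needed for a buffer, plus water to reach the final volume.

    recipe maps component -> (target concentration, stock concentration),
    with both values in the same unit for that component.
    """
    volumes = {}
    for name, (target, stock) in recipe.items():
        if target > stock:
            raise ValueError(f"stock of {name} is too dilute")
        volumes[name] = final_volume_ml * target / stock
    # Water makes up whatever volume the stocks do not account for.
    volumes["water"] = final_volume_ml - sum(volumes.values())
    return volumes


# Targets are from the protocol; every stock concentration is an assumption.
EXTRACTION_BUFFER = {
    "Tris-HCl pH 7.5": (0.2, 1.0),   # M, assumed 1 M stock
    "NaCl": (0.25, 5.0),             # M, assumed 5 M stock
    "EDTA pH 8.0": (0.025, 0.5),     # M (25 mM target), assumed 0.5 M stock
    "SDS": (0.5, 10.0),              # percent, assumed 10% stock
}

volumes = stock_volumes_ml(100, EXTRACTION_BUFFER)
```

With these assumed stocks, a 100 mL batch works out to 20 mL Tris-HCl, 5 mL each of NaCl, EDTA and SDS, and 65 mL water.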

Cloning and sequencing
We purified PCR products from agarose gels using the QIAquick gel extraction kit (Qiagen), ligated them into the pCR4-TOPO vector (Invitrogen), and transformed the ligations into maximum efficiency DH5α competent cells. We isolated plasmid DNA from overnight cultures using a standard alkaline lysis protocol, digested it with EcoRI, and sequenced representative clones with standard primers (T3/T7) at the University of Kansas sequencing facility or ACGT, Inc. We typically sequenced between one and three clones from each fragment. Sequences were submitted to GenBank with accession numbers EF687860-EF687864 and EF687876-EF687894.

Database searches
We retrieved monocot (excluding Poales), gymnosperm, and bryophyte sequences from Pfam by viewing the NB-ARC domain (PF00931) species distribution. We retrieved sequences from GenBank by searching with known plant NBS sequences [see Additional File 2] and using the taxonomy reports to identify non-dicot, non-grass plant sequences with significant similarity (E-value < 0.05). We used EditSeq (Lasergene) for all sequence viewing and editing, and excluded sequences that contained internal stops or lacked the diagnostic kinase-2 region.

Phylogenetic analysis
We aligned sequences using the ClustalW algorithm in MegAlign (Lasergene) with manual adjustments as necessary. The alignment contained a core region of approximately 170 amino acids. We generated a scaffold tree based on the Pfam seed alignment and previous analyses [see Additional File 3] [8]. Phylogenetic analysis with parsimony criteria was performed using PAUP* 4.0 beta10 [27].

Figure 4 Primer design. The diagram shows the NBS domain motifs used in primer design. The motifs shown in blue are relatively conserved between TIR and non-TIR classes of NBS sequence while the domains in yellow have consistent differences. The three types of primer sets are shown with arrows to mark the location of the primers used. We used a total of 24 primer combinations that would specifically amplify TIR-NBS and non-TIR-NBS sequences, as well as combinations that would amplify all NBS sequences. All combinations were designed to amplify the kinase-2 region containing either a tryptophan (non-TIR) or aspartic acid (TIR) to aid in classification of the sequence.
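To illustrate how a degenerate primer is derived from a consensus amino acid motif such as the P-loop sequence GIGKTT, the Python sketch below back-translates a motif into IUPAC degenerate nucleotide code and counts how many distinct oligonucleotides the degenerate sequence represents. The actual primer sequences used in this study are not given in the text, so the output is illustrative only. The function names are hypothetical, and amino acids encoded by six codons (Leu, Ser, Arg) are deliberately left out of the lookup table to keep the sketch simple.

```python
from math import prod

# IUPAC degenerate codons under the standard genetic code
# (N = A/C/G/T, H = A/C/T, R = A/G, Y = C/T).
# Leu, Ser and Arg are omitted because they need more than one degenerate codon.
DEGENERATE_CODON = {
    "G": "GGN", "A": "GCN", "V": "GTN", "T": "ACN", "P": "CCN",
    "I": "ATH", "K": "AAR", "E": "GAR", "Q": "CAR",
    "D": "GAY", "N": "AAY", "H": "CAY", "F": "TTY", "Y": "TAY", "C": "TGY",
    "M": "ATG", "W": "TGG",
}
IUPAC_SIZE = {"A": 1, "C": 1, "G": 1, "T": 1, "R": 2, "Y": 2, "H": 3, "N": 4}


def back_translate(motif: str) -> str:
    """Degenerate nucleotide sequence encoding an amino acid motif."""
    try:
        return "".join(DEGENERATE_CODON[aa] for aa in motif.upper())
    except KeyError as err:
        raise ValueError(f"no single degenerate codon for residue {err}") from None


def degeneracy(primer: str) -> int:
    """Number of distinct sequences represented by a degenerate primer."""
    return prod(IUPAC_SIZE[base] for base in primer.upper())


primer = back_translate("GIGKTT")
```

For GIGKTT this gives `GGNATHGGNAARACNACN`, a pool of 1536 sequences, which shows why primer degeneracy has to be weighed against amplification specificity when choosing motifs.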