- Short Report
The influenza A virus NS genome segment displays lineage-specific patterns in predicted RNA secondary structure
BMC Research Notes volume 9, Article number: 279 (2016)
Influenza A virus (IAV) is a segmented negative-sense RNA virus that causes seasonal epidemics and periodic pandemics in humans. Two regions (nucleotide positions 82–148 and 497–564) in the positive-sense RNA of the NS segment fold into a multi-branch loop or hairpin structures.
We studied 25,384 unique NS segment positive-sense RNA sequences of human and non-human IAVs, predicted the secondary RNA structures of the 82–148 and 497–564 regions using RNAfold software, and determined their host- and lineage-specific distributions. Hairpins prevailed in avian and avian-origin human IAVs, including H1N1pdm1918 and H5N1. In human and swine IAVs, the distribution of hairpins varied between evolutionary lineages.
These results suggest a possible functional role for these RNA secondary structures and the need for experimental evaluation of these structures in the influenza life cycle.
Influenza A virus (IAV) is an important pathogen responsible for annual seasonal epidemics and periodic pandemics in humans. The IAV genome consists of eight negative-sense RNA segments, encoding up to 18 proteins. Pathogenicity, host adaptation, and transmissibility of IAVs are complex multifactorial processes involving interactions between virus and host and are likely to be under independent selective pressures. Many IAV proteins, particularly HA, PB2, PB1-F2, PA-X, NA and NS1, contribute to viral virulence; however, the role of viral RNAs, and especially their secondary structures, cannot be excluded.
The NS segment positive-sense RNA, encoding the NS1 and NEP proteins, has been the most extensively studied with regard to secondary structure [4–8]. There are at least two regions with conserved secondary structures. Both are located near the 5′ and 3′ splice sites of the NS gene, at nucleotide positions 82–148 and 497–564 [7–9]. The 82–148 region is located within the NS1 open reading frame, while the 497–564 region is located within both the NS1 and NEP open reading frames. Secondary structures are known to differ between IAV strains. For example, the 497–564 region is in equilibrium between a hairpin and a pseudoknot structure, and the hairpin is stabilized in H5N1 viruses isolated after 2001 as a result of a silent G → C mutation. Experimental studies of the secondary structure of the NS genomic segment also showed stem-loop structures in both regions with high probability (more than 95 % for 82–148), but most of the base pairs in these stems were evaluated as non-canonical. In general, however, these structural patterns and their roles in influenza pathogenesis are still to be determined. Since evolution of the NS gene is strongly associated with host adaptation [11, 12], it is of interest to compare the prevalence of the predicted secondary structures among IAVs of different origins. Using computational methods, we predicted the secondary RNA structures of the 82–148 and 497–564 regions in the positive-sense RNA of segment 8 in human and non-human IAVs and identified host- and subtype-specific patterns (Additional files 1, 2).
Nucleotide sequences of the non-human NS gene were downloaded from the NCBI Influenza Virus Resource, and sequences of the human NS gene from the Influenza Research Database (IRD), at different times. All duplicate sequences were removed and all data were aligned using MAFFT software. The phylogenetic analysis was performed with the online service “Generate Phylogenetic Tree” on FLUDB.org using the RAxML algorithm and bootstrap analysis with 500 replicates. The secondary RNA structures of the 82–148 and 497–564 regions were predicted using RNAfold. RNAfold does not include pseudoknot prediction; therefore, only “stem-loop” (hairpin) and “non-loop” structures were evaluated, and these could easily be divided visually into two groups, following the review by Svoboda et al. Two types of RNA secondary structures were observed. A hairpin structure was defined by the following parameters: stem length, >16 base pairs (bp); loop length, >6 nucleotides (nt); number of branches, 0; ∆G < −19 kcal/mol. A multi-branch stem-loop structure was defined by the following parameters: stem length, >14 bp; number of branches, <2; branch length, <5 bp. Otherwise, a structure with a very short stem that divides into two or more smaller stem-loops was referred to as a “melt” or “non-looped structure”. Statistical analysis was performed using non-parametric methods in Statistika 6.0 software.
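As a minimal sketch of how the hairpin criteria above might be applied to RNAfold output, the following Python functions check a dot-bracket string and its predicted free energy against the stated thresholds. The function names are hypothetical, and treating the total number of base pairs in a single-stem structure as the “stem length” is an assumption rather than the procedure used in this study; the multi-branch category is not covered.

```python
import re

def hairpin_features(dot_bracket):
    """Return (number of terminal loops, number of base pairs, first loop length).

    A terminal (hairpin) loop is an opening bracket followed only by
    unpaired dots and then a closing bracket.
    """
    loops = re.findall(r"\((\.*)\)", dot_bracket)
    n_pairs = dot_bracket.count("(")
    loop_len = len(loops[0]) if loops else 0
    return len(loops), n_pairs, loop_len

def is_hairpin(dot_bracket, delta_g):
    """Apply the hairpin thresholds quoted in the text.

    Assumption: 'stem length' is taken as the total number of base pairs
    when the structure has exactly one terminal loop (no branches).
    """
    n_loops, n_pairs, loop_len = hairpin_features(dot_bracket)
    return (
        n_loops == 1          # number of branches: 0
        and n_pairs > 16      # stem length > 16 bp
        and loop_len > 6      # loop length > 6 nt
        and delta_g < -19.0   # delta G < -19 kcal/mol
    )

# Illustrative input only: a synthetic 17-bp stem closing a 7-nt loop.
example = "(" * 17 + "." * 7 + ")" * 17
print(is_hairpin(example, -20.0))
```

In practice the dot-bracket string and ∆G would be taken from the RNAfold prediction for each 82–148 or 497–564 subsequence.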
Results and discussion
In our analysis of the NS segment positive-sense RNA secondary structure in 25,384 isolates, including 12,192 human, 2794 swine and 10,398 other non-human isolates, we observed host-specific patterns (Table 1). In human and swine IAVs, stem-loop structures were predicted in the 82–148 region of about three-fifths (60.1 %) of the isolates, and in the 497–564 region of about half (52 %) of the isolates. In avian and other host IAVs, secondary structure was predicted for the majority of strains in both regions. Equine viruses were distinct, with a hairpin predicted in the 497–564 region for only 8 % of strains. Irrespective of host, hairpin structures were predicted for 35 % of isolates in the first region and 66 % of isolates in the second.
To study the distribution of hairpin structures within human IAVs, it was necessary to cluster IAV strains according to the evolutionary origin of the NS segment. Swine IAVs were also included in the analysis, since they are closely related to human IAVs. The phylogenetic tree of the NS gene of human and swine IAVs was reconstructed using RAxML. The “Spanish flu” A/H1N1pdm1918 virus was used as an outgroup, since it originated from a pool of avian IAVs and is the ancestor of the majority of contemporary human IAVs. The topology of the phylogenetic tree (Additional file 3) agreed with the accepted model of NS gene evolution [2, 20]. The NS gene sequences of human and swine IAVs formed two distinct groups. The first group corresponded to human and swine IAVs originating from the A/H1N1pdm1918 virus. It consisted of (a) human H1N1 (clade NSH1N1), H2N2 (clade NSH2N2), and H3N2 (clade NSH3N2) IAVs, and (b) classical (clade NSsw_clas) and triple reassortant (clade NSsw_tri) swine IAV lineages and human H1N1pdm09 (clade NSH1N1pdm09) viruses. The second group corresponded to avian-origin human and swine IAVs, including the Eurasian swine lineage (clade NSsw_eur). Some exceptions were observed in these clades. For example, several strains of human H1N1 virus isolated in 1976 were located in the NSsw_clas clade, which can be explained by their swine origin. Sporadic cases of transmission of this swine H1N1 virus took place in Fort Dix, New Jersey, USA, in 1976.
The secondary structure in the 82–148 and 497–564 regions was predicted for viruses in each clade. The data presented in Table 1 showed significant differences in the secondary structures in human IAVs of different subtypes and in swine IAVs of different origins. Hairpins were predicted in the 82–148 and 497–564 regions for the majority of isolates in the NSH1N1 clade, while they were not predicted for NSH3N2 clade IAVs. In the NSH2N2 clade, hairpins were predicted only in the 82–148 region. In the NSsw_clas and NSH1N1pdm09 clades, hairpins were predicted only in the 497–564 region for the majority of isolates. However, there were multi-branch stem-loop structures in the 82–148 region in the majority of NSH1N1pdm09 and half of the NSsw_clas viruses. For the NSsw_tri clade, multi-branch stem-loop structures were predicted in the 82–148 region for 70 % of strains; however, in the 497–564 region, hairpin structures were predicted in a minority of isolates.
In avian-origin human and swine IAVs, such as H5N1, H7N7 and H7N9, the secondary structure distribution matches that of avian IAVs. For human H5N1 IAVs, hairpins were predicted in the 82–148 region in a majority of isolates, while in the 497–564 region, hairpins were predicted for viruses isolated after 2001, but not those isolated in 1997–1998. In human H7N9 IAVs, hairpins were predicted in the 82–148 but not the 497–564 region. In the NSsw_eur clade, hairpins were predicted in the 497–564 region in the majority of IAV isolates, and in the 82–148 region in half of IAV isolates.
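The per-clade proportions discussed here amount to a simple tally of structure labels within each clade. The sketch below illustrates that calculation in Python; the function name, the label strings and the four example records are hypothetical and are not data from Table 1.

```python
from collections import Counter

def hairpin_percent_by_clade(records):
    """Return {clade: percentage of isolates labelled 'hairpin'}.

    records: iterable of (clade, structure_label) pairs, one per isolate.
    """
    totals = Counter()
    hairpins = Counter()
    for clade, label in records:
        totals[clade] += 1
        if label == "hairpin":
            hairpins[clade] += 1
    return {clade: 100.0 * hairpins[clade] / totals[clade] for clade in totals}

# Hypothetical toy records for illustration only (not results of the study).
toy_records = [
    ("NSH1N1", "hairpin"),
    ("NSH1N1", "hairpin"),
    ("NSH1N1", "melt"),
    ("NSH3N2", "melt"),
]
for clade, pct in hairpin_percent_by_clade(toy_records).items():
    print(f"{clade}: {pct:.1f} % hairpin")
```

The same tally could be run separately for the 82–148 and 497–564 regions, since each isolate receives one label per region.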
We evaluated the distribution of the observed hairpin secondary structures in different time periods (Figs. 1, 2) and identified the key nt substitutions that influenced these structures. The pandemic H1N1pdm1918 virus was of avian origin, and its NS segment positive-sense RNA contained stable hairpins in both the 82–148 and 497–564 regions. By about 1920, seasonally endemic IAVs began to circulate and drift antigenically for nearly 40 years, until 1957. In this period, the hairpin structure prevailed in the 82–148 region; however, sporadic mutations led to the formation of multi-branch stem-loop structures or elimination of the hairpin. In contrast with the ancestral A/H1N1pdm1918 virus, NSH1N1 IAVs isolated after 1940 had a single major nt substitution, A132G, which had no effect on the hairpin structure, and two main nonsynonymous substitutions, G511A (N → D in NS1, no effect in NEP) and G532A (V → I in NS1, T → I in NEP), which led to the elimination of the hairpin structure in the 497–564 region.
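Substitutions such as A132G or G511A are written as reference base, 1-based position in the NS segment positive-sense RNA, and new base. The Python sketch below shows one way such a code could be parsed, applied to a sequence, and assigned to one of the two regions studied here. All function names are hypothetical, and the placeholder sequence is synthetic rather than a real NS segment.

```python
import re

# Region boundaries as given in the text (1-based, inclusive).
REGIONS = {"82-148": (82, 148), "497-564": (497, 564)}

def parse_substitution(code):
    """Split a code such as 'A132G' into (reference base, position, new base)."""
    match = re.fullmatch(r"([ACGU])(\d+)([ACGU])", code)
    if match is None:
        raise ValueError(f"unrecognised substitution code: {code!r}")
    return match.group(1), int(match.group(2)), match.group(3)

def apply_substitution(seq, code):
    """Return a copy of seq with the substitution applied.

    Raises ValueError if the base at that position is not the reference base.
    """
    ref, pos, alt = parse_substitution(code)
    if seq[pos - 1] != ref:
        raise ValueError(f"expected {ref} at position {pos}, found {seq[pos - 1]}")
    return seq[:pos - 1] + alt + seq[pos:]

def region_of(pos):
    """Return the name of the studied region containing pos, or None."""
    for name, (start, end) in REGIONS.items():
        if start <= pos <= end:
            return name
    return None

# Synthetic placeholder sequence, for illustration only.
placeholder = "A" * 600
mutated = apply_substitution(placeholder, "A132G")
print(mutated[131], region_of(132))
```

The mutated sequence could then be refolded to compare the predicted structure before and after the substitution.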
In 1957, H1N1 IAVs (clade NSH1N1) were replaced in circulation by H2N2 viruses (clade NSH2N2) as a result of reassortment; however, the NS gene was not changed. There was only one significant mutation (A512U) in the NSH2N2 clade IAVs: it was in the same codon as the G511A mutation and led to amino acid substitutions of D to I in NS1 and M to L in NEP. However, it had no influence on the RNA secondary structure of the 497–564 region. In 1968, H3N2 viruses (NSH3N2 clade) replaced H2N2 viruses. In 1974, the nonsynonymous mutation A122G (K → R) appeared in isolates from the NSH3N2 clade. This mutation was finally fixed in the IAV population in 1977, leading to the loss of the hairpin in the 82–148 region. Interestingly, this was the year that H1N1 viruses began circulation again in the human population. All nucleotide substitutions after 1977 had no influence on the secondary structures of viruses in the NSH3N2 clade.
The secondary structure of the 82–148 region of H1N1 viruses initially remained unchanged after 1977, when the virus returned to circulation in the human population. In 1983, however, two substitutions, C96U and G131A (R → K), appeared, leading to the loss of the hairpin secondary structure. Within 2 years, the reverse U96C substitution occurred, restoring the hairpin. Another mutation, the synonymous C126U, occurred in 1996, but it did not influence RNA secondary structure. In 1999, a U84C mutation appeared, once again destabilizing the secondary structure; however, it was compensated by A93C so that the hairpin was preserved. Since then, this hairpin structure has generally been conserved. All of these mutations in H1N1 viruses, with the exception of G131A, were synonymous. In the 497–564 region of H1N1 viruses circulating after 1977, an A549G mutation (synonymous in NS1, E → G in NEP) appeared in 1983, which led to the re-emergence of the hairpin.
Thus, hairpins from the ancestral H1N1pdm1918 virus were conserved in human H1N1 and H2N2 viruses in the 82–148 region of the NS segment. The hairpin was passed to H3N2 viruses through reassortment of the NS gene, but it was eliminated there. However, hairpins remained in the H1N1 viruses that re-emerged in 1977. The hairpin was absent in the 497–564 region in human H2N2 and H3N2 viruses. It was also absent in human H1N1 viruses between 1940 and 1982, but it re-emerged in 1982.
In order to trace the history of secondary structure and the significance of mutations in the NSH1N1pdm09 clade of human IAV strains, swine IAVs also needed to be analyzed. In comparison with the H1N1pdm1918 ancestral strain, swine isolates from the NSsw_clas clade had three synonymous mutations, U108C, G120A and U147C, and one nonsynonymous mutation, G137A (R → K), which appeared in 1971. Together, they led to the formation of a multi-branch stem-loop structure in the 82–148 region. However, some substitutions led to elimination of the stem-loop structure in the NSsw_clas clade after 2000. In the swine NSsw_tri clade IAVs, one major mutation, G143A (S → D), was found. Compared to swine isolates, viruses from the NSH1N1pdm09 clade had a single synonymous mutation, C127U, which was at the first position of the codon. Neither the G143A nor the C127U mutation affected the multi-branch stem-loop structure in the 82–148 region.
The NSsw_clas clade viruses had one mutation, A552G (synonymous in NS1, D → G in NEP), in the 497–564 region, which did not disturb the hairpin structure. NSsw_tri viruses had a G511A substitution, which eliminated the hairpin. Interestingly, in the first years of their isolation (1998–2000), swine triple reassortant IAVs had sequences in the 497–564 region that were identical to the A/H1N1pdm1918 ancestral strain sequence. Since 2009, the U513C substitution (synonymous in NS1, S → P in NEP) has appeared periodically, but has had no influence on secondary structure. At the same time, H1N1pdm2009 strains had only one substitution, A511U (N → Y in NS1, no effect in NEP), in comparison with viruses in the NSsw_tri clade, which led to formation of the hairpin secondary structure.
Compared to the H1N1pdm1918 strain, swine isolates in the NSsw_eur clade had three synonymous substitutions, C90U, C96U and U99C, and one nonsynonymous substitution, G139A (G → S). Initially, mutations C96U and U99C dominated, with the C90U and G139A mutations appearing periodically. From 2010, all of these mutations were fixed, resulting in the conservation of the hairpin structure in the 82–148 region. Three major substitutions, G532A (V → I, synonymous for NEP) and a pair of G547A/G548A mutations (G → K in NS1; P → T in NEP), were found in the majority of NSsw_eur viruses; they had no effect on hairpin structure in the 497–564 region.
Other avian-origin IAVs had two main substitutions in the 82–148 region: synonymous U102C and G143A (S → D). The latter appeared and was fixed only in 2004. The majority of H5N1 human isolates have a hairpin structure in this region. The same is true of H7N9 isolates, although they have other synonymous substitutions (C96U, U99A, U102C, A129G) and a nonsynonymous substitution, G139A (G → S), that do not appear in the A/H1N1pdm1918 strain. These substitutions were also found in H9N2 isolates, but the secondary structure in this region was eliminated as a result of other random substitutions. In human H5N1 viruses isolated in 1997–1998, there were two substitutions, U513A (N → Q in NS1, L → Q in NEP) and G532A (V → I in NS1, synonymous for NEP), which led to hairpin structure loss in the 497–564 region. From 2001 to 2003, only one mutation, G532A, appeared, and since 2003, two more substitutions, A512G (V → G in NS1; M → V in NEP) and G537C (synonymous for NS1; G → A in NEP), appeared and were fixed, leading to the re-emergence and maintenance of the hairpin structure. H7N9 viruses have no hairpin secondary structure and carry the following substitutions in their sequences: A513U (synonymous for NS1; S → P in NEP), G514A (E → K in NS1; S → P in NEP), G532A (I → V in NS1, synonymous in NEP), G536A (G → Q in NS1 and G → R in NEP), G538A (V → I in NS1; synonymous for NEP), and C553U (L → F in NS1; synonymous for NEP).
The role of the observed hairpins is still to be determined. They could be associated with the downregulation of NS1 protein, since a mutation in the hairpin structure was shown to inhibit NS1 protein expression. RNA secondary structure in the 5′ and 3′ splice site regions may play an important role in the splicing of IAV segment 8 [22, 23]. Interestingly, segment 7 of IAVs displays a similar pattern of predicted secondary structure surrounding the splice sites. Our findings on host-specific patterns of RNA secondary structure are in accordance with species-specific global ordered RNA structure and free energy distributions in the NS segment. The global RNA structure of the NS segment may have evolved for host specificity, since avian, swine, and human viruses replicate at distinct temperatures and pH values that are expected to influence RNA base pairing. RNA secondary structure can also be associated with viral pathogenicity. Hairpin structures in the IAV genome form dsRNA that can activate PKR pathways. Cytokine imbalance is recognized as a main molecular mechanism for complications in IAV infections [26, 27]. The observed differences in the RNA structures at the 5′ and 3′ splice sites in the NS genome segment suggest the need for experiments to test their functional role in the life cycle of IAVs.
IAV: influenza A virus
NEP: nuclear export protein
Vasin AV, Temkina OA, Egorov VV, Klotchenko SA, Plotnikova MA, Kiselev OI. Molecular mechanisms enhancing the proteome of influenza A viruses: an overview of recently discovered proteins. Virus Res. 2014;185:53–63.
Taubenberger JK, Kash JC. Influenza virus evolution, host adaptation, and pandemic formation. Cell Host Microbe. 2010;7:440–51.
Schrauwen EJ, de Graaf M, Herfst S, Rimmelzwaan GF, Osterhaus AD, Fouchier RA. Determinants of virulence of influenza A virus. Eur J Clin Microbiol Infect Dis. 2014;33:479–90.
Chursov A, Kopetzky SJ, Leshchiner I, Kondofersky I, Theis FJ, Frishman D, Shneider A. Specific temperature-induced perturbations of secondary mRNA structures are associated with the cold-adapted temperature-sensitive phenotype of influenza A virus. RNA Biol. 2012;9:8–16.
Gultyaev AP, Olsthoorn RCL. A family of non-classical pseudoknots in influenza A and B viruses. RNA Biol. 2010;7:125–9.
Ilyinskii PO, Schmidt T, Lukashev D, Meriin AB, Thoidis G, Frishman D, Shneider AM. Importance of mRNA secondary structural elements for the expression of influenza virus genes. OMICS. 2009;13:421–30.
Moss WN, Priore SF, Turner DH. Identification of potential conserved RNA secondary structure throughout influenza A coding regions. RNA. 2011;17:991–1011.
Priore SF, Kierzek E, Kierzek R, Baman JR, Moss WN, Dela-Moss LI, Turner DH. Secondary structure of a conserved domain in the intron of influenza A NS1 mRNA. PLoS One. 2013;8:e70615.
Gultyaev AP, Heus HA, Olsthoorn RCL. An RNA conformational shift in recent H5N1 influenza A viruses. Bioinformatics. 2007;23:272–6.
Lenartowicz E, Kesy J, Ruszkowska A, et al. Self-folding of naked segment 8 genomic RNA of influenza A virus. PLoS One. 2016;11(2):e0148281.
Cauldwell AV, Long JS, Moncorge O, Barclay WS. Viral determinants of influenza A virus host range. J Gen Virol. 2014;95:1193–210.
Kawaoka Y, Gorman OT, Ito T, Wells K, Donis RO, Castrucci MR, Donatelli I, Webster RG. Influence of host species on the evolution of the nonstructural (NS) gene of influenza A viruses. Virus Res. 1998;55:143–56.
Bao Y, Bolotov P, Dernovoy D, Kiryutin B, Zaslavsky L, Tatusova T, Ostell J, Lipman D. The influenza virus resource at the National Center for Biotechnology Information. J Virol. 2008;82:596–601.
Squires RB, Noronha J, Hunt V, et al. Influenza research database: an integrated bioinformatics resource for influenza research and surveillance. Influenza Other Respir Viruses. 2012;6(6):404–16.
Katoh K, Standley DM. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol Biol Evol. 2013;30:772–80.
Gruber AR, Lorenz R, Bernhart SH, Neuböck R, Hofacker IL. The Vienna RNA websuite. Nucleic Acids Res. 2008;36(Web Server issue):W70–4.
Svoboda P, Di Cara A. Hairpin RNA: a secondary structure of primary importance. Cell Mol Life Sci. 2006;63:901–18.
Stamatakis A, Ludwig T, Meier H. RAxML-III: a fast program for maximum likelihood-based inference of large phylogenetic trees. Bioinformatics. 2005;21:456–63.
Taubenberger JK, Morens DM. 1918 influenza: the mother of all pandemics. Emerg Infect Dis. 2006;12:15–22.
Xu J, Zhong HA, Madrahimov A, Helikar T, Lu G. Molecular phylogeny and evolutionary dynamics of influenza A nonstructural (NS) gene. Infect Genet Evol. 2014;22:192–200.
Gaydos JC, Top FH, Hodder RA, Russell PK. Swine influenza A outbreak, Fort Dix, New Jersey, 1976. Emerg Infect Dis. 2006;12:23–8.
Plotch SJ, Krug RM. In vitro splicing of influenza viral NS1 mRNA and NS1-beta-globin chimeras: possible mechanisms for the control of viral mRNA splicing. Proc Natl Acad Sci USA. 1986;83:5444–8.
Nemeroff ME, Utans U, Krämer A, Krug RM. Identification of cis-acting intron and exon regions in influenza virus NS1 mRNA that inhibit splicing and cause the formation of aberrantly sedimenting presplicing complexes. Mol Cell Biol. 1992;12:962–70.
Simmonds P, Tuplin A, Evans DJ. Detection of genome-scale ordered RNA structure (GORS) in genomes of positive-stranded RNA viruses: implications for virus evolution and host persistence. RNA. 2004;10:1337–51.
Priore SF, Moss WN, Turner DH. Influenza A virus coding regions exhibit host-specific global ordered RNA structure. PLoS One. 2012;7:e35989.
Morens DM, Taubenberger JK, Harvey HA, Memoli MJ. The 1918 influenza pandemic: lessons for 2009 and the future. Crit Care Med. 2010;38:e10–20.
Tisoncik JR, Korth MJ, Simmons CP, Farrar J, Martin TR, Katze MG. Into the eye of the cytokine storm. Microbiol Mol Biol Rev. 2012;76:16–32.
AV conceived the study, carried out the study design and phylogenetic analysis, and drafted the manuscript. AP carried out sequence alignment and RNA secondary structure prediction, and participated in phylogenetic analysis. VE participated in RNA secondary structure prediction. MP and SK participated in sequence alignment. MK performed the statistical analysis. OK originated the idea and helped with the study design. All authors read and approved the final manuscript.
This work was supported by the Russian Science Foundation grant 15-15-00170.
The authors declare that they have no competing interests.