Influenza B virus has global ordered RNA structure in (+) and (−) strands but relatively less stable predicted RNA folding free energy than allowed by the encoded protein sequence
© Priore et al.; licensee BioMed Central Ltd. 2013
Received: 15 January 2013
Accepted: 3 July 2013
Published: 19 August 2013
Influenza A virus contributes to seasonal epidemics and pandemics and contains Global Ordered RNA structure (GORS) in the nucleoprotein (NP), non-structural (NS), PB2, and M segments. A related virus, influenza B, is also a major annual public health threat but, unlike influenza A, is highly selective for human hosts. This study extends the search for GORS to influenza B.
A survey of all available influenza B sequences reveals GORS in the (+) and (−)RNAs of the NP, NS, PB2, and PB1 gene segments. The results are similar to those for influenza A, except that in influenza A GORS is observed for the M segment but not for PB1. In general, the folding free energies of human-specific influenza B RNA segments are less stable than allowed by the encoded amino acid sequence. This is consistent with findings in influenza A, where RNA folds of human-specific strains are less stable than those of avian and swine strains.
These results reveal fundamental molecular similarities and differences between influenza A and B and suggest a rational basis for choosing segments to target with therapeutics and for attenuating viruses for live vaccines by altering RNA folding stability.
Keywords: RNA; RNA secondary structure; Influenza; Influenza A; Influenza B; Structural bioinformatics; GORS
In contrast to influenza A, a zoonotic pathogen that infects multiple host species, influenza B primarily infects humans and, rarely, seals [1, 2]. Influenza B also differs from influenza A by having a lower mutation rate and fewer antigenic serotypes. Though its lack of antigenic diversity bars pandemic outbreaks, influenza B contributes to seasonal occurrences of influenza, which can result in serious infections costing thousands of lives and billions of dollars [4, 5]. Influenza B has been of increasing concern lately because of the rise in circulation of two distinct lineages of the virus, Victoria and Yamagata, which stimulated the recent switch from a trivalent vaccine (against one influenza B and two influenza A serotypes) to a quadrivalent vaccine including both influenza B serotypes [6, 7]. The viral genome is composed of eight negative-sense, or (−)RNA, segments. Segments NS, M1/BM2, and NA encode multiple protein products via splicing, termination-reinitiation, and alternative initiation, respectively.
RNA secondary structure plays important roles in the biology of many viruses: for example, in gene expression, splicing, molecular stability/life-time, and control of host gene expression. Some RNAs, such as compact viral genomes, can encode both protein information and functional RNA secondary structures. The importance of RNA structure in influenza virus protein coding regions, or (+)RNA, is now being revealed. For influenza A, structures have been described towards the 5′ end and at the 3′ splice site [15, 16] of segment NS (+)RNA. Both structures may have a role in the regulation of splicing. When many sequences are available, predicted folding stabilities can identify RNA regions likely to have structure. A survey of all influenza A coding sequences found evidence for multiple sites with probable locally conserved RNA structure in the (+)RNA. Similar to segment NS, structures were discovered in the 5′ region and 3′ splice site of segment M. The structure at the 3′ splice site can switch between pseudoknot and hairpin conformations, which respectively bury or reveal the splice site and other splicing signals. Thus, this structure may have a role in regulation of segment M splicing.
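Predicted thermodynamic stability can be expressed as a z-score, which compares the predicted folding free energy of a wild-type sequence with that of dinucleotide-randomized versions of the same sequence:
z-score = (ΔG°37, wild-type − μ) / σ
Here ΔG°37, wild-type is the predicted folding free energy of the wild-type sequence, μ is the average predicted folding free energy of the dinucleotide randomizations, and σ is the standard deviation of the randomized population. GORS is defined as a significant negative shift in the median z-score away from an ideal non-structured RNA population (i.e. a normal distribution centered at zero). Because −0.67 is approximately the lower quartile of a standard normal distribution, segments with a median z-score below −0.67 are considered to have GORS.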
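The z-score and the −0.67 median cutoff described above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the free energies are assumed to have been predicted elsewhere, and the use of the sample (rather than population) standard deviation is an assumption, since the text does not specify which was used.

```python
from statistics import mean, median, stdev

# Approximate lower quartile of a standard normal distribution,
# used in the text as the cutoff for the median z-score.
GORS_THRESHOLD = -0.67


def z_score(dg_wild_type, dg_randomized):
    """Return (dG_wild_type - mu) / sigma for one sequence.

    dg_randomized holds the predicted folding free energies (kcal/mol)
    of the dinucleotide-randomized controls for that sequence.
    """
    mu = mean(dg_randomized)
    sigma = stdev(dg_randomized)  # assumption: sample standard deviation
    if sigma == 0:
        raise ValueError("randomized energies have zero variance")
    return (dg_wild_type - mu) / sigma


def has_gors(z_scores, threshold=GORS_THRESHOLD):
    """A population is considered to have GORS if its median z-score
    falls below the threshold."""
    return median(z_scores) < threshold
```

A negative z-score means the wild-type sequence is predicted to fold more stably than its randomized controls; `has_gors` then applies the median criterion to a whole population of such scores.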
While free energy minimization has limited accuracy and, in most algorithms, forbids pseudoknots, it can on average correctly predict roughly 73% of base pairs. Estimating free energies is an easier problem. For example, structures with greater than 86% of correctly predicted base pairs typically differ from the minimum free energy structure by an average of only 5% in their ΔG°37 values. Thus, good estimates of relative thermodynamic stability, both within the same segment and between wild-type sequences and matched randomized controls, are achievable.
Many RNA viruses have negative shifts in z-scores for (+)RNAs relative to unstructured sequences [25, 26], implying widespread RNA structure. Studies in bacterial mRNAs found similar patterns. Influenza A has GORS in both orientations of the NP, NS, PB2, and M gene segments. Generally in influenza A, avian strains are the most stable, followed by swine and then human. A similar trend was found for the z-scores of NP, NS, and PB2 gene segments. The exact role of GORS is unclear, but it may be a mechanism for evasion of the host innate immune system or for controlling mRNA life-time/stability. Identification of segments with and without GORS could help guide discovery of targets for small molecules and oligonucleotide therapeutics against influenza virus, since these approaches require structured and unstructured RNA targets, respectively.
This study extends to influenza B the search for global trends in RNA structure. Because only human influenza B strains are available, the folding free energies and z-scores of influenza B sequences are compared to folding free energies and z-scores of synonymous codon mutations (i.e. sequences that code for the same protein as wild-type influenza B sequences) generated in silico. Additional comparisons are made between results for influenza A and B. Similarities and differences are observed, which imply that influenza B has a distinctly different biology from influenza A.
Materials and methods
The research in our lab, including the content of this manuscript, has been performed with the approval of the University of Rochester’s research ethics committee.
Coding regions for all unique influenza B mRNAs were downloaded from the NCBI Influenza Virus Resource. Truncated sequences and those with ambiguous nucleotides were removed, leaving 4110 sequences: 370 in NP, 519 in NS, 363 in PB2, 339 in PB1, 350 in M1, 832 in HA, 354 in PA, and 983 in NA. RNA folding free energies for the entire coding regions were predicted by minimizing ΔG°37 with the program RNAfold. Z-scores were calculated for all sequences by comparing the free energy of each wild-type sequence to a set of ten randomized sequences, generated with the Simmonic Sequence Editor (SSE) so as to preserve dinucleotide content [31, 32]. A negative z-score suggests ordered structure. In this work, a population of single sequences with a median z-score below −0.67 is considered to possess GORS. The same definition is applied to a reanalysis of our previous results for influenza A.
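The filtering step can be illustrated with a short Python sketch. The text does not state how truncated sequences were detected, so the criteria used here (an ATG start codon, a terminal stop codon, and a length that is a multiple of three) are assumptions, and the function names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def is_clean_coding_region(seq):
    """Return True if seq looks like a complete, unambiguous coding region.

    Assumed criteria (not taken from the source): only A/C/G/T after
    converting U to T, length divisible by three, ATG start, terminal stop.
    """
    seq = seq.upper().replace("U", "T")
    if len(seq) < 6 or len(seq) % 3 != 0:
        return False
    if set(seq) - set("ACGT"):  # ambiguous codes such as N, R, Y
        return False
    return seq.startswith("ATG") and seq[-3:] in STOP_CODONS


def unique_clean(seqs):
    """Keep the first copy of each distinct sequence that passes the filter."""
    seen, kept = set(), []
    for raw in seqs:
        seq = raw.upper().replace("U", "T")
        if seq not in seen and is_clean_coding_region(seq):
            seen.add(seq)
            kept.append(seq)
    return kept
```

Sequences that survive this kind of filter would then be passed to a folding program for free energy prediction.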
To generate sets of synonymous codon mutants for comparison with folding free energies and z-scores of wild-type sequences, one coding region for each of the eight segments was mutated in silico to produce eight sets of 500 synonymous mutant sequences. Five hundred randomizations of one sequence from each segment were considered sufficient because the protein sequences are ~100% conserved in the available influenza B sequences. Synonymous codon mutations were made with a Perl script that randomly selected codons and made synonymous substitutions at those sites, including substituting the same codon (no change). Folding free energies and z-scores were calculated as described above for wild-type sequences. Specifically, ten dinucleotide randomizations of each of the 500 synonymous codon mutants were used for calculating 500 z-scores for each influenza B segment.
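The synonymous codon mutation procedure can be sketched as follows. This is not the authors' Perl script; it is a minimal Python illustration under stated assumptions. The fraction of codons selected per mutant is a hypothetical parameter (the text does not give it), the standard genetic code is assumed, and the input is assumed to be an in-frame DNA coding sequence.

```python
import random
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Group codons by the amino acid (or stop) they encode.
SYNONYMS = {}
for codon, aa in CODON_TO_AA.items():
    SYNONYMS.setdefault(aa, []).append(codon)


def translate(seq):
    """Translate an in-frame DNA sequence ('*' marks a stop codon)."""
    return "".join(CODON_TO_AA[seq[i:i + 3]] for i in range(0, len(seq), 3))


def synonymous_mutant(seq, fraction=0.5, rng=random):
    """Return a copy of seq with randomly chosen codons replaced by synonyms.

    `fraction` (hypothetical) is the share of codon positions selected.
    The replacement is drawn from all synonyms, including the original
    codon, so a selected site may remain unchanged.
    """
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    n_sites = round(fraction * len(codons))
    for i in rng.sample(range(len(codons)), n_sites):
        codons[i] = rng.choice(SYNONYMS[CODON_TO_AA[codons[i]]])
    return "".join(codons)
```

Calling `synonymous_mutant` 500 times on one coding region would give a set comparable in size to those described above. Each mutant encodes the same protein by construction, which `translate` can confirm.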
Box plots were constructed for each population of predicted free energies and z-scores. The box on each plot represents the interquartile range (IQR), which is defined as the difference between the 75th percentile (Q3) and the 25th percentile (Q1) of each population. Upper and lower bounds for each plot (bars extending from the box) represent the largest and smallest data values within 1.5 × IQR of Q3 and Q1, respectively. Values outside this range are considered outliers for that population.
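The box plot quantities defined above can be computed directly. The following Python sketch is illustrative only: the function name is hypothetical, and the "inclusive" quartile convention is an assumption, since quartile definitions differ slightly between software packages and the text does not say which was used.

```python
from statistics import quantiles


def box_plot_summary(values):
    """Compute quartiles, IQR, whisker ends, and outliers for one population."""
    data = sorted(values)
    # Assumption: "inclusive" quartile method; other conventions shift Q1/Q3.
    q1, q2, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    inside = [v for v in data if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": q2,
        "q3": q3,
        "iqr": iqr,
        # Whiskers end at the most extreme values still within the fences.
        "lower_whisker": inside[0],
        "upper_whisker": inside[-1],
        "outliers": [v for v in data if v < low_fence or v > high_fence],
    }
```

For the values 1 through 9 plus 100, this reports an IQR of 4.5, whiskers at 1 and 9, and 100 as the only outlier.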
Table 1. Median z-scores and average predicted folding free energy for influenza B (+)RNA, (−)RNA, and synonymous codon mutant (+)RNAs (mut (+)RNA). Surviving column header: z-score mut (+)RNA.
Table 2. Average GC content of wild-type (+)RNA influenza B sequences and synonymous codon mutant sequences. Columns: avg. % GC wild-type (+)RNA; avg. % GC mut (+)RNA.
Z-scores were also calculated for the synonymous codon mutant sets. Of the four segments whose wild-type sequences show evidence of GORS, the mutants of all but the NS segment still possess GORS. In these three cases, however, the median z-scores for mutants were more positive than for wild-type sequences (Table 1, Figures 1 and 2).
Predictions of GORS can partition RNA sequences into regions with or without strong secondary structure. Such partitioning should be helpful in identifying regions that are easier to target with therapeutics. For example, small molecules will bind specifically to structured regions, whereas oligonucleotide-based therapeutics will bind more tightly to unstructured regions. Prediction of regions with GORS may also facilitate genome-wide probing of secondary structure [33–35] by focusing searches on regions likely to have conserved structure.
For influenza B, three of the four gene segments with GORS have homologs in influenza A that also show GORS: NP, NS, and PB2. Unlike influenza A, there is no evidence for GORS in the influenza B M1/BM2 gene. A possible explanation for this lack of GORS is that in influenza A, segment M encodes both the M1 (matrix protein) and M2 (ion channel) proteins, which are alternatively spliced, whereas in influenza B the BM2 open reading frame directly follows M1 and is translated via termination-reinitiation [36, 37]. In influenza A, local RNA structures have been described that have implications for splicing [15, 18, 19]. Perhaps GORS is absent in influenza B M1/BM2 because there is no need for RNA structures important for splicing.
In influenza B, the PB1 coding region shows strong evidence of GORS (median z-score of −1.5), in contrast to influenza A, where the average z-scores are equal to or more positive than −0.5. This suggests that PB1 of influenza B may need to maintain structure to stabilize mRNA for some as yet unknown reason that does not apply to influenza A PB1. Interestingly, the (−)RNA z-score for this region is more favorable than that of the (+)RNA. This may indicate an important role for structure in the genomic RNA for this segment, with structure in the (+)RNA representing a structural “echo”.
The less favorable relative thermodynamic stability of influenza B sequences, when compared with a set of randomly generated synonymous codon sequences, is consistent with the human host species specificity of influenza B. For influenza A, sequences specific to humans have less favorable thermodynamic stability than those from swine and avian species, even though protein sequence is largely conserved. However, any changes in thermodynamic stability in synonymous codon mutants for all segments appear to be independent of GORS because the average z-score for the mutants was close to zero. A decrease of CpG dinucleotide frequencies in human influenza viruses has been established. As seen in Table 2, synonymous codon mutants acquired increased GC content, which increased their predicted thermodynamic stability, compared to wild-type sequences. This is consistent with the increased GC content of avian influenza A strains compared to human influenza A strains. It appears that evolution, acting to reduce CpG frequency or other factors related to the human host, restricts the thermodynamic stability of influenza B sequences to a small portion of the available folding landscape. Thus, this thermodynamic difference may distinguish human-adapted influenza strains from strains that replicate in other host species.
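The two composition measures discussed above, GC content and CpG frequency, are simple to compute. The sketch below is illustrative and the function names are hypothetical. The observed/expected CpG ratio uses one common convention, (number of CpG × sequence length) / (number of C × number of G), which is an assumption here because the text does not state how CpG frequency was quantified.

```python
def gc_percent(seq):
    """Percentage of G and C nucleotides in a sequence."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq):
    """Observed/expected CpG ratio: (#CpG * N) / (#C * #G).

    Values below 1 indicate fewer CpG dinucleotides than expected from
    the C and G content alone. Returns 0.0 if C or G is absent.
    """
    seq = seq.upper()
    n = len(seq)
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    cpg = sum(1 for i in range(n - 1) if seq[i:i + 2] == "CG")
    return cpg * n / (c * g)
```

Applying both functions to wild-type and synonymous mutant sequences would show whether a gain in GC content is accompanied by a change in CpG representation.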
This work elucidates some of the thermodynamic and structural constraints that may be acting on influenza B RNA sequences and human influenza viruses in general. Some characteristics are shared between influenza B and A: GORS is seen in NS, NP, and PB2 RNAs of both viral species. With the exception of influenza B HA, ΔG°37 favors folding in the (+)RNA over the (−)RNA, and the human-specific wild-type influenza B sequences have less favorable thermodynamic stability than allowed by the amino acid sequence. This latter trend was also seen in human influenza A viruses when compared to swine and avian strains. Differences from influenza A are also apparent: for influenza B, the PB1 RNA shows GORS, while influenza A has GORS in the M gene segment. These results imply differences in the role of RNA folding in the two viral groups. A better understanding of the constraints acting on influenza B sequences may aid in the rational attenuation of viral strains for use in vaccines, as has been recently shown with the influenza B NP segment.
Availability of supporting data
The data supporting the results of this article are included within the article (and its additional files).
GORS: Global ordered RNA structure.
The authors thank Prof. David H. Mathews for helpful discussions. This work was supported by NIH RO1 GM22939. SFP is a trainee in the Medical Scientist Training Program funded by NIH T32 GM07356. The contents of this work are solely the responsibility of the authors and do not necessarily represent the official views of the NIH.
1. Osterhaus AD, Rimmelzwaan GF, Martina BE, Bestebroer TM, Fouchier RA: Influenza B virus in seals. Science. 2000, 288: 1051-1053. 10.1126/science.288.5468.1051.
2. Geraci JR, St Aubin DJ, Barker IK, Webster RG, Hinshaw VS, Bean WJ, Ruhnke HL, Prescott JH, Early G, Baker AS, et al: Mass mortality of harbor seals: pneumonia associated with influenza A virus. Science. 1982, 215: 1129-1131. 10.1126/science.7063847.
3. Hay AJ, Gregory V, Douglas AR, Lin YP: The evolution of human influenza viruses. Phil Trans R Soc Lond B Biol Sci. 2001, 356: 1861-1870.
4. Molinari NA, Ortega-Sanchez IR, Messonnier ML, Thompson WW, Wortley PM, Weintraub E, Bridges CB: The annual impact of seasonal influenza in the US: measuring disease burden and costs. Vaccine. 2007, 25: 5086-5096. 10.1016/j.vaccine.2007.03.046.
5. Thompson WW, Shay DK, Weintraub E, Brammer L, Cox N, Anderson LJ, Fukuda K: Mortality associated with influenza and respiratory syncytial virus in the United States. JAMA. 2003, 289: 179-186. 10.1001/jama.289.2.179.
6. Lee BY, Bartsch SM, Willig AM: The economic value of a quadrivalent versus trivalent influenza vaccine. Vaccine. 2012, 30: 7443-7446. 10.1016/j.vaccine.2012.10.025.
7. Ambrose CS, Levin MJ: The rationale for quadrivalent influenza vaccines. Hum Vaccin Immunother. 2012, 8: 81-88.
8. Bouvier NM, Palese P: The biology of influenza viruses. Vaccine. 2008, 26 (Suppl 4): D49-D53.
9. Pfingsten JS, Kieft JS: RNA structure-based ribosome recruitment: lessons from the Dicistroviridae intergenic region IRESes. RNA. 2008, 14: 1255-1263. 10.1261/rna.987808.
10. Warf MB, Berglund JA: Role of RNA structure in regulating pre-mRNA splicing. Trends Biochem Sci. 2010, 35: 169-178. 10.1016/j.tibs.2009.10.004.
11. Mitton-Fry RM, DeGregorio SJ, Wang J, Steitz TA, Steitz JA: Poly(A) tail recognition by a viral RNA element through assembly of a triple helix. Science. 2010, 330: 1244-1247. 10.1126/science.1195858.
12. Steitz JA, Borah S, Cazalla D, Fok V, Lytle R, Mitton-Fry R, Riley K, Samji T: Noncoding RNPs of viral origin. Cold Spring Harb Perspect Biol. 2011, 3: a005165. 10.1101/cshperspect.a005165.
13. Pedersen JS, Meyer IM, Forsberg R, Simmonds P, Hein J: A comparative method for finding and folding RNA secondary structures within protein-coding regions. Nucleic Acids Res. 2004, 32: 4925-4936. 10.1093/nar/gkh839.
14. Ilyinskii PO, Schmidt T, Lukashev D, Meriin AB, Thoidis G, Frishman D, Shneider AM: Importance of mRNA secondary structural elements for the expression of influenza virus genes. OMICS. 2009, 13: 421-430. 10.1089/omi.2009.0036.
15. Gultyaev AP, Heus HA, Olsthoorn RC: An RNA conformational shift in recent H5N1 influenza A viruses. Bioinformatics. 2007, 23: 272-276. 10.1093/bioinformatics/btl559.
16. Gultyaev AP, Olsthoorn RC: A family of non-classical pseudoknots in influenza A and B viruses. RNA Biol. 2010, 7: 125-129. 10.4161/rna.7.2.11287.
17. Washietl S, Hofacker IL, Stadler PF: Fast and reliable prediction of noncoding RNAs. Proc Natl Acad Sci U S A. 2005, 102: 2454-2459. 10.1073/pnas.0409169102.
18. Moss WN, Priore SF, Turner DH: Identification of potential conserved RNA secondary structure throughout influenza A coding regions. RNA. 2011, 17: 991-1011. 10.1261/rna.2619511.
19. Moss WN, Dela-Moss LI, Kierzek E, Kierzek R, Priore SF, Turner DH: The 3' splice site of influenza A segment 7 mRNA can exist in two conformations: a pseudoknot and a hairpin. PLoS One. 2012, 7: e38323. 10.1371/journal.pone.0038323.
20. Priore SF, Moss WN, Turner DH: Influenza A virus coding regions exhibit host-specific global ordered RNA structure. PLoS One. 2012, 7: e35989. 10.1371/journal.pone.0035989.
21. Clote P, Ferre F, Kranakis E, Krizanc D: Structural RNA has lower folding energy than random RNA of the same dinucleotide frequency. RNA. 2005, 11: 578-591. 10.1261/rna.7220505.
22. Sperschneider J, Datta A: An introduction to RNA structure and pseudoknot prediction. In Algorithms in Computational Molecular Biology. 2011, John Wiley & Sons, Inc, 521-546. 10.1002/9780470892107.ch24.
23. Mathews DH, Disney MD, Childs JL, Schroeder SJ, Zuker M, Turner DH: Incorporating chemical modification constraints into a dynamic programming algorithm for prediction of RNA secondary structure. Proc Natl Acad Sci U S A. 2004, 101: 7287-7292. 10.1073/pnas.0401799101.
24. Mathews DH, Sabina J, Zuker M, Turner DH: Expanded sequence dependence of thermodynamic parameters improves prediction of RNA secondary structure. J Mol Biol. 1999, 288: 911-940. 10.1006/jmbi.1999.2700.
25. Simmonds P, Tuplin A, Evans DJ: Detection of genome-scale ordered RNA structure (GORS) in genomes of positive-stranded RNA viruses: implications for virus evolution and host persistence. RNA. 2004, 10: 1337-1351. 10.1261/rna.7640104.
26. Davis M, Sagan SM, Pezacki JP, Evans DJ, Simmonds P: Bioinformatic and physical characterizations of genome-scale ordered RNA structure in mammalian RNA viruses. J Virol. 2008, 82: 11824-11836. 10.1128/JVI.01078-08.
27. Katz L, Burge CB: Widespread selection for local RNA secondary structure in coding regions of bacterial genes. Genome Res. 2003, 13: 2042-2051. 10.1101/gr.1257503.
28. Deutscher MP: Degradation of RNA in bacteria: comparison of mRNA and stable RNA. Nucleic Acids Res. 2006, 34: 659-666. 10.1093/nar/gkj472.
29. Bao Y, Bolotov P, Dernovoy D, Kiryutin B, Zaslavsky L, Tatusova T, Ostell J, Lipman D: The influenza virus resource at the national center for biotechnology information. J Virol. 2008, 82: 596-601. 10.1128/JVI.02005-07.
30. Hofacker IL, Fontana W, Stadler PF, Bonhoeffer LS, Tacker M, Schuster P: Fast folding and comparison of RNA secondary structures. Monatsh Chem. 1994, 125: 167-188. 10.1007/BF00818163.
31. Simmonds P, Smith DB: Structural constraints on RNA virus evolution. J Virol. 1999, 73: 5787-5794.
32. Simmonds P: SSE: a nucleotide and amino acid sequence analysis platform. BMC Res Notes. 2012, 5: 50. 10.1186/1756-0500-5-50.
33. Lucks JB, Mortimer SA, Trapnell C, Luo S, Aviran S, Schroth GP, Pachter L, Doudna JA, Arkin AP: Multiplexed RNA structure characterization with selective 2'-hydroxyl acylation analyzed by primer extension sequencing (SHAPE-Seq). Proc Natl Acad Sci U S A. 2011, 108: 11063-11068. 10.1073/pnas.1106501108.
34. Kertesz M, Wan Y, Mazor E, Rinn JL, Nutter RC, Chang HY, Segal E: Genome-wide measurement of RNA secondary structure in yeast. Nature. 2010, 467: 103-107. 10.1038/nature09322.
35. Underwood JG, Uzilov AV, Katzman S, Onodera CS, Mainzer JE, Mathews DH, Lowe TM, Salama SR, Haussler D: FragSeq: transcriptome-wide RNA structure probing using high-throughput sequencing. Nat Meth. 2010, 7: 995-1001. 10.1038/nmeth.1529.
36. Powell ML, Napthine S, Jackson RJ, Brierley I, Brown TD: Characterization of the termination-reinitiation strategy employed in the expression of influenza B virus BM2 protein. RNA. 2008, 14: 2394-2406. 10.1261/rna.1231008.
37. Horvath CM, Williams MA, Lamb RA: Eukaryotic coupled translation of tandem cistrons: identification of the influenza B virus BM2 polypeptide. EMBO J. 1990, 9: 2639-2647.
38. Greenbaum BD, Levine AJ, Bhanot G, Rabadan R: Patterns of evolution and host gene mimicry in influenza and other RNA viruses. PLoS Pathog. 2008, 4: e1000079. 10.1371/journal.ppat.1000079.
39. Rabadan R, Levine AJ, Robins H: Comparison of avian and human influenza A viruses reveals a mutational bias on the viral genomes. J Virol. 2006, 80: 11887-11891. 10.1128/JVI.01414-06.
40. Wanitchang A, Narkpuk J, Jaru-ampornpan P, Jengarn J, Jongkaewwattana A: Inhibition of influenza A virus replication by influenza B virus nucleoprotein: an insight into interference between influenza A and B viruses. Virology. 2012, 432: 194-203. 10.1016/j.virol.2012.06.016.