Transcriptomics data of 11 species of yeast identically grown in rich media and oxidative stress conditions

Blevins, William R.; Carey, Lucas B.; Albà, M. Mar

doi:10.1186/s13104-019-4286-0

Data note
Open access
Published: 03 May 2019

Transcriptomics data of 11 species of yeast identically grown in rich media and oxidative stress conditions

BMC Research Notes volume 12, Article number: 250 (2019) Cite this article

5318 Accesses
8 Citations
14 Altmetric
Metrics details

Abstract

Objective

The objective of this experiment was to identify transcripts in baker’s yeast (Saccharomyces cerevisiae) that could have originated from previously non-coding genomic regions, or de novo. We generated this data to be able to compare the transcriptomes of different species of Ascomycota.

Data description

We generated high-depth RNA sequencing data for 11 species of yeast: Saccharomyces cerevisiae, Saccharomyces paradoxus, Saccharomyces mikatae, Saccharomyces kudriavzevii, Saccharomyces bayanus, Naumovia castelii, Kluyveromyces lactis, Lachancea waltii, Lachancea thermotolerans, Lachancea kluyveri, and Schizosaccharomyces pombe. Using RNA-Seq from yeast grown in rich and oxidative conditions we created genome-guided de novo assemblies of the transcriptomes for each species. We included synthetic spike-in transcripts in each sample to determine the lower limit of detection of the sequencing platform as well as the reliability of our de novo transcriptome assembly pipeline. We subsequently compared the de novo transcripts assemblies to the reference gene annotations and generated assemblies that comprised both annotated and novel transcripts.

Objective

Due to pervasive transcription and pervasive translation in these yeast, new transcripts and ORFs can quickly appear in non-genic sequences and become exposed to selection. This process, known as de novo gene birth, can lead to the appearance of new genes with entirely novel functions. Our objective was to identify and characterize putative de novo genes in baker’s yeast to further understand the phenomenon of de novo gene birth. To correctly classify putative de novo genes via the taxonomic conservation of these unique sequences, we need comparable data for a set of closely related species. Due to the similarity of molecular pathways to more complex eukaryotes coupled with their ease of growth in the lab, budding yeasts have proved to be a popular group of organisms for experiments ranging from experimental evolution to genetic engineering. We selected these 11 species based on their sparse taxonomic distribution, their amenability to growth in a custom rich media, the availability of genome assemblies, and their inclusion in previous studies of de novo genes in yeast. We have used novel transcripts assembled from our RNA-Seq data, taken together with the reference annotations, to generate a more complete transcriptome for each of the eleven species surveyed. We have estimated the time that each S. cerevisiae transcript originated in the yeast phylogeny using homology searches and genomic synteny [1]. As organisms modify their expression and translation of genes in response to stress, we sequenced the transcriptomes of all 11 species of yeast in both rich media and oxidative stress conditions to capture potential transcriptome variability.

The availability of complete gene annotations is key for genome-wide studies. The transcript assemblies provided contain hundreds of transcripts that were not present in the available annotations, and thus provide a more complete view of the gene content of each organism than previous annotations. These transcriptomes can be used as a basis to discover new encoded proteins, to study the evolution of yeast gene families and to investigate the changes in gene expression across different Saccharomycotina species. The addition of the ERCC Spike-into all samples also allows for the benchmarking of different de novo transcriptome assembly protocols.

Data description

We grew 11 species of yeast in two conditions:

1.
Rich medium The yeast were grown in 20 mL of a custom rich medium [2], which was shown to accommodate various species of yeast, in 50 mL Erlenmeyer flasks at 30 °C. Cells were harvested in log growth phase at an OD₆₀₀ of approximately 0.25.
2.
Oxidative stress The same isogenic populations of yeast were grown in parallel, identical to the first condition. However, 30 min prior to harvesting the cells, hydrogen peroxide was added to a final concentration of 1.5 mM; we used a time period of 30 min to maximize the cellular response to stress [3], and a concentration of 1.5 mM H₂O₂ as we observed the yeast to grow approximately twice as slowly at this concentration.

After extraction, purification, and polyA selection of the RNA, synthetic spike-in transcripts from the ERCC RNA Spike-in kit [4] were added to each sample in order to assess the performance and limitations of our pipeline. After library preparation, the libraries were pooled into two batches (normal/stress) and sequenced in one lane on the Illumina HiSeq 2500 (paired-end, stranded, 50 bp long). This generated > 20 million high-quality strand-specific read pairs per sample (Table 1).

Table 1 Overview of data files

Full size table

After taking some quality control measures with our raw RNA-Seq data, we mapped the reads to their respective genomes (Table 1) and assembled de novo transcriptomes using the program Trinity version 2.1.0 [5]. We created a non-redundant set of features from the reference annotations combined with our de novo assembled transcripts; de novo assembled transcripts which correspond to annotated features according to Cuffmerge version 2.2.0 [6] were discarded, while those that did not were considered to be novel; we identified an average of 700 novel transcripts per species [1] (Table 1). The majority of these novel transcripts were found to be expressed in both conditions, but dozens of transcripts were only expressed in one condition or the other. Using the ERCC RNA Spike-in [4], we calculated that the lower limit of detection for annotated features in our pipeline was 2 TPM, and the lower limit of expression necessary to reliably assemble novel transcripts was 15 TPM; over half of the unannotated transcripts that we assembled were expressed above this conservative threshold of 15 TPM in at least one of the two conditions.

Limitations

A limitation of this dataset is that there are not multiple replicates for each species/condition, except for L. waltii, which has two replicates in each condition. We also would like to acknowledge that the concentration of hydrogen peroxide we used to induce an oxidative stress response (1.5 mM) was higher than the concentration used in other studies of oxidative stress response in yeast (0.1–1 mM).

Abbreviations

RNA-Seq:: RNA sequencing
TPM:: transcripts per million, a normalized measure of mRNA abundance
ERCC:: External RNA Control Consortium
mM:: millimolar, a measure of concentration
H₂O₂ :: hydrogen peroxide

References

Blevins WR, Ruiz-Orera J, Messeguer X, Blasco-Moreno B, Villanueva-Canas JL, Espinar L, et al. Frequent birth of de novo genes in the compact yeast genome. bioRxiv. 2019. https://doi.org/10.1101/575837.
Article Google Scholar
Tsankov AM, Thompson DA, Socha A, Regev A, Rando OJ. The role of nucleosome positioning in the evolution of gene regulation. PLoS Biol. 2010. https://doi.org/10.1371/journal.pbio.1000414.
Article PubMed PubMed Central Google Scholar
Gasch AP, Spellman PT, Kao CM, Carmel-Harel O, Eisen MB, Storz G, et al. Genomic expression programs in the response of yeast cells to environmental changes. Mol Biol Cell. 2000;11(12):4241–57. https://doi.org/10.1091/mbc.11.12.4241.
Article CAS PubMed PubMed Central Google Scholar
Rna E, Consortium C. Proposed methods for testing and selecting the ERCC external RNA controls. BMC Genomics. 2005;6:150. https://doi.org/10.1186/1471-2164-6-150.
Article CAS Google Scholar
Grabherr MG, Haas BJ, Yassour M, Levin JZ, Thompson DA, Amit I, et al. Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nat Biotechnol. 2011;29(7):644–52. https://doi.org/10.1038/nbt.1883.
Article CAS PubMed PubMed Central Google Scholar
Trapnell C, Roberts A, Goff L, Pertea G, Kim D, Kelley DR, et al. Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks. Nat Protoc. 2012;7(3):562–78. https://doi.org/10.1038/nprot.2012.016.
Article CAS PubMed PubMed Central Google Scholar
Transcriptomic comparison of 11 species of yeast in rich media and oxidative stress conditions. Sequence Read Archive. 2019. http://identifiers.org/ncbi/insdc.sra:SRP187756. Accessed 16 Mar 2019.
Blevins, W. Combination of novel transcripts from de novo assembly and genes from reference annotations for 11 species. figshare. 2019. https://doi.org/10.6084/m9.figshare.7851521.v2.

Download references

Authors’ contributions

MMA and LBC designed and funded the experiments. WRB carried out the experiments and wrote the manuscript. All authors read and approved the manuscript.

Acknowledgements

We thank Dr. Ksenia Pugach and the Verstreppen lab for cultures of several species of yeast, and the Sequencing Facilities at the Center for Regulatory Genomics (CRG) for the library preparation and sequencing. We also thank Lorena Espinar, Bernat Blasco-Moreno, and Leire de Campos-Mata for their help in preparing the samples.

Competing interests

The authors declare that they have no competing interests.

Availability of data materials

The data described in this Data Note can be freely and openly accessed on the Sequence Read Archive (SRA) with Project ID SRP187756 http://identifiers.org/ncbi/insdc.sra:SRP187756 [7] and on Figshare at https://doi.org/10.6084/m9.figshare.7851521.v2 [8] (Table 1). We have performed numerous additional analyses using this data which are available from the corresponding author on reasonable request; more information can be found in the text and supplementary material of our preprint “Frequent birth of de novo genes in the compact yeast genome” [1].

Consent for publication

Not applicable.

Ethics approval and consent to participate

Not applicable.

Funding

The work was funded by the following grants: (1) BFU2015-65235-P Ministerio de Economía e Innovación (Spanish Government)-FEDER (EU). (2) BFU2015-68351-P Ministerio de Economía e Innovación (Spanish Government)-FEDER (EU). (3) MDM-2014-0370 “Maria de Maeztu” Programme for Units of Excellence in R&D (Spanish Government). (4) 2017SGR1054 Agència de Gestió d’Ajuts Universitaris i de Recerca Generalitat de Catalunya. (5) 2017SGR01020 Agència de Gestió d’Ajuts Universitaris i de Recerca Generalitat de Catalunya. (6) Predoctoral fellowship (FI, Generalitat de Catalunya) to WRB. Grants 1, 3 and 4 were used to cover lab expenses and to obtain the RNA samples from yeast cultures. Grants 2 and 5 were used for RNA sequencing. Grant 6 covered the salary of WRB. These funding bodies had no role in the design of the study, collection of the data, analysis of the results, or writing of the manuscript.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Author information

Authors and Affiliations

Evolutionary Genomics Group, Research Programme on Biomedical Informatics, Hospital del Mar Research Institute and Universitat Pompeu Fabra, Barcelona, Spain
William R. Blevins & M. Mar Albà
Single Cell Behavior Group, Department of Experimental and Health Sciences, Universitat Pompeu Fabra, Barcelona, Spain
William R. Blevins & Lucas B. Carey
Center for Quantitative Biology and Peking-Tsinghua Joint Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China
Lucas B. Carey
Catalan Institution for Research and Advanced Studies, Barcelona, Spain
M. Mar Albà

Authors

William R. Blevins
View author publications
You can also search for this author in PubMed Google Scholar
Lucas B. Carey
View author publications
You can also search for this author in PubMed Google Scholar
M. Mar Albà
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to William R. Blevins.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

Reprints and permissions

About this article

Cite this article

Blevins, W.R., Carey, L.B. & Albà, M.M. Transcriptomics data of 11 species of yeast identically grown in rich media and oxidative stress conditions. BMC Res Notes 12, 250 (2019). https://doi.org/10.1186/s13104-019-4286-0

Download citation

Received: 19 March 2019
Accepted: 25 April 2019
Published: 03 May 2019
DOI: https://doi.org/10.1186/s13104-019-4286-0

Transcriptomics data of 11 species of yeast identically grown in rich media and oxidative stress conditions

Abstract

Objective

Data description

Objective

Data description

Limitations

Abbreviations

References

Authors’ contributions

Acknowledgements

Competing interests

Availability of data materials

Consent for publication

Ethics approval and consent to participate

Funding

Publisher’s Note

Author information

Authors and Affiliations

Corresponding author

Rights and permissions

About this article

Cite this article

Keywords

BMC Research Notes

Contact us

Transcriptomics data of 11 species of yeast identically grown in rich media and oxidative stress conditions

Abstract

Objective

Data description

Objective

Data description

Limitations

Abbreviations

References

Authors’ contributions

Acknowledgements

Competing interests

Availability of data materials

Consent for publication

Ethics approval and consent to participate

Funding

Publisher’s Note

Author information

Authors and Affiliations

Corresponding author

Rights and permissions

About this article

Cite this article

Share this article

Keywords

BMC Research Notes

Contact us