
  • Data Note
  • Open Access

EcoBrowser: a web-based tool for visualizing transcriptome data of Escherichia coli

BMC Research Notes 2011, 4:405

  • Received: 1 April 2011
  • Accepted: 13 October 2011
  • Published:



Escherichia coli has been extensively studied as a prokaryotic model organism whose whole genome was determined in 1997. However, it is difficult to identify all the gene products involved in diverse functions by using whole genome sequences alone. High-resolution transcriptome mapping using tiling arrays has proved effective in improving the annotation of transcript units and discovering new ncRNA transcripts. While abundant tiling array data have been generated, appropriate visualization tools that accommodate and integrate multiple sources of data are still lacking.


EcoBrowser is a web-based tool for visualizing genome annotations and transcriptome data of E. coli. Important tiling array data of E. coli from different experimental platforms are collected and processed for query. An AJAX-based genome browser is embedded for visualization. Thus, genome annotations can be compared with transcript profiling and genome occupancy profiling from independent experiments. This helps in discovering new transcripts, including novel mRNAs and ncRNAs, and in generating a detailed description of the transcription unit architecture. It also provides clues for investigating prokaryotic transcriptional regulation, which has proved to be far more complex than previously thought.


With the help of EcoBrowser, users can get a systemic view both from the vertical and parallel sides, as well as inspirations for the design of new experiments which will expand our understanding of the regulation mechanism.


  • Genome Annotation
  • Genome Browser
  • Transcriptome Data
  • Transcription Unit
  • Tiling Array


In the past decade, advances in high-throughput sequencing technologies have had a huge impact on microbiology, providing a fast and economical means of determining whole genome sequences of bacteria [1]. For instance, most of the completed genome-sequencing projects currently listed in the Genomes OnLine Database are microbial. Once sequenced, a genome needs to be annotated by identifying the locations and functions of its genes. In particular, the in-depth organizational structure of bacterial genomes still needs to be fully elucidated.

Escherichia coli has been widely used as a prokaryotic model organism whose whole genome was sequenced as early as 1997 [2]. Information about its genes, proteins, intergenic regions and biochemical machinery has been collected in well-known databases, including EcoGene, EcoCyc and EcoliWiki [3-5]. However, identifying all the gene products involved in diverse functions has proved difficult to accomplish based solely on whole genome sequences. Thus, microarray data serve as useful complementary information for functional genomics. Some databases, such as GenExpDB [6], are built on microarray data. GenExpDB brings together an extensive collection of gene expression data from the E. coli community, so that gene expression levels under different conditions and on different platforms can be easily compared. Recent advances in biology suggest a widespread involvement of non-coding RNA in transcriptional regulation, but gene microarrays are designed to cover only the coding regions of the genome, and many new techniques aim to investigate the regulation of non-coding regions. As an unbiased tool for investigating protein binding, gene expression and gene structure on a genome-wide scale, tiling arrays have improved the annotation of transcript units and enabled the discovery of many new non-coding and natural antisense RNA transcripts [7, 8]. While abundant tiling array data have been generated, appropriate visualization tools that accommodate and integrate multiple sources of data are lacking. Widely used genome browsers such as the UCSC Genome Browser and Ensembl Bacteria reload the entire page on every action [9, 10]. These discontinuous page transitions impair the user's sense of which genomic locus they are viewing and how the displayed data points relate to one another. In addition, because tiling array datasets are usually very large, uploading them to the browser server and displaying them is time consuming.

We therefore built EcoBrowser, a web-based tool for searching genome annotations and visualizing them alongside transcriptome expression profiles of E. coli. The major difference between EcoBrowser and GenExpDB is that GenExpDB focuses on gene expression data, whereas EcoBrowser focuses on visualizing whole-transcriptome mapping data such as tiling arrays, so the expression levels of both coding and non-coding regions can be included for further integrative analysis. Expression values are transformed into shades of blue for drawing the heatmaps. The whole-genome heatmaps are pre-rendered as image tiles at multiple zoom levels and stored on the server. With AJAX technology, smooth panning and zooming are achieved by dynamically changing the positional offset of these tiles and fetching new tile images when necessary, without reloading the whole page. Thus, expression patterns from independent experiments can be compared directly with genome annotations on a genome-wide scale, which helps in discovering new transcripts and non-coding RNAs and in generating a detailed description of the transcription unit architecture. It could also provide clues for further investigation of condition-specific transcriptional regulation.
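The tile-based panning described above can be illustrated with a minimal sketch. It is not taken from the EcoBrowser or JBrowse source code: the tile width of 256 pixels, the `Viewport` class and the function names are all hypothetical, chosen only to show how a client might work out which pre-rendered tiles overlap the visible region, where to place them, and which ones still have to be fetched.

```python
from dataclasses import dataclass
from typing import List, Set

TILE_WIDTH_PX = 256  # assumed tile width in pixels (hypothetical value)


@dataclass(frozen=True)
class Viewport:
    start_bp: int   # leftmost genome coordinate in view (0-based)
    width_px: int   # width of the visible area in pixels
    bp_per_px: int  # zoom level, expressed as base pairs per pixel


def visible_tiles(view: Viewport, tile_width_px: int = TILE_WIDTH_PX) -> List[int]:
    """Return the indices of the pre-rendered tiles that overlap the viewport."""
    bp_per_tile = tile_width_px * view.bp_per_px
    end_bp = view.start_bp + view.width_px * view.bp_per_px  # exclusive end
    first = view.start_bp // bp_per_tile
    last = (end_bp - 1) // bp_per_tile
    return list(range(first, last + 1))


def tile_offset_px(view: Viewport, tile_index: int,
                   tile_width_px: int = TILE_WIDTH_PX) -> int:
    """Horizontal offset of a tile relative to the left edge of the viewport."""
    bp_per_tile = tile_width_px * view.bp_per_px
    return (tile_index * bp_per_tile - view.start_bp) // view.bp_per_px


def tiles_to_fetch(view: Viewport, cached: Set[int]) -> List[int]:
    """Tiles that are visible but not yet held by the client."""
    return [t for t in visible_tiles(view) if t not in cached]


# Panning 1000 bp to the right shifts the existing tiles and needs one new tile.
before = Viewport(start_bp=0, width_px=512, bp_per_px=10)
after = Viewport(start_bp=1000, width_px=512, bp_per_px=10)
cached = set(visible_tiles(before))
new_tiles = tiles_to_fetch(after, cached)
```

In this sketch, panning only changes the offsets of tiles already on the client and requests the tiles that newly enter the view, which is why no full page reload is needed.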



EcoBrowser is composed of a web interface, a database and an AJAX-based genome browser [11]. The user interface is written in Perl, using Perl's Common Gateway Interface (CGI) module, and styled with Cascading Style Sheets (CSS). The database integrates information on identified genes and transcription units obtained from NCBI, EcoCyc and EcoGene [3, 4, 12]. The transcription unit annotation of E. coli from a recent study is also included [8]. Gene symbols, gene IDs, transcription unit IDs and modular unit IDs can be queried. All transcriptome datasets were downloaded from the Gene Expression Omnibus (GEO). Currently, EcoBrowser contains 67 tiling arrays from five publications; descriptions of the data used for the tracks can be found on the "Help" page [8, 13-16]. The transcriptome data are displayed as genome-based heatmaps, rendered into a series of images with the statistical language R. To make the results from different platforms comparable, we calculate the relative signal (ranging from 0 to 1) using the following formula:
$$S_{relative} = \frac{S_i - \min S}{\max S - \min S}$$

where $S_i$ is the signal value of the ith gene and $S$ represents $[S_1, S_2, \ldots, S_n]$, where n is the number of genes. The shade of blue represents the relative expression level of the probes, which continuously cover the entire genome in each track. JBrowse is used to navigate through the gene and transcription unit predictions [11, 17]. The AJAX-based browser offers faster and smoother navigation through the genome without reloading the page. The genome annotations are rendered on the client side, while the transcriptome expression heatmaps are pre-rendered and stored on the server.
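The relative-signal formula is a min-max rescaling, and a short sketch may make it concrete. The function names below are hypothetical. The linear white-to-blue colour ramp and the handling of a constant track are assumptions made for illustration; the text only states that the shade of blue reflects the relative expression level, and the actual rendering was done in R.

```python
from typing import List, Sequence, Tuple


def relative_signal(signals: Sequence[float]) -> List[float]:
    """Rescale signals to [0, 1] as (S_i - min S) / (max S - min S)."""
    if not signals:
        return []
    lo, hi = min(signals), max(signals)
    if hi == lo:
        # Assumption: a constant track carries no contrast, so map it to 0.
        return [0.0 for _ in signals]
    span = hi - lo
    return [(s - lo) / span for s in signals]


def blue_shade(rel: float) -> Tuple[int, int, int]:
    """Map a relative signal to an RGB colour: white at 0, pure blue at 1.

    The linear ramp is an illustrative assumption, not the published palette.
    """
    rel = min(1.0, max(0.0, rel))  # clamp to the valid range
    fade = round(255 * (1.0 - rel))
    return (fade, fade, 255)


# Example: three signal values from one hypothetical array.
rel = relative_signal([2.0, 4.0, 6.0])
colours = [blue_shade(r) for r in rel]
```

Because each array is rescaled by its own minimum and maximum, tracks from different platforms end up on the same 0 to 1 scale, which is what makes a side-by-side visual comparison of the heatmaps possible.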

Results and Discussion

EcoBrowser provides a user-friendly interface. Users can select genomic regions of interest (e.g. via gene or locus IDs) and then select the transcriptome data to be displayed simultaneously on the search page. Taking a well-studied heat shock gene, groS (b4142), as an example, information on the identified genes or transcription units is returned by clicking the "Search" button, and the list of optional datasets and annotations appears on clicking the "display" button. EcoBrowser includes two types of transcriptome data generated by tiling arrays: transcript expression profiling (e.g. RNA_heat, RNA_logphase) and genome binding/occupancy profiling (e.g. GB_heat, GB_logphase, GB_logphase_rif). Here we choose the datasets RNA_heat, RNA_logphase, GB_heat, GB_logphase and GB_logphase_rif, together with the gene location. More details are available on the help page. After clicking the "browse selected" button, the selected datasets and annotations are visualized at the position of the selected gene entry (Figure 1). Users can also add or remove tracks to dynamically generate customized views. Hence, transcriptome data from different sources and under various conditions can be compared in a straightforward manner.
Figure 1

A snapshot of EcoBrowser. The snapshot displays the gene location and transcriptome data. The tracks in the left panel can be dynamically added and removed by dragging. The shade of blue represents the relative expression level of the probes, and the descriptions of the tracks are on the "help" page.

In the case of groS (b4142) and groL (b4143), two adjacent genes belonging to the same operon, the genes are shown to be co-expressed in the tracks RNA_heat_plus and RNA_logphase_plus. RNA polymerase (RNAP) binds to the groS and groL gene regions after pulses of heat (GB_heat) but not in log phase (GB_logphase). This indicates, first, that the transcription of groS and groL is activated by the heat pulse and, second, that the transcripts of groS and groL are still maintained at a high level in log phase owing to their essential role in protein maintenance and cell growth. By additionally considering the static map of Rifampicin-induced RNAP-binding promoter regions (GB_logphase_rif), users can gain a better understanding of the process of groS and groL transcription. More findings could be revealed by extending the analysis to more genes across the whole genome, as well as to more species.

About 80 of the hundreds of sRNA candidates predicted in silico have been experimentally validated in E. coli. However, many more predicted sRNAs located in intergenic regions show a high expression level in EcoBrowser. A recent paper identified 10 new non-coding sRNAs of E. coli using a genome-wide deep-sequencing approach; 9 of them display a clearly high expression level in EcoBrowser (details in the supplementary material, additional file 1) [18]. Thus, biologists can use EcoBrowser as a reference before the experimental validation of a new sRNA candidate. We have collected sRNA predictions for E. coli from several papers to help users make use of the browser more effectively [19-23]. The prediction information is on the "Help" page.
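As a rough illustration of using expression data to screen predicted sRNAs before validation, the sketch below averages a relative-signal track over each candidate interval and keeps the candidates above a cutoff. In EcoBrowser this comparison is made visually; the per-base track representation, the function names, the candidate names and the cutoff of 0.5 are all hypothetical assumptions, since the text does not define a numerical criterion for "high expression".

```python
from typing import Dict, List, Sequence, Tuple

Interval = Tuple[int, int]  # 0-based, half-open genome coordinates (assumed)


def mean_signal(track: Sequence[float], interval: Interval) -> float:
    """Average relative signal of a track over one genomic interval."""
    start, end = interval
    window = track[start:end]
    if not window:
        raise ValueError("interval is empty or outside the track")
    return sum(window) / len(window)


def highly_expressed(candidates: Dict[str, Interval],
                     track: Sequence[float],
                     threshold: float = 0.5) -> List[str]:
    """Names of candidates whose mean relative signal reaches the cutoff.

    The default threshold is an arbitrary illustrative value.
    """
    return [name for name, interval in candidates.items()
            if mean_signal(track, interval) >= threshold]


# Toy track: low signal, then a strongly expressed stretch, then low again.
track = [0.1] * 10 + [0.9] * 10 + [0.2] * 10
candidates = {"cand_A": (10, 20), "cand_B": (20, 30)}
hits = highly_expressed(candidates, track)
```

A screen of this kind can only prioritize candidates; as noted above, the expression evidence serves as a reference ahead of experimental validation, not as a replacement for it.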


The EcoBrowser is a valuable tool for researchers. With the help of the integrated genome browser, users can also get a systemic view both from the vertical and parallel sides, as well as inspirations for the design of new experiments which will expand our understanding of the regulation mechanism. Next generation datasets, such as RNA-seq, will also be included in the future when the next generation sequencing technologies have been extensively applied.

Availability and requirements

Project name: EcoBrowser project

Project home page:

Operating systems: Platform independent

Programming language: JavaScript, CSS, CGI

Other requirements: None



This work was supported by grants from the State Key Basic Research Program (973): 2010CB910200, 2010CB529200; and the Research Program of CAS: KSCX2-YW-R-112.

Authors’ Affiliations

Bioinformatics Center, Key Lab of Systems Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, 320 Yueyang Road, Shanghai, 200031, China
Shanghai Center for Bioinformation Technology, 100 Qinzhou Road, Shanghai, 200235, China
College of Life Science and Biotechnology, Shanghai Jiaotong University, Shanghai, 200120, China
Zilkha Neurogenetic Institute, Department of Psychiatry and Preventive Medicine, University of Southern California, Los Angeles, California 90089, USA


  1. MacLean D, Jones JD, Studholme DJ: Application of 'next-generation' sequencing technologies to microbial genetics. Nat Rev Microbiol. 2009, 7 (4): 287-296.
  2. Blattner FR, Plunkett G, Bloch CA, Perna NT, Burland V, Riley M, Collado-Vides J, Glasner JD, Rode CK, Mayhew GF: The complete genome sequence of Escherichia coli K-12. Science. 1997, 277 (5331): 1453-1462. 10.1126/science.277.5331.1453.
  3. Rudd KE: EcoGene: a genome sequence database for Escherichia coli K-12. Nucleic Acids Res. 2000, 28 (1): 60-64. 10.1093/nar/28.1.60.
  4. Keseler IM, Collado-Vides J, Santos-Zavaleta A, Peralta-Gil M, Gama-Castro S, Muniz-Rascado L, Bonavides-Martinez C, Paley S, Krummenacker M, Altman T: EcoCyc: a comprehensive database of Escherichia coli biology. Nucleic Acids Res. 2011, 39 (Database issue): D583-590.
  5. EcoliWiki. []
  6. GenExpDB. []
  7. Kampa D, Cheng J, Kapranov P, Yamanaka M, Brubaker S, Cawley S, Drenkow J, Piccolboni A, Bekiranov S, Helt G: Novel RNAs identified from an in-depth analysis of the transcriptome of human chromosomes 21 and 22. Genome Res. 2004, 14 (3): 331-342. 10.1101/gr.2094104.
  8. Cho BK, Zengler K, Qiu Y, Park YS, Knight EM, Barrett CL, Gao Y, Palsson BO: The transcription unit architecture of the Escherichia coli genome. Nat Biotechnol. 2009, 27 (11): 1043-1049. 10.1038/nbt.1582.
  9. UCSC Genome Browser. []
  10. EnsemblBacteria. []
  11. Skinner ME, Uzilov AV, Stein LD, Mungall CJ, Holmes IH: JBrowse: a next-generation genome browser. Genome Res. 2009, 19 (9): 1630-1638. 10.1101/gr.094607.109.
  12. Sayers EW, Barrett T, Benson DA, Bolton E, Bryant SH, Canese K, Chetvernin V, Church DM, DiCuccio M, Federhen S: Database resources of the National Center for Biotechnology Information. Nucleic Acids Res. 2011, 39 (Database issue): D38-51.
  13. Thomassen GO, Weel-Sneve R, Rowe AD, Booth JA, Lindvall JM, Lagesen K, Kristiansen KI, Bjoras M, Rognes T: Tiling array analysis of UV treated Escherichia coli predicts novel differentially expressed small peptides. PLoS One. 2010, 5 (12): e15356. 10.1371/journal.pone.0015356.
  14. Thomassen GO, Rowe AD, Lagesen K, Lindvall JM, Rognes T: Custom design and analysis of high-density oligonucleotide bacterial tiling microarrays. PLoS One. 2009, 4 (6): e5943. 10.1371/journal.pone.0005943.
  15. Mooney RA, Davis SE, Peters JM, Rowland JL, Ansari AZ, Landick R: Regulator trafficking on bacterial transcription units in vivo. Mol Cell. 2009, 33 (1): 97-108. 10.1016/j.molcel.2008.12.021.
  16. Peters JM, Mooney RA, Kuan PF, Rowland JL, Keles S, Landick R: Rho directs widespread termination of intragenic and stable RNA transcription. Proc Natl Acad Sci USA. 2009, 106 (36): 15406-15411. 10.1073/pnas.0903846106.
  17. Skinner ME, Holmes IH: Setting up the JBrowse genome browser. Curr Protoc Bioinformatics. 2010, Chapter 9: Unit 9.13.
  18. Raghavan R, Groisman EA, Ochman H: Genome-wide detection of novel regulatory RNAs in E. coli. Genome Res. 2011
  19. Argaman L, Hershberg R, Vogel J, Bejerano G, Wagner EG, Margalit H, Altuvia S: Novel small RNA-encoding genes in the intergenic regions of Escherichia coli. Curr Biol. 2001, 11 (12): 941-950. 10.1016/S0960-9822(01)00270-6.
  20. Rivas E, Klein RJ, Jones TA, Eddy SR: Computational identification of noncoding RNAs in E. coli by comparative genomics. Curr Biol. 2001, 11 (17): 1369-1373. 10.1016/S0960-9822(01)00401-8.
  21. Chen S, Lesnik EA, Hall TA, Sampath R, Griffey RH, Ecker DJ, Blyn LB: A bioinformatics based approach to discover small RNA genes in the Escherichia coli genome. Biosystems. 2002, 65 (2-3): 157-177. 10.1016/S0303-2647(02)00013-8.
  22. Yachie N, Numata K, Saito R, Kanai A, Tomita M: Prediction of non-coding and antisense RNA genes in Escherichia coli with Gapped Markov Model. Gene. 2006, 372: 171-181.
  23. Tran TT, Zhou F, Marshburn S, Stead M, Kushner SR, Xu Y: De novo computational prediction of non-coding RNA genes in prokaryotic genomes. Bioinformatics. 2009, 25 (22): 2897-2905. 10.1093/bioinformatics/btp537.


© Li et al; licensee BioMed Central Ltd. 2011

This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.