TGIF-DB: terse genomics interface for developing botany
BMC Research Notes volume 14, Article number: 181 (2021)
Pearl millet (Pennisetum glaucum) is a staple cereal crop for semi-arid regions. Its whole genome sequence and deduced putative gene sequences are available. However, the functions of many pearl millet genes are unknown. The situation is similar for other crop species such as garden asparagus (Asparagus officinalis), chickpea (Cicer arietinum) and Tartary buckwheat (Fagopyrum tataricum). The objective of the data presented here was to improve the functional annotations of genes of pearl millet, garden asparagus, chickpea and Tartary buckwheat using gene annotations of model plants, to provide these annotations and the corresponding sequences systematically on a website, and thereby to promote genomics for those crops.
Genome and transcript sequences of pearl millet, garden asparagus, chickpea and Tartary buckwheat were downloaded from public databases. These transcripts were associated with the functional annotations of their Arabidopsis thaliana and rice (Oryza sativa) counterparts identified by BLASTX. Conserved domains in the protein sequences of those species were identified by searching the Pfam database with hmmscan from the HMMER package. The resulting data were deposited in the figshare repository and can be browsed on the Terse Genomics Interface for Developing Botany (TGIF-DB) website (http://webpark2116.sakura.ne.jp/rlgpr/).
Pearl millet (Pennisetum glaucum) is a staple cereal crop for semi-arid regions. Its whole genome was sequenced and putative gene sequences were deduced. The functions of some pearl millet genes have been either examined experimentally or predicted from their homology to specific, targeted gene sets with known functions (for example, [2, 3]). However, functional annotations of the pearl millet genes are neither sufficient nor systematic. The situation is similar for many other plant species such as garden asparagus (Asparagus officinalis), chickpea (Cicer arietinum) and Tartary buckwheat (Fagopyrum tataricum) (see [4], [5] and [6], respectively, for analyses of their genomes). Arabidopsis thaliana and rice (Oryza sativa) are dicot and monocot model species, respectively, and have more complete functional annotations for their genes (for example, [7, 8]). The objective of the data presented here was to improve the functional annotations of genes of pearl millet by systematic homology searches against databases of Arabidopsis genes, rice genes and conserved protein domains, to develop a platform for browsing the resulting data, and thereby to promote pearl millet genomics.
The whole genome sequences, the transcript (or protein-coding) sequences deduced from them, and genome annotation files in the general feature format (GFF) for pearl millet, garden asparagus, chickpea and Tartary buckwheat were downloaded from the International Pearl Millet Genome Sequencing Consortium (IPMGSC) website, the Asparagus Genome Project website, the National Center for Biotechnology Information (NCBI) Chickpea Genome website (Genome ID 2992) and the MBKBASE Tartary Buckwheat Genome Project website, respectively. The sequences and functional annotations of Arabidopsis proteins (TAIR10 versions) were downloaded from The Arabidopsis Information Resource (TAIR) website, and those of rice (RGAP 7 versions) were downloaded from the Rice Genome Annotation Project (RGAP) website. BLASTX from the BLAST+ suite was run with the transcript sequences of those crop species as queries and with either the Arabidopsis protein sequences or the rice protein sequences as the database. The threshold E-value for this analysis was set to 1e-20, which is more stringent than the default value (10). The transcripts (or genes) of the crop species were then associated with the functional annotations of the corresponding Arabidopsis and rice proteins identified by the BLASTX search. Protein sequences of pearl millet, garden asparagus, chickpea and Tartary buckwheat were deduced from their transcript sequences, and the Pfam database was searched with the hmmscan program of HMMER (version 3.3) to identify conserved domains in those proteins. The threshold E-value for this analysis was set to 1e-5, which is more stringent than the default value (10). For each gene, a genomic locus sequence, consisting of exons and introns, and a promoter sequence, defined as the 3-kb sequence upstream of the start codon, were extracted on the basis of the whole genome sequences and the GFF files.
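The two most mechanical steps of this procedure, transferring annotations from BLASTX hits and cutting out a 3-kb promoter, can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes that the BLASTX results were written in the default 12-column tabular format (`-outfmt 6`, E-value in column 11), that only the hit with the lowest E-value is kept for each transcript, and that coding-region coordinates are 1-based and inclusive as in GFF. The function names (`best_hits`, `annotate`, `promoter`) and the example identifiers are hypothetical.

```python
import io

BLASTX_MAX_EVALUE = 1e-20  # BLASTX threshold stated in the text
PROMOTER_LENGTH = 3000     # 3-kb upstream region stated in the text


def best_hits(blast_tsv, max_evalue=BLASTX_MAX_EVALUE):
    """Map each query ID to (subject ID, E-value) of its lowest-E-value hit.

    Assumption: default BLAST+ tabular output, where column 1 is the query,
    column 2 the subject and column 11 the E-value.
    """
    best = {}
    for line in blast_tsv:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12 or line.startswith("#"):
            continue  # skip blank, comment or malformed lines
        query, subject, evalue = fields[0], fields[1], float(fields[10])
        if evalue > max_evalue:
            continue  # hit does not pass the threshold
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)
    return best


def annotate(hits, descriptions):
    """Attach the description of the best-hit protein to each query."""
    return {q: descriptions.get(s, "") for q, (s, _) in hits.items()}


def reverse_complement(seq):
    """Reverse-complement a DNA string (A, C, G, T, N in either case)."""
    return seq.translate(str.maketrans("ACGTacgtNn", "TGCAtgcaNn"))[::-1]


def promoter(chrom_seq, cds_start, cds_end, strand, length=PROMOTER_LENGTH):
    """Return up to `length` bases upstream of the start codon.

    cds_start and cds_end are 1-based inclusive coordinates on the forward
    strand. The result is shorter than `length` near a sequence end.
    """
    if strand == "+":
        begin = max(0, cds_start - 1 - length)
        return chrom_seq[begin:cds_start - 1]
    # Minus strand: the start codon sits at the high-coordinate end, so the
    # upstream region follows cds_end and must be reverse-complemented.
    end = min(len(chrom_seq), cds_end + length)
    return reverse_complement(chrom_seq[cds_end:end])


if __name__ == "__main__":
    # Made-up example row; real input would be a BLASTX output file.
    row = "tx1\tAT1G01010.1\t80.0\t100\t20\t0\t1\t300\t1\t100\t1e-30\t200\n"
    hits = best_hits(io.StringIO(row))
    print(annotate(hits, {"AT1G01010.1": "example description"}))
    print(promoter("AACCGGTTACGT", 5, 8, "+", length=3))
```

Keeping only one hit per transcript is a simplification made for the sketch. The text states only that transcripts were associated with the proteins identified by BLASTX, so a pipeline that retains several hits per transcript would be equally consistent with it.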
The resulting gene sequences and functional annotations for pearl millet, garden asparagus, chickpea and Tartary buckwheat were deposited in the figshare repository (Data sets 1–34 in Table 1). A website, Terse Genomics Interface for Developing Botany (TGIF-DB), was developed to browse these data (see Data file 1 in Table 1 for the TGIF-DB interface). The programs in the BLAST+ suite and the genome browser JBrowse were included as part of TGIF-DB.
Some proteins of the species used do not appear to have conserved domains and/or any close homolog in either Arabidopsis or rice.
Availability of data and materials
The datasets generated and/or analysed during the current study are available in the figshare repository, https://doi.org/10.6084/m9.figshare.13565168.v2. The data can be browsed on the TGIF-DB website, http://webpark2116.sakura.ne.jp/rlgpr/. Please see Table 1 and references [10, 11] for details and links to the data.
GFF: General feature format
Varshney RK, Shi C, Thudi M, Mariac C, Wallace J, Qi P, et al. Pearl millet genome sequence provides a resource to improve agronomic traits in arid environments. Nat Biotechnol. 2017;35:969–76.
Shinde H, Dudhate A, Tsugama D, Gupta SK, Liu S, Takano T. Pearl millet stress-responsive NAC transcription factor PgNAC21 enhances salinity stress tolerance in Arabidopsis. Plant Physiol Biochem. 2019;135:546–53.
Chanwala J, Satpati S, Dixit A, Parida A, Giri MK, Dey N. Genome-wide identification and expression analysis of WRKY transcription factors in pearl millet (Pennisetum glaucum) under dehydration and salinity stress. BMC Genomics. 2020;21:231.
Harkess A, Zhou J, Xu C, Bowers JE, Van der Hulst R, Ayyampalayam S, et al. The asparagus genome sheds light on the origin and evolution of a young Y chromosome. Nat Commun. 2017;8:1279.
Varshney RK, Song C, Saxena RK, Azam S, Yu S, Sharpe AG, et al. Draft genome sequence of chickpea (Cicer arietinum) provides a resource for trait improvement. Nat Biotechnol. 2013;31:240–6.
Zhang L, Li X, Ma B, Gao Q, Du H, Han Y, et al. The Tartary buckwheat genome provides insights into rutin biosynthesis and abiotic stress tolerance. Mol Plant. 2017;10:1224–37.
Berardini TZ, Reiser L, Li D, Mezheritsky Y, Muller R, Strait E, Huala E. The Arabidopsis information resource: making and mining the “gold standard” annotated reference plant genome. Genesis. 2015;53:474–85.
Kawahara Y, de la Bastide M, Hamilton JP, Kanamori H, McCombie WR, Ouyang S, et al. Improvement of the Oryza sativa Nipponbare reference genome using next generation sequence and optical map data. Rice. 2013;6:4.
Varshney RK, Shi C, Thudi M, Mariac C, Wallace J, Qi P, et al. International pearl millet genome sequencing consortium (IPMGSC). 2021. http://cegsb.icrisat.org/ipmgsc. Accessed 14 Jan 2021.
Asparagus Genome Project. 2021. http://asparagus.uga.edu/tripal. Accessed 16 Mar 2021.
NCBI genome for Cicer arietinum (chickpea). 2021. https://www.ncbi.nlm.nih.gov/genome/2992. Accessed 16 Mar 2021.
MBKBASE, introduction to tartary buckwheat genome project. 2021. http://mbkbase.org/Pinku1. Accessed 16 Mar 2021.
Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, Madden TL. BLAST+: architecture and applications. BMC Bioinform. 2009;10:421.
El-Gebali S, Mistry J, Bateman A, Eddy SR, Luciani A, Potter SC, et al. The Pfam protein families database in 2019. Nucleic Acids Res. 2019;47:D427–32.
Mistry J, Finn RD, Eddy SR, Bateman A, Punta M. Challenges in homology search: HMMER3 and convergent evolution of coiled-coil regions. Nucleic Acids Res. 2013;41:e121.
Tsugama D. TGIF-DB datasets. figshare. 2021. https://doi.org/10.6084/m9.figshare.13565168.v5.
Tsugama D. TGIF-DB. http://webpark2116.sakura.ne.jp/rlgpr. Accessed 14 Jan 2021.
Buels R, Yao E, Diesh CM, Hayes RD, Munoz-Torres M, Helt G, et al. JBrowse: a dynamic web platform for genome visualization and analysis. Genome Biol. 2016;17:66.
The authors greatly appreciate the data and advice provided by Dr. Shashi Kumar Gupta and his colleagues at the International Crops Research Institute for the Semi-Arid Tropics (ICRISAT). The authors thank their colleagues for testing earlier versions of TGIF-DB.
This work was supported by JSPS (Japan Society for the Promotion of Science) KAKENHI grants (Grant Numbers 19KK0155 and 19K15827).
Ethics approval and consent to participate
Consent for publication
The authors declare that they have no competing interests.
Cite this article
Tsugama, D., Takano, T. TGIF-DB: terse genomics interface for developing botany. BMC Res Notes 14, 181 (2021). https://doi.org/10.1186/s13104-021-05599-4
- Garden asparagus
- Pearl millet
- Tartary buckwheat