Skip to main content

Whole genome analysis of Black Bengal goat from Savar Goat Farm, Bangladesh

Abstract

Objectives

Single nucleotide polymorphisms (SNPs) play critical roles in genetic diversity and disease. Many traits and diseases are linked with exonic SNPs that are significant for gene function, regulation or translation. This study focuses on SNPs that potentially act as the genetic basis for desirable traits in the Black Bengal Goat. This variety of goat is native to South Asia, and is identified as one of the most commercially important meat producing animals in the world. The aim of this study was to sequence the genome of Black Bengal Goats and identify SNPs that might play a significant role in determining meat quality in the organism. The study focuses on exonic SNPs for their greater likelihood of affecting the final translated protein product.

Results

Approximately 76,000 exonic variants were identified in the study. After filtration using a Wilcoxon test based score, the number came down to 49, 965 which were found to be distributed in 11,568 genes. The functional pathways affected by these variations included fatty acid metabolism and degradation, which are important processes that influence meat quality.

Introduction

The Black Bengal goat (Capra hircus) is a regional breed found in Bangladesh and eastern India. It is characterized by its smaller stature and is preferred in the livestock industry for higher reproductive rate and quality of meat and skin [1]. Meat quality in goats is measured by several indicators, the most important of which is the degree of tenderness. Previous studies have investigated the correlation between tenderization of meat and the expression of various genes [2]. The goal of this study was to identify specific genetic markers that may correlate to improved meat quality observed in the Black Bengal goat using a whole genome sequencing (WGS) approach. Variant calling and single nucleotide polymorphism (SNP) profiling was carried out on the sequenced data through comparison with previously shortlisted key genes so as to identify those variants that may affect these candidate genes and possibly confer changes that lead to desirable characteristics in the organism. The variants were also annotated to determine their downstream effect on transcriptional and translational processes. The rationale of this study was to identify SNPs, either unique changes or differences in the number of such variants, in various genes that may help in better understanding and characterizing the factors that determine meat quality in goats.

Main text

Methods

Blood sample from a female Black Bengal goat aged 4 years was collected from Savar Goat Farm. The blood samples were transported to the Department of Microbiology and Hygiene, Bangladesh Agricultural University (BAU) for DNA extraction and sample processing. All goats were handled following standard guidelines of Animal Welfare and Experimental Ethical Committee (AWEEC) of BAU [approval number AWEEC/BAU/2019 (13)].

DNA from the blood sample was extracted using QIAamp® DNA Blood Mini kit (QIAGEN, Germany) following manufacturer’s instructions. DNA purity was evaluated by NanoDrop 1000 Spectrophotometer (Life Technologies, CA, USA) and 0.8% agarose gel electrophoresis. Sequencing was carried out at the Genewiz Sequencing Centre in Suzhou, China using an Illumina HiSeq platform that produced 1,151,332 trimmed reads (phred quality > 30) and a 109-fold mean coverage.

For genome mapping ARS1 (accession no. GCA_001704415.1) was used as the reference genome and the MEM algorithm of Burrows–Wheeler Aligner (BWA) [3] was selected with the minimum mapping quality set to 20 and unmapped reads were filtered out using SAMtools [4]. Following this, read groups were added and PCR duplicates were marked using Picard (http://broadinstitute.github.io/picard/). For SNP profiling and variant calling, a modified version of the Genome Analysis Toolkit (GATK) best practices pipeline was followed (DePristo et al. [9]), Base recalibration was performed with BaseRecalibrator, and finally variants were called with HaplotypeCaller in Genomic Variant Call Format (GVCF) mode. Variants were further analysed on HaplotypeCaller based on several statistical parameters including QualByDepth (QD), Wilcoxon test scores and fisher test score for strand bias.

The variants were annotated with ANNOVAR (ANNOate VARiation) [5], with an inbuilt annotation database using gene annotation data from UCSC (University of Santa Cruz) Genome Browser [6]. In the next step, gene variants that have been previously implicated in meat tenderization were analyzed. Variants which passed the BaseQRankSumTest filter, a Mann–Whitney-Wilcoxon U-test were considered for further analysis. Filtration was carried out by the ReadPosRankSum values which are calculated based on the result of the BaseQRankSumTest for each variant. The Manhattan plot was generated using qqman package available on R [7]. Further functional annotation and pathway analysis of the final RNA editing sites was carried using the Database for Annotation, Visualization and Integrated Discovery (DAVID) [8].

Results

Genome variants belonging only to the exonic regions as per ANNOVAR’s algorithm [5], were considered to look into the functional regions of the genome. Exonic variants are more likely to affect the final protein product, possibly through alterations of gene function, expression, translation or even post-translational modifications. Following this filtration, 76,000 exonic variants were found and the number was reduced further to 49,965 after a second filtration based on Wilcoxon test scores as calculated by the BaseQSum calculator functionality of GATK’s variant annotator [9].

These 49,965 exonic variant sites were distributed in a total of 11,568 genes. These variants were categorized according to their effect on translation, i.e. synonymous, non-synonymous, stop-gain, frame-shift or unknown. To statistically analyze the association study, the u-based z approximation derived from the rank-sum test was used instead of the more conventional p-values.

GATK’s variant annotation functionalities were further used to calculate these values which are outputted as the ReadPosRankSum values. Negative values were omitted owing to those representing presence of the alternate alleles near the ends of reads which are often erroneous. The positive values were used to plot the frequency of variants per chromosome and used to speculate on possible chromosome-wise association between the variants and possible desired traits (Fig. 1).

Fig. 1
figure 1

Manhattan plot of U-based Z approximation scores of SNPs per chromosome. The graph shows SNPs and their associated ReadPosRankSum values for each chromosome

Previous studies have shown the association of certain gene products with the meat tenderization process, including calpain-1 (CAPN1), calpain-2 (CAPN2), calpastatin (CAST), caspase 3 (CASP3), caspase 9 (CASP9), αB-crystallin (CRYAB), heat shock protein 27 (Hsp27), heat shock protein 40 (Hsp40) and heat shock protein 70 (Hsp70) (Saccà et al., 2019). Among these, this study found variants in Hsp70, CASP3, CAPN1, CAPN2 and CAST encoding genes. Table 1 shows the positions and types of variants found in these genes.

Table 1 Variants found in genes previously implicated in meat tenderization process

The diversity among nonsynonymous substitutions is significantly lower than among synonymous substitutions since they are subject to higher selective pressures [10]. Further analysis of non-synonymous variants affecting the encoded proteins revealed a total of 20,879 non-synonymous variants that were distributed in 6850 genes. The gene ontology analysis provided some promising findings. In particular, the functional clustering analysis showed that the processes affected by the identified genes were fatty acid metabolism and fatty acid degradation. Both of these processes may hold significant contributions to the leanness and eventual quality of meat produced from the concerned specimen [11]. The third pathway with the most significant scores from the list of genes was the PPAR pathway (peroxisome proliferative-activated receptor) which also plays role in fatty acid metabolism. Table 2 shows these results alongside the associated statistical scores as calculated by DAVID’s in-built algorithm.

Table 2 Pathways affected by the identified SNPs and their associated statistical scores

Discussion

The main objective of the study was to systematically identify genomic variants underlying major phenotypic traits in Black Bengal goat obtained via whole genome-sequencing data.

A total of 11,564 genes were identified from a list of 47,241 exonic variants. Among these a total of 20,879 non-synonymous mutations were found in 6850 genes. The search for exonic variants were narrowed down to non-synonymous sites since they exert direct effect on the translated protein sequence and thus often mark polymorphisms that contribute to greater fitness and virulence potentials in the organism.

This study provides evidence regarding the significance of certain gene variants in influencing beneficial characteristics of the Black Bengal goat. Among these variants, two contained non-synonymous changes that led to alterations in the encoded amino acid. CAST contained a variant that would lead to an arginine being substituted for lysine at position 28. CAPN2 contained a change that would result in an aspartic acid being substituted by asparagine at position 508. In addition several variants in HSP70 protein were also found, along with considerable numbers of more in other HSP proteins. The HSP family has previously been heavily linked with the process of meat tenderization [2]. The lipid metabolism pathways being highlighted by the ontology analysis is another indication of specific variants to the observable characteristics of these organisms [11].

In conclusion, we would like to report this study as a preliminary insight into the possible genetic basis of improved meat quality in Black Bengal goats. The goat remains an important organism in the livestock industry and hence its understanding and improvement hold significant economic value. Further genomic characterization and analysis of this breed of goat with substantially larger sample size would be able to verify our initial findings and add new evidence that would allow a better discerning of the unique variants that might confer this species its desirable traits.

Limitations

In a number of previous studies association with particular gene/s and trait has been investigated and we investigated the association between such known genes and particular trait of Black Bengal Goat such as meat tenderization process using only one sample as carried out in previous studies [12, 13].

We have also carried out de novo identification of genes and their ontology although looking into more genomes might have provided greater insight and more statistical power. Although only one sample was used we took rigorous statistical approach to filter out spurious variants. However, this initial finding is also extremely important looking into the cost of sequencing technologies and in the context of a developing country like Bangladesh. Certainly further comparative genomics analysis using a larger number samples will be carried out based on these initial findings.

De novo genome assembly was not carried out due to lack of enough computational power in terms of RAM (random accessory memory) and this could have provided information on the identification of gene gain/loss compared to other goat genomes available in genome databases.

This is also to be noted that the outcome of this particular study using only one genome will certainly provide confidence to the scientific community in this region for in detailed genomics analysis of large eukaryotic genomes and facilitate further genomic research in resource poor settings like Bangladesh.

Availability of data and materials

The sequenced genome has been submitted to NCBI and the data has been made publicly available (SRA ID: SRR8928054, BioProject ID: PRJNA53351, BioSample ID: SAMN11458493).

Abbreviations

ANNOVAR:

ANNOtate VARiation

AWEEC:

Animal Welfare and Experimental Ethical Committee

BAU:

Bangladesh Agricultural University

BARC:

Bangladesh Agricultural Research Council

BWA:

Burrows–Wheeler Aligner

DAVID:

Database for Annotation, Visualization and Integrated Discovery

GATK:

Genome Analysis Toolkit

GVCF:

Genomic Variant Call Format

PPAR:

peroxisome proliferative-activated receptor

QD:

QualByDepth

RAM:

random accessory memory

SNP:

single nucleotide polymorphism

UCSC:

University of Santa Cruz

WGS:

whole genome sequencing

References

  1. Rout PK, Joshi MB, Mandal A, Laloe D, Singh L, Thangaraj K. Microsatellite-based phylogeny of Indian domestic goats. BMC Genet. 2008;9:11.

    Article  Google Scholar 

  2. Saccà E, Corazzin M, Bovolenta S, Piasentier E. Meat quality traits and the expression of tenderness-related genes in the loins of young goats at different ages. Animal. 2019;13(10):2419–28.

    Article  Google Scholar 

  3. Li H, Durbin R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics. 2009;25(14):1754–60.

    Article  CAS  Google Scholar 

  4. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, et al. The sequence alignment/map format and SAMtools. Bioinformatics. 2009;25(16):2078–9.

    Article  Google Scholar 

  5. Wang K, Li M, Hakonarson H. ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res. 2010;38(16):e164.

    Article  Google Scholar 

  6. Karolchik D, Hinrichs AS, Furey TS, Roskin KM, Sugnet CW, Haussler D, et al. The UCSC Table Browser data retrieval tool. Nucleic Acids Res. 2004;32:493–6.

    Article  Google Scholar 

  7. Turner SD. qqman: an R package for visualizing GWAS results using Q–Q and manhattan plots. J Open Source Softw. 2018;3:25.

    Article  Google Scholar 

  8. Huang DW, Sherman BT, Lempicki RA. Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat Protoc. 2009;4(1):44–57.

    Article  CAS  Google Scholar 

  9. Depristo MA, Banks E, Poplin R, Garimella KV, Maguire JR, Hartl C, et al. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat Genet. 2011;43(5):491–501.

    Article  CAS  Google Scholar 

  10. Ohta T. Synonymous and nonsynonymous substitutions in mammalian genes and the nearly neutral theory. J Mol Evol. 1995;40:56–63.

    Article  CAS  Google Scholar 

  11. Goetsch AL, Merkel RC, Gipson TA. Factors affecting goat meat production and quality. Small Rumin Res. 2011;101:173–81.

    Article  Google Scholar 

  12. Bickhart DM, Rosen BD, Koren S, Sayre BL, Hastie AR, Chan S, et al. Single-molecule sequencing and chromatin conformation capture enable de novo reference assembly of the domestic goat genome. Nat Genet. 2017;49(4):643–50.

    Article  CAS  Google Scholar 

  13. Dong Y, Xie M, Jiang Y, Xiao N, Du X, Zhang W, et al. Sequencing and automated whole-genome optical mapping of the genome of a domestic goat (Capra hircus). Nat Biotechnol. 2013;31(2):135–41. https://doi.org/10.1038/nbt.2478.

    Article  CAS  PubMed  Google Scholar 

Download references

Acknowledgements

The authors would like to thank Savar Goat Farm, Savar, Dhaka for providing Black Bengal goat for sample collection.

Funding

This work was supported by the Bangladesh Agricultural Research Council (BARC), Farmgate, Dhaka, Bangladesh. The funding supported the sequencing experiments, and data analysis.

Author information

Authors and Affiliations

Authors

Contributions

Conceptualization: SMZHC, KHMNHN, AK, MM, MRI, MH; Fund acquisition: SMZHC; Sample collection: AK, MMM, MRI; Investigation: SMZHC, KHMNHN; Experimentation: KHMNHN, AK, MMM, MRI; Resources: KHMNHN, MH; Supervision: SMZHC, KHMNHN, MH; Computational and Statistical Analysis: SH, MR, TT, TA, MH; Original Draft Preparation: SMZHC, KHMNHN, SH, AK, MMM, MRI; Reviewing and Editing: SMZHC, KHMNHN, AR, MH. All authors read and approved the final manuscript.

Corresponding authors

Correspondence to Shah Md. Ziqrul Haq Chowdhury, K. H. M. Nazmul Hussain Nazir or Maqsud Hossain.

Ethics declarations

Ethics approval and consent to participate

All goats were handled by following the standard guidelines of the AWEEC of BAU [Approval Number AWEEC/BAU/2019(13)].

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Chowdhury, S.M.Z.H., Nazir, K.H.M.N.H., Hasan, S. et al. Whole genome analysis of Black Bengal goat from Savar Goat Farm, Bangladesh. BMC Res Notes 12, 687 (2019). https://doi.org/10.1186/s13104-019-4700-7

Download citation

  • Received:

  • Accepted:

  • Published:

  • DOI: https://doi.org/10.1186/s13104-019-4700-7

Keywords