Genome-wide linkage search for cancer susceptibility loci in a cohort of non BRCA1/2 families in Sri Lanka
BMC Research Notes volume 15, Article number: 190 (2022)
Although linkage studies have been used worldwide to identify variants associated with cancer, little is known about their role in non-BRCA1/2 individuals in the Sri Lankan population. Hence, we performed linkage analysis to identify susceptibility loci related to the inherited risk of cancer in a cohort of Sri Lankans affected with hereditary breast cancer. Genotyping with the Illumina Global Screening Array, which has 654,027 single nucleotide polymorphism markers, was performed in four families in which at least three individuals within third-degree relatives were affected by breast cancer. Two-point parametric linkage analysis was conducted assuming a disease allele frequency of 1%. Penetrance was set at 90% for carriers with a 10% phenocopy rate.
Thirty-one variants exhibited genome-wide suggestive HLODs. The top overall HLOD score was at rs1856277, an intronic variant in MYO16 on chromosome 13. The two most informative families also suggested several candidate linked loci in genes, including ERAP1, RPRM, WWOX, CDH1, EXOC1, HUS1B, STIM1 and TUSC1. This study provides the first step in identifying germline variants that may be involved in risk of cancer in cancer-aggregated non-BRCA1/2 families from the understudied Sri Lankan population. Several candidate linked regions showed suggestive evidence of linkage to cancer risk.
Inheritance of cancers among individuals in high-risk families can be explained by significant familial aggregation of high-, moderate- or low-penetrance genetic variants in cancer predisposing genes (CPGs) that are transmitted down the generations in each family. Breast cancer has become one of the leading causes of death worldwide. It is estimated that 5–10% of breast cancer patients have a hereditary predisposition and harbor germline high-, moderate- or low-risk variants in CPGs. Many studies have revealed that a significant proportion of families with many affected cases are not associated with variants in known CPGs such as BRCA1 and BRCA2. However, families negative after breast cancer diagnostics rarely fulfill breast cancer screening criteria, mostly because of a later onset or reduced penetrance. It is also possible that there are further loci conferring more substantial risk that could be detected. In such instances, genome-wide association studies (GWAS) have been used to find common genetic variants associated with individually small but additive risks of developing breast cancer in families that are unlikely to be segregating BRCA1 and BRCA2 pathogenic variants. The cancer susceptibility genes identified so far can explain only up to 5% of all cases, while familial clustering is also seen in other cancer-affected cases who have been identified as variant negative. There is, however, a dearth of knowledge and understanding of the genes responsible for the variant-negative affected cases who exhibit evidence of hereditary cancer predisposition among their family members in the Sri Lankan population. This deficiency in knowledge has also resulted in suboptimal management of individuals who are at risk of inherited cancer syndromes.
This is the first linkage analysis study conducted in families affected with cancer in the Sri Lankan population. A genome-wide linkage (GWL) scan was performed using data from 48 individuals from 4 cancer families, aiming to evaluate the possibility of identifying susceptibility loci conferring breast and other cancer predisposition.
Materials and methods
Index cases recruited into this study were women affected with breast cancer who had a family history of breast cancer but had been identified as having no variants in CPGs (Additional file 1: Figures S1–S4). They also tested negative on the multiplex ligation-dependent probe amplification (MLPA) assay. Three of the 4 index cases in the families studied had an age at diagnosis of breast cancer of less than 50 years (Additional file 2: Table S1). Each index case also had at least 2 relatives who were affected with breast cancer. In two of the families, multiple family members were also affected with cancer at other sites. A total of 21 family members were diagnosed with cancer in addition to the index cases. The index cases and 44 of their affected and unaffected relatives enrolled in the study and provided biospecimens for genotyping (Additional file 2: Table S2).
Genotyping of 48 individuals was performed at the Australian Genomic Research Facility (AGRF) (Melbourne, Australia) using the Illumina Global Screening Array, which has 654,027 single nucleotide polymorphism (SNP) markers. Quality assessment of the samples was performed using QuantiFluor. The genome-wide content was selected for high imputation accuracy at minor allele frequencies of > 1% across all 26 populations of the 1000 Genomes Project.
The software program PLINK was used to perform quality control on the data. We removed all monomorphic variants and all variants that were not genotyped in at least 95% of the subjects. Variants with Mendelian inconsistencies were removed from the offending family. Identity-by-descent (IBD) calculations were used to confirm all familial relationships within the four pedigrees. The final dataset contained 236,142 total variants for 69 individuals across the four families, 44 of whom had genotype data. Of the 44 genotyped individuals, 18 were affected with cancer.
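The two variant-level filters described above (a genotyping rate of at least 95% and removal of monomorphic variants) can be sketched in a few lines of Python. This is an illustrative sketch only, not the PLINK workflow used in the study: the function names, the 0/1/2 genotype coding and the toy data are assumptions, and the Mendelian-inconsistency and IBD checks are not covered.

```python
from typing import Dict, List, Optional

# Hypothetical coding: 0, 1 or 2 copies of the alternate allele; None = not genotyped.
Genotype = Optional[int]


def call_rate(genotypes: List[Genotype]) -> float:
    """Fraction of subjects with a non-missing genotype at one variant."""
    if not genotypes:
        return 0.0
    return sum(g is not None for g in genotypes) / len(genotypes)


def is_monomorphic(genotypes: List[Genotype]) -> bool:
    """True when only one allele is observed among the called genotypes."""
    called = [g for g in genotypes if g is not None]
    alt_alleles = sum(called)
    return alt_alleles == 0 or alt_alleles == 2 * len(called)


def qc_filter(variants: Dict[str, List[Genotype]],
              min_call_rate: float = 0.95) -> Dict[str, List[Genotype]]:
    """Keep variants genotyped in at least min_call_rate of subjects
    and showing both alleles."""
    return {
        name: genos
        for name, genos in variants.items()
        if call_rate(genos) >= min_call_rate and not is_monomorphic(genos)
    }


# Toy data for 20 hypothetical subjects (not study data).
demo = {
    "snp_ok": [0, 1, 2, 1] * 5,                      # fully called, polymorphic
    "snp_mono": [0] * 20,                            # only one allele observed
    "snp_lowcall": [0, 1] * 9 + [None, None],        # 18/20 called = 90%
    "snp_edge": [1, 0] * 9 + [2, None],              # 19/20 called = 95%
}
kept = qc_filter(demo)
print(sorted(kept))
```

In this toy set, the monomorphic variant and the variant called in only 90% of subjects are dropped, while the variant called in exactly 95% of subjects is retained.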
Parametric linkage analysis
We performed two-point parametric linkage analysis on these data using Merlin, which uses the Lander-Green algorithm to calculate linkage. We assumed an autosomal dominant model (the mode of inheritance was inferred from the pedigrees) with a disease allele frequency of 1%. Penetrance was set at 90% for carriers with a 10% phenocopy rate. LOD scores were calculated for each of the four individual families, and heterogeneity LOD (HLOD) scores were calculated across families. All variants were annotated using wANNOVAR [10, 11].
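To illustrate how per-family LOD scores combine into an HLOD, the sketch below uses the standard admixture formulation, in which the HLOD at a marker is the maximum over α of Σ log10(α · 10^LOD_i + 1 − α), where α is the proportion of linked families and LOD_i is the score in family i. It is a minimal sketch rather than Merlin's implementation: the function name `hlod`, the grid search over α and the example scores are assumptions made for illustration.

```python
import math
from typing import Sequence, Tuple


def hlod(family_lods: Sequence[float], steps: int = 1000) -> Tuple[float, float]:
    """Return (HLOD, alpha) for one marker from per-family LOD scores.

    Grid-searches alpha, the assumed proportion of linked families, over
    [0, 1]. LOD scores are assumed to be finite.
    """
    best_hlod, best_alpha = 0.0, 0.0  # alpha = 0 always gives an HLOD of 0
    for k in range(steps + 1):
        alpha = k / steps
        total = sum(
            math.log10(alpha * 10.0 ** lod + (1.0 - alpha))
            for lod in family_lods
        )
        if total > best_hlod:
            best_hlod, best_alpha = total, alpha
    return best_hlod, best_alpha


# Hypothetical scores: one family supports linkage, one argues against it.
score, alpha = hlod([2.0, -2.0])
print(f"HLOD = {score:.2f} at alpha = {alpha:.2f}")
```

With these two hypothetical families, simple summation of the LOD scores gives 0, whereas the HLOD is about 1.41 at α = 0.5. This is why heterogeneity scores can retain a signal that is present in only some families.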
Thirty-one variants exhibited genome-wide suggestive HLODs (Fig. 1, Table 1). The top overall HLOD score was at rs1856277, an intronic variant in MYO16 on chromosome 13. There were 13 HLOD scores greater than 2.00.
Family 1 was not particularly informative by itself. No individual LOD score exceeded 0.54, and nearly every chromosome had a LOD score of that value (Additional file 2: Table S3, Additional file 1: Figure S5). Thus, nothing of particular interest could be gleaned from this family.
The highest LOD score in family 2 was 1.397 at rs12616962, an intronic variant in KCNJ3 on chromosome 2 (Additional file 2: Table S4, Additional file 1: Figure S6). Chromosome 2 had multiple high LOD scores; in fact, the top eight LOD scores in this family were on chromosome 2. However, multiple other chromosomes (6, 13, 20 and 22) had LOD scores around 1.3. The most intriguing of these results was a peak on chromosome 6 from 382,507 to 650,645 bp (Fig. 2). This peak has almost no negative signal across the region: most of the strongly positive LOD scores on chromosome 6 in this family occur within it, and no LOD score in the region is more negative than −0.09 (positive and negative LOD scores very close to zero indicate marker loci with no information content for linkage). A long stretch of positive LOD scores with no negative LOD scores is a hallmark of true linkage. While there are other long stretches of positive LOD scores for this family, all of them contain many negative LOD scores. Hence, we cannot rule those regions out, but this region on chromosome 6 is a better candidate to contain a high-risk genetic variant for cancer susceptibility in this family.
Family 3 had the highest LOD scores of any of the individual families (Additional file 2: Table S5, Additional file 1: Figure S7). There were three main peaks. The highest peak was on chromosome 9, which had two SNPs with LOD scores approximately equal to 1.8: rs1925508, an intergenic SNP between IZUMO3 and TUSC1, and rs10812758, an intronic SNP in LINGO2. The second peak was on chromosome 11, which had two SNPs with LOD scores of approximately 1.7: rs11825543, an intronic variant in STIM1, and rs2071461, an exonic variant in CSNK2A3. The last peak was on chromosome 16, which had three SNPs with LOD scores of 1.59. All three SNPs were intergenic variants between MAF and MAFTRR. WES data were not available for any individuals in this family, so the candidate regions could not be interrogated further.
Much like family 1, family 4 was not informative on its own. There were over 2,000 SNPs with LOD scores between 0.70 and 0.76 across 18 autosomes (Additional file 2: Table S6, Additional file 1: Figure S8).
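The criterion applied to the chromosome 6 region in family 2, a long run of markers whose LOD scores never become appreciably negative, can be expressed as a simple scan along a chromosome. The sketch below is one possible way to do this and is not the procedure used in the study: the function name `positive_runs`, the tolerance `floor`, the minimum run length and the marker values are all hypothetical.

```python
from typing import List, Tuple

Marker = Tuple[int, float]          # (position in bp, two-point LOD score)
Run = Tuple[int, int, int, float]   # (start bp, end bp, marker count, max LOD)


def positive_runs(markers: List[Marker],
                  floor: float = -0.1,
                  min_markers: int = 3) -> List[Run]:
    """Find runs of consecutive markers whose LOD never drops below `floor`.

    `markers` must be sorted by position. Scores just below zero are
    tolerated because near-zero LODs carry no linkage information.
    """
    runs: List[Run] = []
    current: List[Marker] = []

    def close_run() -> None:
        # Record the current run if it is long enough, then reset it.
        if len(current) >= min_markers:
            runs.append((current[0][0], current[-1][0], len(current),
                         max(lod for _, lod in current)))
        current.clear()

    for pos, lod in markers:
        if lod >= floor:
            current.append((pos, lod))
        else:
            close_run()
    close_run()
    return runs


# Synthetic markers for illustration only (not study data).
demo_markers = [(100, 0.2), (200, -0.5), (300, 0.4), (400, 1.1),
                (500, -0.05), (600, 0.9), (700, -1.2), (800, 0.3)]
print(positive_runs(demo_markers))
```

On the synthetic input, the only qualifying run spans positions 300 to 600 and contains four markers; the slightly negative score at position 500 is tolerated, while the clearly negative scores at 200 and 700 break the runs around them.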
The present linkage study, performed across 4 Sri Lankan non-BRCA1/2 families ascertained due to a family history of breast cancer, resulted in suggestive evidence for linkage to cancer risk at candidate regions on chromosomes 2, 6, 13 and 22. These results suggest the presence of several putative loci for risk of breast or other cancer. They also suggest that heterogeneity among families could mask linkage signals, especially when the number of families is small. Importantly, two of the families had only breast cancer patients, while the two most informative families for linkage had multiple family members also affected with other cancers. Thus, it is not surprising that several different candidate regions were identified in this analysis of all-cancer susceptibility.
Looking at the regions with high HLODs across the four families, we find variants in four genes that are particularly intriguing for cancer risk: ERAP1, RPRM, WWOX and CDH1. Low expression of the endoplasmic reticulum aminopeptidase 1 (ERAP1) gene has been associated with poor clinical outcomes in patients affected with triple-negative breast carcinoma. The Reprimo gene (RPRM) is a potential p53-dependent tumor suppressor gene. The RPRM gene has been found to be frequently hypermethylated in several human cancers. Loss of heterozygosity, homozygous deletions and chromosomal translocations affecting the WW domain containing oxidoreductase (WWOX) gene have been reported mainly in breast cancer but also in ovarian, esophageal, lung and stomach carcinoma and in multiple myeloma. An intragenic variant in the WWOX gene, GSA-rs78740081, with an HLOD score of 2.0409, was identified as a genome-wide suggestively linked variant in this cohort. The region on chromosome 16 where the E-cadherin (CDH1) gene is located is frequently associated with loss of heterozygosity and loss of tumour suppressor function in several cancers, including gastric, colorectal, breast and ovarian cancers. We found genome-wide suggestive linkage to an intronic variant at the 688,224.8 bp position on the chromosome where the CDH1 gene resides. Family 2 showed top linked variants in multiple genes that are interesting candidates for cancer susceptibility, including KCNJ3 and EXOC2.
Several linkage analysis studies in non-BRCA1/2 families have been conducted in other populations [20,21,22,23,24,25]. However, the fact that these findings do not replicate in other populations is not surprising given the uniqueness of this Sri Lankan data set.
There are several strengths and weaknesses of this study. In a heterogeneous disease like cancer, it would be unsurprising to find novel candidate genes and variants in different populations and in different families within a population. Family-based linkage studies such as this are able to utilize the long, linked haplotypes shared by closely related affected individuals, allowing for identification of linked chromosomal regions that may harbor causal variants that might not have been genotyped in this study. Variants that are rare in the general population may also be enriched in individual families ascertained for a strong family history of cancer, particularly early-onset cancers.
In conclusion, this study provides the first step in identifying germline causal variants that may be involved in risk of cancer in cancer-aggregated families from the understudied Sri Lankan population. Several candidate chromosomal regions showed suggestive evidence of linkage to cancer risk.
There are some limitations in this study as well. First, we used a microarray chip, meaning that large numbers of variants were not genotyped. Thus, it is possible that we have not identified a causal variant in this study but, more likely, a variant or variants linked to the causal variant. Even in family 2, where we were able to identify candidate exonic variants in the index case that were within the linked regions, it is possible that a variant that is in this linkage region but not covered by WES is the causal variant. Targeted sequencing of the candidate regions will be needed to elucidate the true causal variants. It is of course also possible that all of these linkage results are false positives because of the relatively small number of biospecimens available on affected family members. The number of patients and family members who undergo WES testing is very low in Sri Lanka due to the high cost of the diagnostic testing. Of the very few patients in Sri Lanka who underwent NGS testing, many were found to have at least one SNV in high-, moderate- or low-risk cancer predisposing genes; hence, we had to exclude them from our study. Only the breast cancer probands who visited our clinics for genetic screening and had negative results in the testing of known cancer syndrome genes were invited to join this study, and SNP genotyping was done for the family members whom we recruited. We plan to address these issues in future studies. Power can be improved by attempting to enroll more relatives within these four families and by adding more families to the study. We also plan additional studies using multipoint linkage techniques and plan to seek funding for additional genotyping and sequencing of informative family members. A detailed analysis of the phenotypic and clinical characteristics of this cohort in relation to the genotypic results is the subject of a future study.
Availability of data and materials
The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.
HLOD: Heterogeneity logarithm of odds
LOD: Logarithm of odds
CPGs: Cancer predisposing genes
SNP: Single nucleotide polymorphism
WES: Whole exome sequencing
MLPA: Multiplex ligation-dependent probe amplification
GSA: Global screening array
Jasperson KW, Tuohy TM, Neklason DW, Burt RW. Hereditary and familial colon cancer. Gastroenterology. 2010;138(6):2044–58. https://doi.org/10.1053/j.gastro.2010.01.054.
Azamjah N, Soltan-Zadeh Y, Zayeri F. Global trend of breast cancer mortality rate: a 25-year study. Asian Pac J Cancer Prev. 2019;20(7):2015–20. https://doi.org/10.31557/APJCP.2019.20.7.2015.
Fanale D, Incorvaia L, Filorizzo C, Bono M, Fiorino A, Calò V, Brando C, Corsini LR, Barraco N, Badalamenti G, Russo A, Bazan V. Detection of germline mutations in a cohort of 139 patients with bilateral breast cancer by multi-gene panel testing: impact of pathogenic variants in other genes beyond BRCA1/2. Cancers. 2020;12(9):2415. https://doi.org/10.3390/cancers12092415.
Salmi F, Maachi F, Tazzite A, Aboutaib R, Fekkak J, Azeddoug H, et al. Next-generation sequencing of BRCA1 and BRCA2 genes in Moroccan prostate cancer patients with positive family history. PLoS ONE. 2021;16(7): e0254101. https://doi.org/10.1371/journal.pone.0254101.
Skol AD, Sasaki MM, Onel K. The genetics of breast cancer risk in the post-genome era: thoughts on study design to move past BRCA and towards clinical relevance. Breast Cancer Res. 2016;18(1):99. https://doi.org/10.1186/s13058-016-0759-4.
Wang X, Pankratz VS, Fredericksen Z, Tarrell R, Karaus M, McGuffog L, et al. Common variants associated with breast cancer in genome-wide association studies are modifiers of breast cancer risk in BRCA1 and BRCA2 mutation carriers. Hum Mol Genet. 2010;19(14):2886–97. https://doi.org/10.1093/hmg/ddq174.
Shiovitz S, Korde LA. Genetics of breast cancer: a topic in evolution. Ann Oncol. 2015;26(7):1291–9. https://doi.org/10.1093/annonc/mdv022.
Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MA, Bender D, et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet. 2007;81(3):559–75. https://doi.org/10.1086/519795.
Abecasis GR, Cherny SS, Cookson WO, Cardon LR. Merlin–rapid analysis of dense genetic maps using sparse gene flow trees. Nat Genet. 2002;30(1):97–101. https://doi.org/10.1038/ng786.
Yang H, Wang K. Genomic variant annotation and prioritization with ANNOVAR and wANNOVAR. Nat Protoc. 2015;10(10):1556–66. https://doi.org/10.1038/nprot.2015.105.
Chang X, Wang K. wANNOVAR: annotating genetic variants for personal genomes via the web. J Med Genet. 2012;49(7):433–6. https://doi.org/10.1136/jmedgenet-2012-100918.
Pedersen MH, Hood BL, Beck HC, Conrads TP, Ditzel HJ, Leth-Larsen R. Downregulation of antigen presentation-associated pathway proteins is linked to poor outcome in triple-negative breast cancer patient tumors. Oncoimmunology. 2017;6(5): e1305531. https://doi.org/10.1080/2162402X.2017.1305531.
Nishida N, Nagasaka T, Nishimura T, Ikai I, Boland CR, Goel A. Aberrant methylation of multiple tumor suppressor genes in aging liver, chronic hepatitis, and hepatocellular carcinoma. Hepatology. 2008;47(3):908–18. https://doi.org/10.1002/hep.22110.
Carolina B, Francisco A, Cynthia V, Macarena V, Ignacio D, Francisco J, et al. Reprimo as a potential biomarker for early detection in gastric cancer. Clin Cancer Res. 2008;14(19):6264–9. https://doi.org/10.1158/1078-0432.CCR-07-4522.
Ekizoglu S, Muslumanoglu M, Dalay N, Buyru N. Genetic alterations of the WWOX gene in breast cancer. Med Oncol. 2012;29(3):1529–35. https://doi.org/10.1007/s12032-011-0080-0.
Yakirevich E, Resnick MB. Pathology of gastric cancer and its precursor lesions. Gastroenterol Clin North Am. 2013;42(2):261–84. https://doi.org/10.1016/j.gtc.2013.01.004.
Chen X, Wang Y, Xia H, Wang Q, Jiang X, Lin Z, et al. Loss of E-cadherin promotes the growth, invasion and drug resistance of colorectal cancer cells and is associated with liver metastasis. Mol Biol Rep. 2012;39(6):6707–14. https://doi.org/10.1007/s11033-012-1494-2.
Caldeira JR, Prando EC, Quevedo FC, Neto FA, Rainho CA, Rogatto SR. CDH1 promoter hypermethylation and E-cadherin protein expression in infiltrating breast cancer. BMC Cancer. 2006;2(6):48. https://doi.org/10.1186/1471-2407-6-48.
Lu YJ, et al. The expressions of GM II, E-cadherin and α-catenin in ovarian epithelial tumor tissues and ovarian cancer cells with different abilities of metastasis. Tumor. 2012;32(11):899–906.
Fierheller C, Alenezi W, Tonin P. The genetic analyses of French Canadians of Quebec facilitate the characterization of new cancer predisposing genes implicated in hereditary breast and/or ovarian cancer syndrome families. Cancers. 2021;13(14):3406.
Arason A, Gunnarsson H, Johannesdottir G, Jonasson K, Bendahl P, Gillanders E, et al. Genome-wide search for breast cancer linkage in large Icelandic non-BRCA1/2 families. Breast Cancer Res. 2010. https://doi.org/10.1186/bcr2608.
Bergman A, Karlsson P, Berggren J, Martinsson T, Björck K, Nilsson S, et al. Genome-wide linkage scan for breast cancer susceptibility loci in Swedish hereditary non-BRCA1/2 families: suggestive linkage to 10q23.32-q25.3. Genes Chromosom Cancer. 2006;46(3):302–9.
Oldenburg R, Kroeze-Jansema K, Houwing-Duistermaat J, Bayley J, Dambrot C, van Asperen C, et al. Genome-wide linkage scan in Dutch hereditary non-BRCA1/2 breast cancer families identifies 9q21-22 as a putative breast cancer susceptibility locus. Genes Chromosom Cancer. 2008;47(11):947–56.
Rosa-Rosa J, Pita G, Urioste M, Llort G, Brunet J, Lázaro C, et al. Genome-wide linkage scan reveals three putative breast-cancer-susceptibility loci. Am J Hum Genet. 2009;84(2):115–22.
Aloraifi F, Boland M, Green A, Geraghty J. Gene analysis techniques and susceptibility gene discovery in non-BRCA1/BRCA2 familial breast cancer. Surg Oncol. 2015;24(2):100–9.
The authors thank the participating families and the physicians who referred them to our study. We would like to thank Ms. Gayani Anandagoda, research scientist at the Human Genetics Unit, Faculty of Medicine, University of Colombo, Sri Lanka, who helped PW to implement and run the Linux platform on the lab computer. We would also like to thank the Australian Genomic Research Facility (Melbourne, Australia), where SNP genotyping of study participants was performed.
This study was funded by a PhD project grant awarded to PW by the University Grants Commission, Sri Lanka (UGC/VC/DRIC/PG2017 (II)/RUH/02). The funding body did not play any role in the study design, collection, analysis, and interpretation of data and in writing the manuscript. JEBW and AMM were supported by the Intramural Research Program of the National Human Genome Research Institute, National Institutes of Health, USA.
Ethics approval and consent to participate
Written, informed consent from all study participants was obtained. Ethical clearance to conduct this study was obtained from the Ethics Review Committee, Faculty of Medicine, University of Colombo, Sri Lanka [EC-17-136]. All those undergoing genetic testing were offered comprehensive pre- and post-test counseling and written informed consent was obtained prior to testing. Clinical data including gender, age, personal and family cancer histories were obtained.
Consent for publication
Competing interests
The authors declare that they have no competing interests.
Additional file 1: Figure S1. Family 1. Figure S2. Family 2. Figure S3. Family 3. Figure S4. Family 4. Figure S5. Plot of two-point LOD scores for Family 1. Figure S6. Plot of two-point LOD scores for Family 2. Figure S7. Plot of two-point LOD scores for Family 3. Figure S8. Plot of two-point LOD scores for Family 4.
Additional file 2: Table S1. Details of the cancer in affected relatives of the breast cancer index patient in each family. Table S2. The health status and the number of family members who provided biospecimens for genotyping. Table S3. A description of top LOD scores of family 1. Table S4. A description of top LOD scores of family 2. Table S5. A description of top LOD scores of family 3. Table S6. A description of top LOD scores of family 4.
Cite this article
Wijesiriwardhana, P., Musolf, A.M., Bailey-Wilson, J.E. et al. Genome-wide linkage search for cancer susceptibility loci in a cohort of non BRCA1/2 families in Sri Lanka. BMC Res Notes 15, 190 (2022). https://doi.org/10.1186/s13104-022-06081-5