From: SigWin-detector: a Grid-enabled workflow for discovering enriched windows of genomic features related to DNA sequences

Using an mmFDR method to detect RIDGEs in a human transcriptome map. Schematic representation of the moving median false discovery rate (mmFDR) procedure identifying regions of high and low density of gene expression (RIDGEs and anti-RIDGEs, respectively) [4]. (A) Input sequence, a human transcriptome map (HTM), i.e., expression values of genes ordered by their chromosome location (cyan; chromosome 6). (B) mm(w), moving medians of the HTM for a given window size S. (C) Determination of the high and low mmFDR thresholds at a given level α: the high threshold $m_k$ is the smallest gene expression value for which $\sum_{m \geq m_k} f(m) / \sum_{m \geq m_k} g(m) \leq \alpha$. Here, f(m) is the theoretical probability distribution of mm(w), and g(m) is the observed distribution of mm(w). (In [4], f(m) is estimated by simple sampling.) Similarly, the low threshold $m_j$ is the largest gene expression value for which $\sum_{m \leq m_j} f(m) / \sum_{m \leq m_j} g(m) \leq \alpha$.
(D) Selection of significant windows in chromosome 6: RIDGEs (in red) are all windows for which the median gene expression is higher than or equal to $m_k$; anti-RIDGEs (in blue) are all windows for which the median gene expression is lower than or equal to $m_j$. (E) Output RIDGEOGRAM of chromosome 6. Each row (y-axis) in the RIDGEOGRAM represents a window size, ranging from S = 3 to S = M (the number of genes on the chromosome). Each column (x-axis) represents a sliding window number, ranging from w = S/2 to w = M - S/2 (hence the triangular form). Color is used to mark window medians significantly above (red) or below (blue) the genome-wide median. The scheme shows median expression data for window size S = 69 and FDR threshold level α = 5%.
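
Steps (B) to (D) can be illustrated with a minimal Python sketch. It is not the SigWin-detector implementation: all function names are hypothetical, the null distribution f(m) is approximated here by pooling moving medians of randomly shuffled copies of the input (an assumption standing in for the sampling used in [4]), and candidate thresholds are restricted to observed moving-median values.

```python
import random
from statistics import median


def moving_medians(values, size):
    """Step (B): median of every window of `size` consecutive values."""
    return [median(values[i:i + size]) for i in range(len(values) - size + 1)]


def null_medians(values, size, n_perm=20, seed=0):
    """ASSUMPTION: approximate f(m) by pooling the moving medians of
    `n_perm` shuffled copies of the input sequence."""
    rng = random.Random(seed)
    pooled = []
    for _ in range(n_perm):
        shuffled = list(values)
        rng.shuffle(shuffled)
        pooled.extend(moving_medians(shuffled, size))
    return pooled


def tail_ratio(observed, null, cut, upper):
    """Null tail mass divided by observed tail mass beyond `cut`."""
    if upper:
        f = sum(m >= cut for m in null) / len(null)
        g = sum(m >= cut for m in observed) / len(observed)
    else:
        f = sum(m <= cut for m in null) / len(null)
        g = sum(m <= cut for m in observed) / len(observed)
    # An empty observed tail gives no evidence; treat it as never significant.
    return f / g if g > 0 else float("inf")


def mmfdr_thresholds(observed, null, alpha):
    """Step (C): return (m_j, m_k); either is None if no value qualifies."""
    candidates = sorted(set(observed))
    # m_k: smallest candidate whose upper-tail ratio is at most alpha.
    high = next((c for c in candidates
                 if tail_ratio(observed, null, c, True) <= alpha), None)
    # m_j: largest candidate whose lower-tail ratio is at most alpha.
    low = next((c for c in reversed(candidates)
                if tail_ratio(observed, null, c, False) <= alpha), None)
    return low, high


def select_windows(observed, low, high):
    """Step (D): window indices of RIDGEs (>= m_k) and anti-RIDGEs (<= m_j)."""
    ridges = [i for i, m in enumerate(observed) if high is not None and m >= high]
    anti = [i for i, m in enumerate(observed) if low is not None and m <= low]
    return ridges, anti
```

For one window size, the calls would chain as `obs = moving_medians(values, 69)`, `null = null_medians(values, 69)`, `low, high = mmfdr_thresholds(obs, null, 0.05)`, and `select_windows(obs, low, high)`. Repeating this for every window size from S = 3 to S = M would give the rows of a RIDGEOGRAM-like matrix.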

