Figure 3 | BMC Research Notes

Figure 3

From: Integrating heterogeneous sequence information for transcriptome-wide microarray design; a Zebrafish example

Effect of the Similarity Threshold on Transcript Clustering. Shown is the result of mapping the sequences of Ensembl 57, Vega 37, RefSeq 39, and UniGene 117 onto each other using the procedure specified in the Description section and Figure 2. Similarity thresholds T and U were varied from 60 to 99 and were kept equal to each other in each mapping run. In total, 133,691 sequences were mapped. Depicted are the total number of clusters, the number of clusters with one member, the number of clusters with two or more members, and the number of subsequences. At T = U = 60, 45% of all TC sequences are clustered into clusters with two or more members; this percentage drops to 12% at T = U = 99. The decrease is due both to the higher stringency on the identity between the BLAST query and the BLAST target sequences (T) and to the higher limit at which a smaller BLAST query sequence is added to the TC of the larger BLAST target sequence (a higher U facilitates the calling of subsequences). At T = U = 95, a sharp rise in the number of clusters with only one member is observed. However, the increase in the number of single-member clusters is much larger than the increase in the number of subsequences.
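
The cluster counts plotted for each threshold can be derived from a table of cluster assignments. The following is a minimal sketch, not the authors' pipeline: the function name `summarize_clusters`, the input format (a dictionary from sequence ID to TC identifier), and the example identifiers are all assumptions made for illustration.

```python
from collections import Counter


def summarize_clusters(assignments):
    """Summarize cluster sizes for one mapping run.

    `assignments` is a hypothetical input: a dict mapping each
    sequence ID to the ID of the cluster (TC) it was placed in.
    """
    # Number of member sequences per cluster.
    sizes = Counter(assignments.values())

    single = sum(1 for n in sizes.values() if n == 1)
    multi = sum(1 for n in sizes.values() if n >= 2)
    # Sequences that ended up in clusters with two or more members.
    seqs_in_multi = sum(n for n in sizes.values() if n >= 2)
    total_seqs = len(assignments)

    return {
        "total_clusters": len(sizes),
        "single_member_clusters": single,
        "multi_member_clusters": multi,
        "fraction_in_multi": seqs_in_multi / total_seqs if total_seqs else 0.0,
    }


# Toy data (invented for illustration, not taken from the study).
example = {
    "seqA": "TC1",
    "seqB": "TC1",
    "seqC": "TC2",
    "seqD": "TC3",
    "seqE": "TC3",
}
summary = summarize_clusters(example)
print(summary)
```

Running such a summary once per value of T = U would give the kind of per-threshold counts and percentages described in the caption.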
