
Table 3 Homo sapiens chr 14 assembly evaluation

From: De novo likelihood-based measures for comparing genome assemblies

 

| Assembler | Contigs: LAP reads | Contigs: LAP mates | Contigs: N50 (kb) | Contigs: CN50 (kb) | Scaffolds: LAP reads | Scaffolds: LAP mates | Scaffolds: N50 (kb) | Scaffolds: CN50 (kb) | CGAL score | Unaligned reads (fraction) | Unaligned mates (fraction) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| ABySS | -18.473 | -23.801 | 2 | 2 | -18.474 | -23.787 | 2.1 | 2 | -15.21 × 10^8 | 0.257 | 0.504 |
| Allpaths-LG | -15.813 | -21.413 | 36.5 | 21 | -15.824 | -21.314 | 81,647 | 4,702 | -13.11 × 10^8 | 0.115 | 0.239 |
| Bambus2 | -18.606 | -23.474 | 5.9 | 4.3 | -18.642 | -23.343 | 324 | 161 | - | 0.258 | 0.422 |
| CABOG | -15.625 | -21.128 | 45.3 | 23.7 | -15.626 | -21.041 | 393 | 26 | -12.25 × 10^8 | 0.109 | 0.229 |
| MSR-CA | -16.421 | -22.428 | 4.9 | 4.3 | -16.436 | -21.861 | 893 | 94 | - | 0.122 | 0.276 |
| SGA | -15.712 | -22.990 | 2.7 | 2.7 | -16.909 | -22.326 | 83 | 79 | - | 0.134 | 0.328 |
| SOAPdenovo | -15.702 | -21.705 | 14.7 | 7.4 | -15.734 | -21.594 | 455 | 214 | * | 0.101 | 0.269 |
| Velvet | -18.000 | -23.468 | 2.3 | 2.1 | -18.140 | -23.375 | 1,190 | 27 | - | 0.214 | 0.442 |
| truth | -15.466 | -21.001 | 107,349.50 | 107,349.50 | -15.466 | -21.002 | 107,349.50 | 107,349.50 | -11.25 × 10^8 | 0.093 | 0.211 |

  1. Assembly likelihood scores for human chromosome 14 from the GAGE project [15], computed using a 10,000-read sample. The results are presented separately for the contigs and scaffolds and include the number of unassembled reads (singletons), the LAP scores computed on unmated reads (LAP reads) or mate-pairs (LAP mates), the N50 contig/scaffold sizes (N50), and the reference-corrected N50 contig/scaffold sizes (CN50). The best (maximum) value for each genome-measure combination is highlighted in bold. The results for the reference assembly (either complete genome or high-quality draft) are given in the row marked truth. In addition, we provide the results for a closely related strain and species. CGAL scores calculated from the long insert library were taken from the CGAL publication. The authors only provided scores for the top three assemblies (Bowtie2 could not successfully map reads to the SOAPdenovo assembly). All values, except the LAP and CGAL scores, were taken from the GAGE publication. A threshold probability of 1e-30 was used for calculating the LAP scores. The standard deviation for both the LAP reads and LAP mates scores is 0.15.
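
To make two of the measures in the caption concrete, here is a minimal Python sketch. It rests on assumptions that this table page does not state: that a LAP score is the mean log10 probability of the sampled reads given the assembly, and that the 1e-30 threshold acts as a floor on each read's probability. The function names `lap_score` and `n50` and the sample inputs are hypothetical, not taken from the authors' software. The `n50` helper uses the common definition (the length at which the longest sequences, taken in descending order, first cover half of the total), and the optional `total` argument is an illustrative way to normalise by a different size, such as a reference length.

```python
import math
from typing import Iterable, Optional, Sequence


def lap_score(read_probs: Iterable[float], threshold: float = 1e-30) -> float:
    """Mean log10 read probability, with each probability floored at `threshold`.

    Assumption: this mirrors the averaging described for LAP; the floor keeps
    reads with zero (or tiny) probability from driving the score to -infinity.
    """
    logs = [math.log10(max(p, threshold)) for p in read_probs]
    if not logs:
        raise ValueError("need at least one read probability")
    return sum(logs) / len(logs)


def n50(lengths: Sequence[int], total: Optional[int] = None) -> int:
    """Length L such that sequences of length >= L cover at least half of `total`.

    `total` defaults to the summed sequence length; passing another value
    (hypothetically, a reference size) changes the normalisation.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    target = (sum(lengths) if total is None else total) / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= target:
            return length
    return 0  # the sequences cover less than half of `total`


# Hypothetical inputs, for illustration only.
example_lap = lap_score([1e-10, 1e-20, 0.0])  # the 0.0 is floored to 1e-30
example_n50 = n50([2, 3, 4, 5, 6])            # 6 + 5 = 11 >= 10, so N50 is 5
```

Because every read contributes at least log10(1e-30) = -30 under this flooring, a single unalignable read lowers the average without making it undefined, which is one plausible reason for choosing such a threshold.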