Question

About 10x Chromium genome assembly

3.8 years ago

r00628112 ▴ 10

Hello everyone:

I followed the tips on the 10x Chromium website and ran the Supernova program, but the assembly results are not very good. I checked the FASTA file, and the top 10 contigs are only about 10% of the length of the corresponding sequences in the published genome. I also tried another assembler (SOAPdenovo2), but the assembly quality did not improve. Has anyone had a similar experience? How can I improve the assembly quality? Thank you so much.
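For anyone who wants to repeat the contig length check on the FASTA, here is a minimal sketch of how the top contig lengths and N50 could be computed. The function names and the file name `assembly.fasta` are hypothetical placeholders, not Supernova output names.

```python
import gzip


def contig_lengths(fasta_path):
    """Return the length of every sequence in a FASTA file (plain or .gz)."""
    opener = gzip.open if fasta_path.endswith(".gz") else open
    lengths = []
    current = None  # length of the record being read; None before the first header
    with opener(fasta_path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if current is not None:
                    lengths.append(current)
                current = 0
            elif current is not None:
                current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths):
    """Length L such that sequences of length >= L hold at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # empty input


def top_n(lengths, n=10):
    """Return the n longest sequence lengths, longest first."""
    return sorted(lengths, reverse=True)[:n]


# Example usage (hypothetical file name):
# lengths = contig_lengths("assembly.fasta")
# print(top_n(lengths), n50(lengths))
```

Running the same functions on the published genome gives a like-for-like comparison of the top 10 lengths.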

The report information is as below:

INPUT
105.97 M = READS = number of reads; ideal 800M-1200M for human
138.50 b = MEAN READ LEN = mean read length after trimming; ideal 140
61.71 x = RAW COV = raw coverage; ideal ~56
52.13 x = EFFECTIVE COV = effective read coverage; ideal ~42 for raw 56x
89.35 % = READ TWO Q30 = fraction of Q30 bases in read 2; ideal 75-85
344.00 b = MEDIAN INSERT = median insert size; ideal 350-400
90.49 % = PROPER PAIRS = fraction of proper read pairs; ideal >= 75
1.00 = BARCODE FRACTION = fraction of barcodes used; between 0 and 1
257.60 Mb = EST GENOME SIZE = estimated genome size
14.70 % = REPETITIVE FRAC = genome repetitivity index
0.15 % = HIGH AT FRACTION = high AT index
37.45 % = ASSEMBLY GC CONTENT = GC content of assembly
0.19 % = DINUCLEOTIDE FRACTION = dinucleotide content
30.25 Kb = MOLECULE LEN = weighted mean molecule size; ideal 50-100
167.12 = P10 = molecule count extending 10 kb on both sides
384.00 b = HETDIST = mean distance between heterozygous SNPs
4.38 % = UNBAR = fraction of reads that are not barcoded
78.00 = BARCODE N50 = N50 reads per barcode
5.26 % = DUPS = fraction of reads that are duplicates
60.76 % = PHASED = nonduplicate and phased reads; ideal 45-50

OUTPUT
1.79 K = LONG SCAFFOLDS = number of scaffolds >= 10 kb
10.50 Kb = EDGE N50 = N50 edge size
67.10 Kb = CONTIG N50 = N50 contig size
223.04 Kb = PHASEBLOCK N50 = N50 phase block size
238.07 Kb = SCAFFOLD N50 = N50 scaffold size
17.26 % = MISSING 10KB = % of base assembly missing from scaffolds >= 10 kb
174.75 Mb = ASSEMBLY SIZE = assembly size (only scaffolds >= 10 kb)
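
Some of the reported values can be cross-checked against each other. The sketch below recomputes raw coverage as reads × read length / estimated genome size, and compares the assembly size with the estimated genome size. The 150 b untrimmed read length is an assumption (it is not stated in the report); under that assumption the result comes out at about 61.7x, consistent with the reported RAW COV. The variable and function names are illustrative only.

```python
# Values copied from the Supernova report above.
reads = 105.97e6            # READS
est_genome_size = 257.60e6  # EST GENOME SIZE, in bases
assembly_size = 174.75e6    # ASSEMBLY SIZE, in bases (scaffolds >= 10 kb only)

# ASSUMPTION: 150 b untrimmed read length; this is not stated in the report.
untrimmed_read_len = 150


def coverage(n_reads, read_len, genome_size):
    """Sequencing depth = total sequenced bases / genome size."""
    return n_reads * read_len / genome_size


raw_cov = coverage(reads, untrimmed_read_len, est_genome_size)
assembled_fraction = assembly_size / est_genome_size

print(f"raw coverage at 150 b reads: {raw_cov:.2f}x")
print(f"assembled fraction of estimated genome: {assembled_fraction:.1%}")
```

By this estimate, scaffolds of at least 10 kb cover roughly two thirds of the estimated genome size.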

ALARMS

- The length-weighted mean molecule length is 30252.76 bases. The molecule length estimation was successful, however, ideally we would expect a larger value. Standard methods starting from blood can yield 100 kb or larger DNA, but it can be difficult to obtain long DNA from other sample types. Short molecules may reduce the scaffold and phase block N50 length, and could result in misassemblies. We have observed assembly quality to improve with longer DNA.

Longreads Genomeassembly • 697 views
