Question: About 10x Chromium genome assembly
r00628112 wrote (4 months ago):

Hello everyone:

I followed the tips on the 10x Chromium website and ran the Supernova program, but the assembly results are not very good. I checked the FASTA file, and the lengths of the top 10 contigs are only about 10% of the corresponding lengths in the published genome. I also tried another assembler (SOAPdenovo2), but the assembly quality did not improve. Has anyone had a similar experience? How could I improve the assembly quality? Thank you so much.
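For anyone wanting to reproduce the contig-length comparison described above, here is a minimal sketch of how the top contig lengths and N50 could be computed from a FASTA file. The function names (`read_fasta_lengths`, `top_lengths`, `n50`) are hypothetical helpers written for illustration, not part of Supernova or SOAPdenovo2, and the demo uses a tiny in-memory FASTA rather than real output. Note that the lengths counted this way include any `N` gap characters present in scaffold sequences.

```python
import io


def read_fasta_lengths(handle):
    """Return {record_name: sequence_length} for an open FASTA handle."""
    lengths = {}
    name = None
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the name.
            parts = line[1:].split()
            name = parts[0] if parts else ""
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


def top_lengths(lengths, n=10):
    """Return the n largest sequence lengths, longest first."""
    return sorted(lengths.values(), reverse=True)[:n]


def n50(lengths):
    """Length L such that sequences of length >= L hold at least half of all bases."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    return 0


if __name__ == "__main__":
    # Tiny made-up FASTA used only to demonstrate the helpers.
    demo = io.StringIO(">a\nACGTACGT\nACGT\n>b\nACGT\n>c\nAC\n")
    lens = read_fasta_lengths(demo)
    print("top lengths:", top_lengths(lens, n=2))
    print("N50:", n50(lens.values()))
```

To use it on a real assembly, replace the `io.StringIO` demo with `open("assembly.fasta")`, where the file name is a placeholder, and run the same helpers on the published genome for a side-by-side comparison.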

The report information is below:

INPUT
  • 105.97 M = READS = number of reads; ideal 800M-1200M for human
  • 138.50 b = MEAN READ LEN = mean read length after trimming; ideal 140
  • 61.71 x = RAW COV = raw coverage; ideal ~56
  • 52.13 x = EFFECTIVE COV = effective read coverage; ideal ~42 for raw 56x
  • 89.35 % = READ TWO Q30 = fraction of Q30 bases in read 2; ideal 75-85
  • 344.00 b = MEDIAN INSERT = median insert size; ideal 350-400
  • 90.49 % = PROPER PAIRS = fraction of proper read pairs; ideal >= 75
  • 1.00 = BARCODE FRACTION = fraction of barcodes used; between 0 and 1
  • 257.60 Mb = EST GENOME SIZE = estimated genome size
  • 14.70 % = REPETITIVE FRAC = genome repetitivity index
  • 0.15 % = HIGH AT FRACTION = high AT index
  • 37.45 % = ASSEMBLY GC CONTENT = GC content of assembly
  • 0.19 % = DINUCLEOTIDE FRACTION = dinucleotide content
  • 30.25 Kb = MOLECULE LEN = weighted mean molecule size; ideal 50-100
  • 167.12 = P10 = molecule count extending 10 kb on both sides
  • 384.00 b = HETDIST = mean distance between heterozygous SNPs
  • 4.38 % = UNBAR = fraction of reads that are not barcoded
  • 78.00 = BARCODE N50 = N50 reads per barcode
  • 5.26 % = DUPS = fraction of reads that are duplicates
  • 60.76 % = PHASED = nonduplicate and phased reads; ideal 45-50

OUTPUT
  • 1.79 K = LONG SCAFFOLDS = number of scaffolds >= 10 kb
  • 10.50 Kb = EDGE N50 = N50 edge size
  • 67.10 Kb = CONTIG N50 = N50 contig size
  • 223.04 Kb = PHASEBLOCK N50 = N50 phase block size
  • 238.07 Kb = SCAFFOLD N50 = N50 scaffold size
  • 17.26 % = MISSING 10KB = % of base assembly missing from scaffolds >= 10 kb
  • 174.75 Mb = ASSEMBLY SIZE = assembly size (only scaffolds >= 10 kb)
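As a rough way to read the numbers above, the following sketch compares the metrics that come with an explicit "ideal" range in the report against that range, and computes how much of the estimated genome size the >= 10 kb scaffolds cover. The values and ranges are copied from the report; the helper names (`classify`, `out_of_range`) are hypothetical. READS is left out because its stated ideal applies to human genomes. Falling outside a range is not necessarily a problem (a higher Q30 fraction, for example, is unlikely to hurt), so this is only a quick screening aid.

```python
def classify(value, low=None, high=None):
    """Return 'below', 'above', or 'within' relative to an optional [low, high] range."""
    if low is not None and value < low:
        return "below"
    if high is not None and value > high:
        return "above"
    return "within"


# (observed value, ideal low, ideal high), copied from the report's "ideal" notes.
# None means the report gives no bound on that side.
IDEAL_RANGES = {
    "READ TWO Q30 (%)": (89.35, 75, 85),
    "MEDIAN INSERT (b)": (344.00, 350, 400),
    "PROPER PAIRS (%)": (90.49, 75, None),
    "MOLECULE LEN (Kb)": (30.25, 50, 100),
    "PHASED (%)": (60.76, 45, 50),
}


def out_of_range(metrics):
    """Return {metric_name: 'below' or 'above'} for metrics outside their ideal range."""
    flagged = {}
    for name, (value, low, high) in metrics.items():
        status = classify(value, low, high)
        if status != "within":
            flagged[name] = status
    return flagged


# Sizes in Mb, from the report. The assembly size counts only scaffolds >= 10 kb.
ASSEMBLY_SIZE_MB = 174.75
EST_GENOME_SIZE_MB = 257.60
assembled_fraction = ASSEMBLY_SIZE_MB / EST_GENOME_SIZE_MB

if __name__ == "__main__":
    for name, status in out_of_range(IDEAL_RANGES).items():
        print(f"{name}: {status} the ideal range")
    print(f"assembled fraction of estimated genome: {assembled_fraction:.1%}")
```

On these numbers, the molecule length stands out as well below its ideal range, and the scaffolds of at least 10 kb account for roughly two thirds of the estimated genome size.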

ALARMS

- The length-weighted mean molecule length is 30252.76 bases. The molecule length estimation was successful, however, ideally we would expect a larger value. Standard methods starting from blood can yield 100 kb or larger DNA, but it can be difficult to obtain long DNA from other sample types. Short molecules may reduce the scaffold and phase block N50 length, and could result in misassemblies. We have observed assembly quality to improve with longer DNA.

Tags: longreads, genomeassembly