Question: Error value for safely scaffolding contigs with SSPACE
deepti1rao30 wrote (18 months ago):

[Figure: insert size histogram]

Here is the insert size histogram, plotted with Picard, along with the corresponding metrics.

METRICS CLASS picard.analysis.InsertSizeMetrics

    MEDIAN_INSERT_SIZE          279
    MODE_INSERT_SIZE            273
    MEDIAN_ABSOLUTE_DEVIATION   36
    MIN_INSERT_SIZE             20
    MAX_INSERT_SIZE             41943682
    MEAN_INSERT_SIZE            283.265191
    STANDARD_DEVIATION          56.023413
    READ_PAIRS                  197649736
    PAIR_ORIENTATION            FR
    WIDTH_OF_10_PERCENT         15
    WIDTH_OF_20_PERCENT         27
    WIDTH_OF_30_PERCENT         41
    WIDTH_OF_40_PERCENT         57
    WIDTH_OF_50_PERCENT         73
    WIDTH_OF_60_PERCENT         91
    WIDTH_OF_70_PERCENT         113
    WIDTH_OF_80_PERCENT         139
    WIDTH_OF_90_PERCENT         183
    WIDTH_OF_95_PERCENT         223
    WIDTH_OF_99_PERCENT         329
    SAMPLE, LIBRARY, READ_GROUP (no values reported)

In addition, Velvet estimated an insert size of 273 with a standard deviation of 53 during contig assembly.

I used the median insert size (279) and an error of 0.5 to scaffold the contigs with SSPACE.
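For reference, the SSPACE summary further down reports "distance on target: 273 +/-136.5", which is consistent with the error being a fraction of the insert size, so that the accepted window is insert size × (1 ± error). A minimal sketch of that arithmetic, assuming this interpretation is right (the helper name `distance_window` is made up for illustration and is not part of SSPACE):

```python
def distance_window(insert_size, error):
    """Return (low, high) bounds for an accepted pair distance.

    Assumption: the window is insert_size * (1 +/- error), which matches the
    "273 +/-136.5" shown in the SSPACE summary for an error of 0.5.
    """
    tolerance = insert_size * error
    return insert_size - tolerance, insert_size + tolerance


# Insert sizes from the post: 273 (shown in the SSPACE log) and 279 (Picard median).
# The error values other than 0.5 are arbitrary examples, not recommendations.
for insert_size in (273, 279):
    for error in (0.5, 0.6, 0.75):
        low, high = distance_window(insert_size, error)
        print(f"insert={insert_size} error={error}: {low:.1f} to {high:.1f} bp")
```

Each step up in the error value widens the window symmetrically on both sides of the insert size.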


Total inserted pairs = 220823893

LIBRARY reads STATS:

MAPPING READS TO CONTIGS:

Number of single reads found on contigs = 61676131

Number of read-pairs used for pairing contigs / total pairs = 23693047 / 23853682

READ PAIRS STATS:
    Assembled pairs: 23693047 (47386094 sequences)
        Satisfied in distance/logic within contigs (i.e. -> <-, distance on target: 273 +/-136.5): 9029076
        Unsatisfied in distance within contigs (i.e. distance out-of-bounds): 129266
        Unsatisfied pairing logic within contigs (i.e. illogical pairing ->->, <-<- or <-->): 35528
        ---
    Satisfied in distance/logic within a given contig pair (pre-scaffold): 1342274
    Unsatisfied in distance within a given contig pair (i.e. calculated distances out-of-bounds): 13156903
    ---
    Total satisfied: 10371350 unsatisfied: 13321697

Estimated insert size statistics (based on 9029076 pairs): 
    Mean insert size = 261
    Median insert size = 260
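
The read-pair counts above can be cross-checked with a little arithmetic. The sketch below copies the numbers from the summary (the variable names are illustrative only), confirms that the five categories add up to the assembled pairs, and computes what fraction of the pairs spanning two contigs fell inside the distance bounds. Those between-contig pairs are the group reported as usable "pre-scaffold".

```python
# Counts copied from the SSPACE READ PAIRS STATS block in the post.
assembled_pairs = 23_693_047
within_ok = 9_029_076             # satisfied, both reads on the same contig
within_bad_distance = 129_266     # same contig, distance out of bounds
within_bad_logic = 35_528         # same contig, illogical orientation
between_ok = 1_342_274            # satisfied across a contig pair (pre-scaffold)
between_bad_distance = 13_156_903 # across a contig pair, distance out of bounds

# The five categories should account for every assembled pair.
category_sum = (within_ok + within_bad_distance + within_bad_logic
                + between_ok + between_bad_distance)
print("categories sum to assembled pairs:", category_sum == assembled_pairs)

# Share of pairs that fall within a single contig and satisfy the window.
within_fraction = within_ok / assembled_pairs
print(f"satisfied within contigs: {within_fraction:.1%} of assembled pairs")

# Share of contig-spanning pairs that satisfy the window.
between_total = between_ok + between_bad_distance
between_fraction = between_ok / between_total
print(f"satisfied between contigs: {between_fraction:.1%} of {between_total} pairs")
```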

Inserted contig file;
    Total number of contigs = 57206
    Sum (bp) = 348942280
        Total number of N's = 0
        Sum (bp) no N's = 348942280
    GC Content = 42.69%
    Max contig size = 123790
    Min contig size = 500
    Average contig size = 6099
    N25 = 28352
    N50 = 15957
    N75 = 7400

**After scaffolding:**
    Total number of scaffolds = 39590
    Sum (bp) = 348541909
        Total number of N's = 31358
        Sum (bp) no N's = 348510551
    GC Content = 42.92%
    Max scaffold size = 244223
    Min scaffold size = 500
    Average scaffold size = 8803
    N25 = 55732
    N50 = 30443
    N75 = 13267

My question is: can I increase the error value to include more read pairs in scaffolding? As I understand it, only 9029076 of the 220823893 pairs are being used. How far can I raise the error value without causing misassemblies?
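
One way to frame the question is to ask how much of the Picard distribution a given error value already covers. Picard documents WIDTH_OF_N_PERCENT as the width of the bin, centered on the median, that contains N% of read pairs, so half of that width divided by the insert size gives a rough equivalent error value. This is only a sketch under two assumptions: that SSPACE's window is insert size × (1 ± error), and that the window is centered on the Picard median (279). It says nothing about where misassemblies begin. The function name `error_for_width` is hypothetical.

```python
# WIDTH_OF_N_PERCENT values copied from the Picard metrics in the post.
widths = {10: 15, 20: 27, 30: 41, 40: 57, 50: 73, 60: 91,
          70: 113, 80: 139, 90: 183, 95: 223, 99: 329}
median_insert = 279  # MEDIAN_INSERT_SIZE from Picard


def error_for_width(width, insert_size):
    """Error fraction whose window (insert_size +/- insert_size * error) spans `width` bp.

    Assumption: the window is symmetric around insert_size.
    """
    return (width / 2) / insert_size


for pct, width in widths.items():
    err = error_for_width(width, median_insert)
    print(f"{pct:>2}% of pairs within +/-{width / 2:.1f} bp -> error ~ {err:.2f}")
```

With these numbers, an error of 0.5 around 279 gives a window 279 bp wide, which lies between the reported 95% and 99% widths (223 and 329 bp).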
