Question: Scaffolding with SSPACE returns scaffolds identical to the initial contig set
2.3 years ago by
pbigbig190
United States
pbigbig190 wrote:

Hi all,

I used Minia to assemble a contig set from my paired-end reads. Since the Minia documentation says it does not use pairing information when building the assembly, I then tried SSPACE to exploit the pairing information from the SAME LIBRARY (the one used to build the contigs) for scaffolding. But after trying different SSPACE parameters (-k, -a, and the settings in lib.txt), it ALWAYS returns a scaffold set exactly the SAME as the initial contig set. Did I miss something? Even without the best parameters, I would expect scaffolds that differ at least somewhat from the initial contigs, but here they are exactly the same.
Any suggestions are very welcome. Thanks a lot!


Here are my inputs and the scaffolding summary:

My lib.txt: k71 bowtie 1.fastq 2.fastq 440 0.75 FR


Required inputs:
      -l = lib.txt
            Number of paired files = 1
      -s = k71contigs.fasta
      -b = k71origin

Optional inputs:
      -x = 0
      -z = 0
      -k = 10
      -g = 0
      -a = 0.7
      -n = 10
      -T = 16
      -p = 1



READING READS k71:
------------------------------------------------------------
      Total inserted pairs = 46314881
------------------------------------------------------------

LIBRARY k71 STATS:
################################################################################

MAPPING READS TO CONTIGS:
------------------------------------------------------------
      Number of single reads found on contigs = 6827738
      Number of read-pairs used for pairing contigs / total pairs = 657371 / 657371
------------------------------------------------------------

READ PAIRS STATS:
      Assembled pairs: 657371 (1314742 sequences)
            Satisfied in distance/logic within contigs (i.e. -> <-, distance on target: 440 +/-330): 645344
            Unsatisfied in distance within contigs (i.e. distance out-of-bounds): 9033
            Unsatisfied pairing logic within contigs (i.e. illogical pairing ->->, <-<- or <-->): 0
            ---
            Satisfied in distance/logic within a given contig pair (pre-scaffold): 2175
            Unsatisfied in distance within a given contig pair (i.e. calculated distances out-of-bounds): 819
            ---
      Total satisfied: 647519 unsatisfied: 9852

      Estimated insert size statistics (based on 645344 pairs):
            Mean insert size = 296
            Median insert size = 248

REPEATS:
      Number of repeated edges = 0
------------------------------------------------------------
################################################################################

SUMMARY:
------------------------------------------------------------
      Inserted contig file;
            Total number of contigs = 882414
            Sum (bp) = 762086901
                  Total number of N's = 0
                  Sum (bp) no N's = 762086901
            GC Content = 38.49%
            Max contig size = 55233
            Min contig size = 143
            Average contig size = 863
            N25 = 4488
            N50 = 2157
            N75 = 837

      After scaffolding k71:
            Total number of scaffolds = 882414
            Sum (bp) = 762086901
                  Total number of N's = 0
                  Sum (bp) no N's = 762086901
            GC Content = 38.49%
            Max scaffold size = 55233
            Min scaffold size = 143
            Average scaffold size = 863
            N25 = 4488
            N50 = 2157
            N75 = 837

------------------------------------------------------------
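For reference, the "440 +/-330" window printed in the log follows from the lib.txt line (insert size 440, error 0.75). Below is a minimal sketch of that arithmetic, assuming the error is applied as a fraction of the insert size; the function name `insert_window` is made up for illustration and is not part of SSPACE.

```python
def insert_window(insert_size, error):
    """Return the (min, max) pair distance accepted for a library.

    Assumes `error` is a fraction of `insert_size`, which matches the
    "440 +/-330" printed in the log for insert size 440 and error 0.75.
    """
    tolerance = round(insert_size * error)
    return insert_size - tolerance, insert_size + tolerance


# Values from the lib.txt line in the question.
low, high = insert_window(440, 0.75)
print(f"accepted pair distance: {low}..{high} bp")

# Insert sizes SSPACE estimated from pairs mapped within one contig.
for label, value in [("mean", 296), ("median", 248)]:
    status = "inside window" if low <= value <= high else "outside window"
    print(label, value, status)
```

Both estimates fall inside the window, although they sit well below the 440 bp given in lib.txt.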
minia sspace • 1.5k views
2.3 years ago by
Damian Kao14k
USA
Damian Kao14k wrote:

This line in particular in the scaffolding summary:

Satisfied in distance/logic within a given contig pair (pre-scaffold): 2175

It means that only 2175 read pairs were found to connect contig pairs. So out of the ~657k read pairs used, most mapped onto a single contig, and only 2175 mapped onto two different contigs and satisfied the distance criterion. That's probably not enough for SSPACE to establish any scaffolds, depending on your thresholds for establishing links.

Your Minia assembly was good in the sense that it was able to fill the gaps between most of your paired reads (645344/657371 pairs satisfied the distance within one contig). If you want longer scaffolds, you'll probably need mate pair libraries.
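To put those numbers side by side, here is a rough sketch using values copied from the log in the question. It assumes that -k is the minimum number of linking read pairs SSPACE requires before joining two contigs (the run used -k = 10); the constant and function names are illustrative only.

```python
# Values copied from the SSPACE summary in the question.
TOTAL_PAIRS_USED = 657_371    # read pairs used for pairing contigs
WITHIN_CONTIG_OK = 645_344    # satisfied within a single contig
BETWEEN_CONTIGS_OK = 2_175    # satisfied within a contig pair (pre-scaffold)
N_CONTIGS = 882_414
MIN_LINKS_K = 10              # -k from the run; assumed to mean "minimum links per join"


def fraction(part, whole):
    """Share of `whole` represented by `part`."""
    return part / whole


def max_possible_joins(links, min_links):
    """Upper bound on joins if links were spread in the most favourable way."""
    return links // min_links


print(f"pairs inside one contig:  {fraction(WITHIN_CONTIG_OK, TOTAL_PAIRS_USED):.2%}")
print(f"pairs linking two contigs: {fraction(BETWEEN_CONTIGS_OK, TOTAL_PAIRS_USED):.2%}")
print(f"best-case joins at -k {MIN_LINKS_K}: "
      f"{max_possible_joins(BETWEEN_CONTIGS_OK, MIN_LINKS_K)} among {N_CONTIGS} contigs")
```

This is only an upper bound: real links are spread over many different contig pairs, so far fewer pairs (possibly none) would reach the threshold.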

Edit:

It looks like you have ~6 million single reads in your FASTQ files according to the summary report. Did you rename your FASTQ headers? Maybe SSPACE is not recognizing your paired-end reads correctly because of the header names?
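Checking only the first and last records can miss a mismatch in the middle of the files. The sketch below walks two FASTQ streams in parallel and reports the first record whose read IDs disagree. It assumes plain four-line FASTQ records and the two common mate-naming styles (`name/1` and `name/2`, or `name 1:...` and `name 2:...`); the function names and the read names in the demo are made up for illustration.

```python
from itertools import zip_longest


def read_ids(lines):
    """Yield the read ID (first token after '@') of every 4-line FASTQ record."""
    for i, line in enumerate(lines):
        if i % 4 == 0 and line.strip():
            fields = line.strip()[1:].split()
            yield fields[0] if fields else ""


def strip_mate_suffix(read_id):
    """Drop a trailing /1 or /2 so that both mates compare equal."""
    if read_id.endswith(("/1", "/2")):
        return read_id[:-2]
    return read_id


def first_mismatch(lines1, lines2):
    """Return (record_index, id1, id2) for the first unpaired record, else None."""
    pairs = zip_longest(read_ids(lines1), read_ids(lines2))
    for n, (a, b) in enumerate(pairs):
        if a is None or b is None or strip_mate_suffix(a) != strip_mate_suffix(b):
            return n, a, b
    return None


# Tiny in-memory demo with made-up reads; for real data pass open file handles,
# e.g. first_mismatch(open("1.fastq"), open("2.fastq")).
mate1 = ["@readA/1\n", "ACGT\n", "+\n", "IIII\n", "@readB/1\n", "ACGT\n", "+\n", "IIII\n"]
mate2 = ["@readA/2\n", "TGCA\n", "+\n", "IIII\n", "@readC/2\n", "TGCA\n", "+\n", "IIII\n"]
print(first_mismatch(mate1, mate2))
```

A result of `None` means every record lined up; a tuple points at the first record where the two files diverge, including the case where one file is shorter than the other.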


Thank you very much.

It is really bizarre that after trying different parameters I still get the same result. Even with only 2175 linking pairs available for scaffolding, shouldn't SSPACE still be able to merge some contigs?
I didn't rename any FASTQ headers; I checked them with the head and tail commands and confirmed that the two files are still correctly paired. Thanks!
 

written 2.3 years ago by pbigbig190