Question: Abyss-pe: contigs and scaffolds are identical
joy72511 wrote (9 months ago):

Hi,

I have run abyss-pe (v2.2.4) with several k-mer sizes (31-69) on Illumina paired-end reads (2×150 bp). For every k, the resulting contigs.fa and scaffolds.fa are identical. Is this normal?

Thanks for your help.

Joy

abyss-pe command:

nohup abyss-pe k=51 v=-v name=x32 in='../trimmed/Xylem32_R1_trimmed.fq ../trimmed/Xylem32_R2_trimmed.fq' &> a32.51.oe &

abyss-fac:

abyss-fac   x32-unitigs.fa x32-contigs.fa x32-scaffolds.fa |tee x32-stats.tab

n        n:500  L50   min  N75  N50  N25  E-size  max   sum      name
2404872  9439   3962  500  536  585  693  683     6526  5884968  x32-unitigs.fa
2404832  9431   3957  500  536  585  694  687     6526  5888336  x32-contigs.fa
2404832  9431   3957  500  536  585  694  687     6526  5888336  x32-scaffolds.fa

abyss-map:

abyss-map -v  -j2 -l40    ../trimmed/Xylem32_R1_trimmed.fq ../trimmed/Xylem32_R2_trimmed.fq x32-6.fa \
    |abyss-fixmate -v  -l40  -h x32-6.hist \
    |sort -snk3 -k4 \
    |DistanceEst -v  --dot --median -j2 -k51  -l40 -s1000 -n10  -o x32-6.dist.dot x32-6.hist
Reading from standard input...
Reading `x32-6.fa'...
Using 202 MB of memory and 83.9 B/sequence.
Reading `x32-6.fa'...
Building the suffix array...
Building the Burrows-Wheeler transform...
Building the character occurrence table...
Read 286 MB in 2404832 contigs.
Using 2.71 GB of memory and 9.49 B/bp.
Read 1000000 alignments. Hash load: 0 / 4 = 0 using 369 kB.
Read 2000000 alignments. Hash load: 2 / 4 = 0.5 using 369 kB.
Read 3000000 alignments. Hash load: 0 / 4 = 0 using 369 kB.
Read 4000000 alignments. Hash load: 0 / 4 = 0 using 369 kB.
Read 5000000 alignments. Hash load: 0 / 4 = 0 using 369 kB.
Read 6000000 alignments. Hash load: 0 / 4 = 0 using 369 kB.
Read 7000000 alignments. Hash load: 0 / 4 = 0 using 369 kB.
Read 8000000 alignments. Hash load: 0 / 4 = 0 using 369 kB.
Read 9000000 alignments. Hash load: 0 / 4 = 0 using 369 kB.
Read 10000000 alignments. Hash load: 0 / 4 = 0 using 369 kB.
Read 11000000 alignments. Hash load: 0 / 4 = 0 using 369 kB.
Read 12000000 alignments. Hash load: 0 / 4 = 0 using 369 kB.
Read 13000000 alignments. Hash load: 0 / 4 = 0 using 369 kB.
Read 14000000 alignments. Hash load: 2 / 4 = 0.5 using 369 kB.
Read 15000000 alignments. Hash load: 0 / 4 = 0 using 369 kB.
Read 16000000 alignments. Hash load: 2 / 4 = 0.5 using 369 kB.
Read 17000000 alignments. Hash load: 2 / 4 = 0.5 using 369 kB.

Mapped 15254206 of 17394198 reads (87.7%)
Mapped 13611151 of 17394198 reads uniquely (78.3%)
Read 17394198 alignments
Mateless         0
Unaligned   683759  7.86%
Singleton   772474  8.88%
FR         2758829  31.7%
RF              61  0.000701%
FF              21  0.000241%
Different  4481955  51.5%
Total      8697099
Tags: abyss, abyss-pe, assembly
Written 9 months ago by joy72511 • modified 9 months ago by h.mon

Usually this is indeed not normal behaviour (though it can happen).

Before I can give a conclusive answer: could you post the complete run log of the ABySS pipeline?

And do you mean that every k-mer size gave the same result, or that for each k-mer size the contigs and scaffolds were identical?

Can we correctly assume you're working with genome data, by the way (and thus not transcriptome)?

Reply written 9 months ago by lieven.sterck (modified 9 months ago)

Thanks for your reply. The complete run log is too long to post here, and I don't know how to attach it. I also posted the question on Google Groups, which accepts file attachments (https://groups.google.com/forum/#!topic/abyss-users/SyTgYAj_iDU). Different k-mer sizes gave different results, but within each run the contigs and scaffolds were the same. The data is genomic DNA captured with probes designed from a transcriptome.

Reply written 9 months ago by joy72511

I see, and I had a look at the Google Groups post as well.

What Lauren mentioned there is exactly what I was referring to (and would also have been my suggestion).

Concerning your data: so this is not a full-genome WGS dataset, but captured data? If so, it's not surprising to have such low stats. For an average conifer genome (and I do have quite some experience with those) the assembly result is very, very small, something like 1000 to 5000 times too small.

Can you confirm again that you are doing genome assembly and not transcriptome assembly?

Reply written 9 months ago by lieven.sterck
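
To put the "1000 to 5000 times too small" remark into numbers, here is a rough back-of-the-envelope sketch. It multiplies the scaffold sum reported by abyss-fac in the question (5,888,336 bp, counting only sequences of at least 500 bp) by those two factors to show the whole-genome size range the comment implies. The function name is purely illustrative and not part of ABySS.

```python
# Back-of-the-envelope check of the "1000 to 5000 times too small" remark.
# ASSEMBLY_SUM_BP is the "sum" column for x32-scaffolds.fa in the abyss-fac
# table from the question (sequences of at least 500 bp only).
ASSEMBLY_SUM_BP = 5_888_336


def implied_genome_size_gbp(assembly_sum_bp, fold):
    """Genome size (in Gbp) implied if the assembly is `fold` times too small."""
    return assembly_sum_bp * fold / 1e9


low = implied_genome_size_gbp(ASSEMBLY_SUM_BP, 1000)
high = implied_genome_size_gbp(ASSEMBLY_SUM_BP, 5000)
print(f"Implied genome size: {low:.1f} to {high:.1f} Gbp")
```

For probe-captured data only the targeted regions are expected to assemble, so a total far below that whole-genome range is not surprising in itself.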

I'm not sure what you mean by genome assembly. Does it mean that the sequences are used to assemble a whole genome? I didn't have the budget for whole-genome assembly. The data were sequenced from reduced-representation libraries: the libraries are gDNA captured by probes. I am assembling the sequences in order to call variants. Thank you for your advice and help.

Reply written 9 months ago by joy72511

So you have a reference genome?

Reply written 9 months ago by WouterDeCoster
Mensur Dlakic (USA) wrote (9 months ago):

It is normal to have the same number of contigs and scaffolds at the end of assembly.

Answer written 9 months ago by Mensur Dlakic

Thanks for your reply.

Reply written 9 months ago by joy72511
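
Identical abyss-fac statistics show that the counts and length distributions match, not necessarily that no scaffolding happened. One quick way to check is to count how many scaffold records contain a gap. The sketch below assumes that joined contigs are separated by runs of `N` in the scaffolds FASTA (an assumption about the output format, not something shown in this thread); `parse_fasta` and `count_gapped` are hypothetical helper names.

```python
def parse_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def count_gapped(lines):
    """Return (total records, records containing at least one N)."""
    total = gapped = 0
    for _, seq in parse_fasta(lines):
        total += 1
        # ASSUMPTION: a scaffold that joins contigs has N characters in the gap.
        if "N" in seq.upper():
            gapped += 1
    return total, gapped


# Toy records, made up for illustration (not taken from the run above).
example = [">1 5 10", "ACGTA", ">2 12 30", "ACGTNNNNACGT"]
print(count_gapped(example))  # (2, 1)
```

On real output one would pass an open file handle, for example `count_gapped(open("x32-scaffolds.fa"))`. A gapped count of zero would be consistent with no contigs having been joined.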
h.mon (Brazil) wrote (9 months ago):

The fact that the contigs and scaffolds are identical suggests the average insert size of the library is too short: probably most paired reads overlap, so there is no "jumping" information available. I don't have experience with ABySS, but SPAdes is able to scaffold a few contigs if the insert size is not too small.

On a side note: what are you assembling? The sum of the contig lengths suggests a bacterial genome; if that is the case, you have a very, very poor assembly.

Answer written 9 months ago by h.mon

Thanks for your reply. The insert size is indeed smaller than I set. I assumed it would be 200 bp, but it is actually less than 100 bp. The data is conifer genomic data captured with probes designed from a transcriptome.

Reply written 9 months ago by joy72511
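
The insert-size point can be checked against the fragment-size histogram that abyss-fixmate writes with `-h` (x32-6.hist in the command above). The sketch below is a minimal illustration, assuming the file holds two whitespace-separated integer columns, fragment size and count; that layout is an assumption here, and the helper names are hypothetical. Mates overlap whenever the fragment is shorter than twice the read length, which for 2×150 bp reads means anything under 300 bp.

```python
def parse_hist(lines):
    """Parse 'size count' lines into (size, count) tuples, skipping other lines."""
    hist = []
    for line in lines:
        fields = line.split()
        if len(fields) != 2:
            continue
        try:
            hist.append((int(fields[0]), int(fields[1])))
        except ValueError:
            continue  # not a pair of integers, e.g. a header line
    return hist


def weighted_median(hist):
    """Median fragment size, weighting each size by its count."""
    total = sum(count for _, count in hist)
    running = 0
    for size, count in sorted(hist):
        running += count
        if running * 2 >= total:
            return size
    raise ValueError("empty histogram")


def fraction_overlapping(hist, read_len):
    """Fraction of pairs whose fragment is shorter than 2 * read_len."""
    total = sum(count for _, count in hist)
    short = sum(count for size, count in hist if size < 2 * read_len)
    return short / total


# Toy histogram, made up for illustration (NOT values from x32-6.hist).
toy = parse_hist(["80\t50", "95\t30", "320\t20"])
print(weighted_median(toy), fraction_overlapping(toy, 150))
```

If nearly all pairs fall below twice the read length, as a sub-100 bp insert with 150 bp reads would imply, the two mates cover essentially the same bases and cannot bridge between contigs.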