5.2 years ago

I'm running an abyss-pe job on trimmed paired-end MiSeq data from a sample with an approximate genome size of 120 Mb. I trimmed my data using Trimmomatic, which gives 4 output files from the 2 original input files: 2 files of paired sequences (forward and reverse) and 2 files of unpaired sequences (forward and reverse).

I've been having problems assembling the 4 files: abyss seems to keep reading the files endlessly without finishing.

The code I use is:

abyss-pe k=11 np=8 -C directory/for/my/output name=name_of_my_output lib='lib1' lib1='/path/to/forward_paired_trimmed.fastq.gz /path/to/reverse_paired_trimmed.fastq.gz' se='/path/to/forward_unpaired_trimmed.fastq.gz /path/to/reverse_unpaired_trimmed.fastq.gz'


I've also tried the following with the same result:

abyss-pe k=11 np=8 -C directory/for/my/output name=name_of_my_output in='/path/to/forward_paired_trimmed.fastq.gz /path/to/reverse_paired_trimmed.fastq.gz' se='/path/to/forward_unpaired_trimmed.fastq.gz /path/to/reverse_unpaired_trimmed.fastq.gz'


However, when I run the analysis without the 'se' parameter, it finishes reading the reads in approximately 30 min:

abyss-pe k=11 np=8 -C directory/for/my/output name=name_of_my_output in='/path/to/forward_paired_trimmed.fastq.gz /path/to/reverse_paired_trimmed.fastq.gz'


Any idea why abyss has problems reading in the files when I add the 'se' parameter?

EDIT: I'm using the latest abyss version 1.9.0

EDIT 2: abyss-pe has now been running for 1250 min and is still reading in the files. I have verbose output turned on (-v=vv); a screenshot of the verbose output was attached here (screenshot not preserved).

5.2 years ago
mastal511 ★ 2.1k

Your commands look OK. The abyss-pe documentation does say that se data would considerably slow down the abyss-fixmate stage.

Yes, I read that, but it's not even getting to that point. It just 'hangs' while reading in the files. I started one analysis on Friday, and it was still reading the files this morning (Monday). I killed the process to play around with the settings and try to figure out what the issue is, so far without success.
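One way to narrow down a hang like this without waiting days is to rerun the same abyss-pe command on a small subset of each input file. The following is only a sketch: `subset_fastq_gz` is a hypothetical helper (not part of ABySS), and it assumes standard 4-line FASTQ records, so that taking the same number of reads from the forward and reverse paired files keeps the pairs in sync.

```shell
# Hypothetical helper (not part of ABySS): keep the first N reads of a gzipped FASTQ file.
# Assumes every record spans exactly 4 lines (header, sequence, '+', qualities).
#   $1 = input .fastq.gz   $2 = output .fastq.gz   $3 = number of reads to keep
subset_fastq_gz() {
    gzip -dc "$1" | head -n "$(( $3 * 4 ))" | gzip > "$2"
}
```

For example, `subset_fastq_gz /path/to/forward_paired_trimmed.fastq.gz forward_subset.fastq.gz 100000` writes the first 100000 reads to a new file (the output name and read count are placeholders). If the subset run also stalls at the read-loading step, the amount of data is unlikely to be the cause.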

5.2 years ago

OK, I figured out what was happening, but I'm still confused.

I previously had abyss deployed with maxk set to 256. Redeploying it with maxk set to 96 did the trick. The strange thing is that the ABySS GitHub README.md says this:

The default maximum k-mer size is 64 and may be decreased to reduce memory usage or increased at compile time. This value must be a multiple of 32 (i.e. 32, 64, 96, 128, etc):

./configure --enable-maxk=96

From that I would assume any multiple of 32 should be possible, but apparently it's not. Does anyone have any idea what the real maximum maxk setting is?
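The redeploy described above could be scripted roughly as follows. This is a sketch, not an official procedure: `is_valid_maxk` and `rebuild_abyss` are hypothetical helper names, and the check only enforces the README's stated constraint (a positive multiple of 32). It says nothing about whether a given value, such as 256, will actually run without hanging.

```shell
# Hypothetical helper (not part of ABySS): succeeds only if $1 is a positive
# multiple of 32, the constraint the README states for --enable-maxk.
is_valid_maxk() {
    case "$1" in
        ''|0*|*[!0-9]*) return 1 ;;   # reject empty, zero/leading-zero, or non-numeric input
    esac
    [ $(( $1 % 32 )) -eq 0 ]
}

# Hypothetical wrapper: rebuild ABySS with the given maxk.
# Must be run from the ABySS source directory; assumes the usual configure/make flow.
rebuild_abyss() {
    is_valid_maxk "$1" || { echo "maxk must be a positive multiple of 32: $1" >&2; return 1; }
    ./configure --enable-maxk="$1" && make && make install
}
```

Calling `rebuild_abyss 96` from the source directory would then run the same `./configure --enable-maxk=96` step quoted from the README, followed by the build and install.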

5.2 years ago
mastal511 ★ 2.1k

See this discussion of the problem in a previous thread:

ABySS assembly runs forever

Thanks. I'll try adding those Open MPI export lines to my script to see if that helps with larger maxk values, and I'll post the results here.