Question: ABySS assembly runs forever
4.0 years ago by
karaskova10
karaskova10 wrote:

I recently installed ABySS version 1.5.2.

The installation did not produce any errors and executables were correctly produced.

 

When I run it on a small dataset, it never finishes.

Version 1.3.7 assembles the same dataset on 8 processors in about ten minutes. Version 1.5.2 is still running after 15 hours with the same parameters, including k=71. I also tried different k-mer sizes and hit the same problem.

 

I installed it using the following commands:

./configure  --with-mpi=${LIB}/lib/openmpi/1.6.1 --with-boost=${LIB}/lib/boost/1.55/include CPPFLAGS=-I${LIB}/lib/sparsehash/2.0.2/include --enable-maxk=128

 

The command I issued is below (right now I only have logs for k=191, from an installation built with a maximum k of 256):

abyss-pe k=191 l=1 verbose=-v aligner=map b=1000000 p=0.95 s=500 np=8 n=10 name='Medicago_truncatula_k191_b1000000_p0.95_s500' lib='lib1' lib1='${PATH}/10B23_S36_L001_clean_1.fastq.gz ${PATH}/10B23_S36_L001_clean_2.fastq.gz' Medicago_truncatula_k191_b1000000_p0.95_s500-1.fa >Medicago_truncatula_k191_b1000000_p0.95_s500.std 2>Medicago_truncatula_k191_b1000000_p0.95_s500.err

 

Here stderr:

--------------------------------------------------------------------------

[[21301,1],1]: A high-performance Open MPI point-to-point messaging module

was unable to find any relevant network interfaces:

 

Module: OpenFabrics (openib)

  Host: tocai

 

Another transport will be used instead, although this may result in

lower performance.

--------------------------------------------------------------------------

[tocai:32056] 7 more processes have sent help message help-mpi-btl-base.txt / btl:no-nics

[tocai:32056] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages

`/projects/igats/DNA-seq/2015/2015-02_Medicago_truncatula-Calderini/lanes/sequences/trimmed/10B23_S36_L001_clean_2.fastq.gz': discarded 234529 reads shorter than 191 bases

`/projects/igats/DNA-seq/2015/2015-02_Medicago_truncatula-Calderini/lanes/sequences/trimmed/10B23_S36_L001_clean_2.fastq.gz': discarded 9783 reads containing non-ACGT characters

`/projects/igats/DNA-seq/2015/2015-02_Medicago_truncatula-Calderini/lanes/sequences/trimmed/10B23_S36_L001_clean_1.fastq.gz': discarded 197966 reads shorter than 191 bases

 

 

 

Here stdout:

/iga/stratocluster/packages/lib/openmpi/1.6.1/bin/mpirun -np 8 ABYSS-P -k191 -q3 -b1000000   --coverage-hist=coverage.hist -s Medicago_truncatula_k191_b1000000_p0.95_s500-bubbles.fa  -o Medicago_truncatula_k191_b1000000_p0.95_s500-1.fa /projects/igats/DNA-seq/2015/2015-02_Medicago_truncatula-Calderini/lanes/sequences/trimmed/10B23_S36_L001_clean_1.fastq.gz /projects/igats/DNA-seq/2015/2015-02_Medicago_truncatula-Calderini/lanes/sequences/trimmed/10B23_S36_L001_clean_2.fastq.gz

ABySS 1.5.2

ABYSS-P -k191 -q3 -b1000000 --coverage-hist=coverage.hist -s Medicago_truncatula_k191_b1000000_p0.95_s500-bubbles.fa -o Medicago_truncatula_k191_b1000000_p0.95_s500-1.fa /projects/igats/DNA-seq/2015/2015-02_Medicago_truncatula-Calderini/lanes/sequences/trimmed/10B23_S36_L001_clean_1.fastq.gz /projects/igats/DNA-seq/2015/2015-02_Medicago_truncatula-Calderini/lanes/sequences/trimmed/10B23_S36_L001_clean_2.fastq.gz

Running on 8 processors

4: Running on host tocai

0: Running on host tocai

2: Running on host tocai

3: Running on host tocai

6: Running on host tocai

7: Running on host tocai

5: Running on host tocai

1: Running on host tocai

0: Reading `/projects/igats/DNA-seq/2015/2015-02_Medicago_truncatula-Calderini/lanes/sequences/trimmed/10B23_S36_L001_clean_1.fastq.gz'...

1: Reading `/projects/igats/DNA-seq/2015/2015-02_Medicago_truncatula-Calderini/lanes/sequences/trimmed/10B23_S36_L001_clean_2.fastq.gz'...

0: Loaded 2077319 k-mer.

4: Loaded 2039873 k-mer.

2: Loaded 2063957 k-mer.

6: Loaded 2062170 k-mer.

7: Loaded 2062877 k-mer.

3: Loaded 2017986 k-mer.

5: Loaded 2058918 k-mer.

1: Loaded 2057399 k-mer.

Loaded 16440499 k-mer. At least 1.18 GB of RAM is required.

Minimum k-mer coverage is 20

Using a coverage threshold of 16...

The median k-mer coverage is 252

The reconstruction is 103086

The k-mer coverage threshold is 15.8745

Setting parameter e (erode) to 16

Setting parameter E (erodeStrand) to 1

Setting parameter c (coverage) to 15.8745

Finding adjacenct k-mer...

 

 

I believe I can ignore the warning I had in the standard error. I always got it with ABySS 1.3.7 and it is probably due to my hardware infrastructure.

 

Standard output always gets stuck at "Finding adjacenct k-mer..." after the coverage.hist file has been computed. Memory usage stays low while all 8 CPUs run at 100%.

 

Any hint on what might be the problem?

 

Thanks in advance,

Simone

abyss assembly • 3.0k views
ADD COMMENTlink modified 3.6 years ago by neaptide0 • written 4.0 years ago by karaskova10
4.0 years ago by
benv710
Canada
benv710 wrote:

ABySS has a known issue with deadlocking when using higher k values, and I suspect that may be the source of your problem.

Fortunately, there is a workaround.  From the little-known ABySS User's FAQ:

--- BEGIN QUOTE ---

2. My ABySS assembly jobs hang when I run them with high k values! (e.g. k=250)

The way that OpenMPI handles messages changes when the message sizes exceeded a certain size called the "eager send limit". In ABySS, message size depends directly on k, and when the eager send limit is exceeded, assembly jobs will deadlock.

The best workaround for this problem is to explicitly set the eager send limit. This can be done by setting an environment variable called mpirun in your cluster job script.

Example:

#!/bin/sh
PATH=/home/joe/abyss-1.3.7/maxk_96/bin:$PATH
export mpirun="mpirun --mca btl_sm_eager_limit 16000 --mca btl_openib_eager_limit 16000"
abyss-pe k=96 name=assembly in='read1.fastq read2.fastq'

The values for the btl_sm_eager_limit and btl_openib_eager_limit are in bytes, and it is usually fine to set them both to the same value. The formula for determining the appropriate value is:

eager_limit >= (max_k/4 + 32) * 100

--- END QUOTE ---
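
As a rough illustration of the formula quoted above, here is a minimal sketch that computes the smallest eager send limit for a given maximum k. The function name eager_limit is made up for this example and is not part of ABySS or Open MPI; the division is rounded up so that integer arithmetic never lands below the formula's value.

```shell
#!/bin/sh
# Hypothetical helper (not part of ABySS): smallest eager send limit in bytes
# that satisfies the FAQ formula  eager_limit >= (max_k/4 + 32) * 100.
eager_limit() {
    max_k=$1
    # Round max_k/4 up so integer division never undershoots the formula.
    echo $(( ((max_k + 3) / 4 + 32) * 100 ))
}

# Example: a build configured with a maximum k of 256.
limit=$(eager_limit 256)
echo "minimum eager limit: $limit bytes"

# The value could then be passed on the way the FAQ example does:
# export mpirun="mpirun --mca btl_sm_eager_limit $limit --mca btl_openib_eager_limit $limit"
```

For a maximum k of 256 this gives 9600 bytes, so the 16000 used in the FAQ example leaves some headroom.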

ADD COMMENTlink modified 4.0 years ago • written 4.0 years ago by benv710

Thanks a lot! It worked!

But now I have another problem on later stages while running the following command:

abyss-map  -j8 -l1    ${PATH}/test_1.fastq ${PATH}/test_2.fastq Medicago_truncatula_k191_b1000000_p0.95_s500-3.fa
Building the suffix array...
Building the Burrows-Wheeler transform...
Building the character occurrence table...
@HD     VN:1.4
@PG     ID:abyss-map    PN:abyss-map    VN:1.5.2        CL:abyss-map -j8 -l1 ${PATH}/test_1.fastq ${PATH}/test_2.fastq Medicago_truncatula_k191_b1000000_p0.95_s500-3.fa
@SQ     SN:0    LN:8068
@SQ     SN:1    LN:9253
@SQ     SN:2    LN:495
@SQ     SN:3    LN:469
@SQ     SN:5    LN:2084
@SQ     SN:6    LN:3465
@SQ     SN:7    LN:4421
@SQ     SN:8    LN:4802
@SQ     SN:9    LN:5369
@SQ     SN:10   LN:11089
@SQ     SN:11   LN:200
@SQ     SN:12   LN:827
@SQ     SN:13   LN:1422
@SQ     SN:14   LN:420
@SQ     SN:15   LN:426
@SQ     SN:16   LN:1171
@SQ     SN:17   LN:1993
@SQ     SN:18   LN:6176
@SQ     SN:19   LN:9074
@SQ     SN:20   LN:242
@SQ     SN:21   LN:491
@SQ     SN:22   LN:1037
@SQ     SN:25   LN:2342
@SQ     SN:26   LN:225
@SQ     SN:27   LN:597
@SQ     SN:28   LN:1283
@SQ     SN:29   LN:1264
@SQ     SN:30   LN:6966
@SQ     SN:32   LN:209
@SQ     SN:33   LN:495
@SQ     SN:34   LN:746
@SQ     SN:35   LN:1123
@SQ     SN:36   LN:1748
@SQ     SN:37   LN:2697
@SQ     SN:38   LN:12519
@SQ     SN:39   LN:2137
Illegal instruction (core dumped)

 

I am trying abyss-bwamem and it works!

 

ADD REPLYlink written 4.0 years ago by karaskova10

Hmm. Someone else reported that error recently too. It looks like it is caused by abyss-map trying to use the 'popcnt' instruction, which isn't available on all processors. No fix is available at the moment. The best workaround is to use a different aligner (as you have already figured out!)
 

ADD REPLYlink written 4.0 years ago by benv710

https://github.com/bcgsc/abyss/blob/master/Common/BitUtil.h#L29

static inline bool havePopcnt() { return cpuid(1).c & (1 << 23); }

This line is meant to check whether the CPU has a popcnt instruction. It doesn't seem to work on your system. Can you please report the output of

uname -a

head /proc/cpuinfo
ADD REPLYlink written 4.0 years ago by Shaun Jackman420

scalabrin@tocai:~$ uname -a

Linux tocai 2.6.32-504.8.1.el6.x86_64 #1 SMP Wed Jan 28 21:11:36 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

scalabrin@tocai:~$ head /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 15
model name      : Intel(R) Xeon(R) CPU           E7320  @ 2.13GHz
stepping        : 11
microcode       : 187
cpu MHz         : 1603.000
cache size      : 2048 KB
physical id     : 0

ADD REPLYlink written 4.0 years ago by karaskova10

The Xeon E7320 does not have SSE4 and I believe does not support the popcnt instruction. http://www.cpu-upgrade.com/CPUs/Intel/Xeon/E7320.html

ABySS checks the CPUID instruction to see whether the popcnt instruction is supported. https://en.wikipedia.org/wiki/SSE4#POPCNT_and_LZCNT

My best guess is that possibly the compiler is magically inserting a popcnt instruction during optimization, though I would find that very surprising. Did you compile ABySS and are you running ABySS on the same machine? Which compiler did you use? What configure options did you use?

Cheers,

Shaun

 

ADD REPLYlink written 4.0 years ago by Shaun Jackman420

Dear Shaun,

I compiled and ran ABySS on different machines.

The build machine is:

scalabrin@builder:~$ uname -a
Linux builder 2.6.32-504.8.1.el6.x86_64 #1 SMP Wed Jan 28 21:11:36 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
scalabrin@builder:~$ head /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 45
model name      : Intel(R) Xeon(R) CPU E5-2650 0 @ 2.00GHz
stepping        : 2
microcode       : 1808
cpu MHz         : 2000.000
cache size      : 20480 KB
physical id     : 0

which looks very similar to the machine I used to run ABySS.

 

Compiler:

gcc version 4.8.2

Configure command:

./configure  --with-mpi=${LIB}/lib/openmpi/1.6.1 --with-boost=${LIB}/lib/boost/1.55/include CPPFLAGS=-I${LIB}/lib/sparsehash/2.0.2/include --enable-maxk=128

ADD REPLYlink written 4.0 years ago by karaskova10

Ah, mystery solved. This bug is fixed in the master branch of ABySS but has not been released. I should have checked this earlier. Sorry for the confusion. You can download an unreleased tarball of ABySS from GitHub here:

https://github.com/bcgsc/abyss/archive/master.tar.gz

No, sorry. This fix is in 1.5.2. Just as a sanity check, can you report the output of `abyss-map --version`?

ADD REPLYlink modified 4.0 years ago • written 4.0 years ago by Shaun Jackman420

abyss-map (ABySS) 1.5.2
Written by Shaun Jackman.

Copyright 2014 Canada's Michael Smith Genome Sciences Centre

Thanks,

Simone

ADD REPLYlink written 4.0 years ago by karaskova10

Please report `grep -m1 flags /proc/cpuinfo`

Sorry for the slow progress on this issue. I can't replicate it, so it's tricky to troubleshoot.

ADD REPLYlink written 4.0 years ago by Shaun Jackman420

scalabrin@tocai:~$ grep -m1 flags /proc/cpuinfo
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca lahf_lm dts tpr_shadow vnmi flexpriority

Thanks for your help, I really appreciate it. No hurry!

ADD REPLYlink written 4.0 years ago by karaskova10

The flags do not list popcnt, as I expected, so the open question is why ABySS is attempting to use an instruction that is not available.
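
For anyone who wants to run the same check on their own node, here is a minimal sketch that looks for a flag in the first "flags" line of /proc/cpuinfo. The function name has_cpu_flag is invented for this example, and it assumes a Linux system and a grep that supports -m and -w.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the first "flags" line of a cpuinfo-style
# file lists the given flag as a whole word. The file defaults to /proc/cpuinfo.
has_cpu_flag() {
    flag=$1
    file=${2:-/proc/cpuinfo}
    # Keep only the part after the colon, then match the flag as a whole word
    # so that "sse" does not match "ssse3".
    grep -m1 '^flags' "$file" 2>/dev/null | cut -d: -f2 | grep -qw -- "$flag"
}

if has_cpu_flag popcnt; then
    echo "popcnt: listed in CPU flags"
else
    echo "popcnt: not listed in CPU flags"
fi
```

On a machine with the flags line posted above, this should report that popcnt is not listed. Running it on both the build machine and the compute node makes a mismatch between the two easy to spot.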

ADD REPLYlink written 4.0 years ago by Shaun Jackman420
3.6 years ago by
neaptide0
United States
neaptide0 wrote:

We just had exactly the same problem running abyss-pe (v 1.5.2) as a parallel job on a cluster with LSF (bsub), as described in @karaskova's original post and @benv's reply. The solution posted here worked for us even with the newer version of ABySS.

I wanted to add some information in case someone else lands here with the same issue and cluster system: you will probably also have to export OMP_NUM_THREADS, since abyss-pe is a hybrid of MPI and OpenMP (stage 1 uses MPI and can run on many CPUs across multiple hosts; stage 2 is threaded on one host).

# For OpenMP
export OMP_NUM_THREADS=4
# Set the MPI eager send limit
export mpirun="mpirun --mca btl_sm_eager_limit 16000 --mca btl_openib_eager_limit 16000"
# Run 4 CPUs on one host (abyss-pe takes np as a name=value parameter)
bsub -n 4 -R "span[hosts=1]" abyss-pe np=4 in='your data'
# Run 2 CPUs on each of two hosts
bsub -n 4 -R "span[ptile=2]" abyss-pe np=4 in='your data'

 

Thanks so much for posting this issue. I would have wasted much more time had this not been raised here.  I even glanced at the FAQ at some time earlier but missed the relevance. 

ADD COMMENTlink written 3.6 years ago by neaptide0

You can also specify the number of MPI processes and the number of OpenMP threads separately, using the abyss-pe parameters np and j respectively. For example,

abyss-pe np=64 j=12

uses 64 MPI processes and 12 OpenMP threads.

ADD REPLYlink written 3.6 years ago by Shaun Jackman420