ABYSS stuck on "Finding adjacent k-mer..."
7.3 years ago
sasha ▴ 70

I just compiled ABySS from GitHub on my Red Hat Linux cluster (which uses SLURM), but it hangs at the "Finding adjacenct k-mer..." step when configured to run with multiple processors. Unlike in previous posts on this topic, the hang is not related to the k-mer length, since it also occurs on the test data set.

Here are the configure options, using gcc (GCC) 4.8.2 20140120 (Red Hat 4.8.2-15):

 ./configure \
  --prefix=/apps/unit/MikheyevU/sasha/abyss-gcc \
  --enable-maxk=288 \
  --with-mpi=/apps/free/openmpi.gcc/1.8.6 \
  --with-boost=/apps/free/boost/1.57.0/include/boost

Here is the command used to run ABySS. The two export statements (OMP_NUM_THREADS and mpirun) were added later based on older posts, but they do not help.

sbatch \
  --time 7-00:00:00 \
  --job-name=abyss \
  --mem-per-cpu=1G \
  --partition=largemem \
  --nodes=4 \
  --ntasks-per-node=4 \
  --wrap "module load openmpi.gcc/1.8.6; export OMP_NUM_THREADS=4; export mpirun=\"mpirun --mca btl_sm_eager_limit 16000 --mca btl_openib_eager_limit 16000\"; abyss-pe aligner=map v=-vv k=25 name=test np=4 j=4 in='reads1.fastq reads2.fastq'"

And here is the output:

mpirun --mca btl_sm_eager_limit 16000 --mca btl_openib_eager_limit 16000 -np 4 ABYSS-P -k25 -q3 -vv   --coverage-hist=coverage.hist -s test-bubbles.fa  -o test-1.fa reads1.fastq reads2.fastq 
3: Running on host sango40102
ABySS 1.9.0
ABYSS-P -k25 -q3 -vv --coverage-hist=coverage.hist -s test-bubbles.fa -o test-1.fa reads1.fastq reads2.fastq
Running on 4 processors
0: Running on host sango40102
1: Running on host sango40102
2: Running on host sango40102
3: SetState 0 (was 18)
1: SetState 0 (was 18)
0: SetState 0 (was 18)
0: Reading `reads1.fastq'...
2: SetState 0 (was 18)
2: LoadSequences: 0 s
2: SetState 18 (was 0)
1: Reading `reads2.fastq'...
3: LoadSequences: 0 s
3: SetState 18 (was 0)
1: Read 20000 reads. 1: Hash load: 226827 / 268435456 = 0.000845 using 176 MB
1: LoadSequences reads2.fastq: 2.65 s
1: LoadSequences: 2.65 s
1: SetState 18 (was 0)
0: Read 20000 reads. 0: Hash load: 235145 / 268435456 = 0.000876 using 176 MB
2: SetState 1 (was 18)
2: Loaded 235349 k-mer.
3: SetState 1 (was 18)
3: Loaded 232481 k-mer.
0: LoadSequences reads1.fastq: 1.54 s
0: LoadSequences: 1.54 s
0: SetState 1 (was 0)
0: Loaded 235145 k-mer.
1: SetState 1 (was 18)
1: Loaded 235138 k-mer.
3: Hash load: 232481 / 1048576 = 0.222 using 45 MB
1: Hash load: 235138 / 1048576 = 0.224 using 51.6 MB
2: Hash load: 235349 / 1048576 = 0.224 using 44.3 MB
0: Hash load: 235145 / 1048576 = 0.224 using 50.6 MB
Loaded 938113 k-mer. At least 75 MB of RAM is required.
1: SetState 18 (was 1)
2: SetState 18 (was 1)
3: SetState 18 (was 1)
Minimum k-mer coverage is 22
0: Coverage: 22    Reconstruction: 307
0: Coverage: 5.48    Reconstruction: 118716
0: Coverage: 2.45    Reconstruction: 227624
0: Coverage: 2.24    Reconstruction: 227624
Using a coverage threshold of 2...
The median k-mer coverage is 5
The reconstruction is 227624
The k-mer coverage threshold is 2.24
Setting parameter e (erode) to 2
Setting parameter E (erodeStrand) to 1
Setting parameter c (coverage) to 2.24
0: SetState 2 (was 1)
Finding adjacenct k-mer...
3: SetState 2 (was 18)
1: SetState 2 (was 18)
2: SetState 2 (was 18)

During this time, all processors are running at 100% and using memory, but there is no further output. Any ideas would be most welcome!
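To check whether every rank actually reached the same point before the output stops, the `SetState` lines in the verbose log can be summarised per rank. The following is a minimal sketch, assuming the `-vv` output was saved to a file (the name `abyss.log` is hypothetical) and that each state change is logged as `<rank>: SetState <new> (was <old>)`, as in the output above. It makes no claim about what the state numbers mean.

```shell
# Print the most recent state reported by each MPI rank in an ABYSS-P -vv log.
# Output: one "<rank> <state>" pair per line, sorted by rank.
last_state_per_rank() {
    awk '/^[0-9]+: SetState [0-9]+/ {
             rank = $1
             sub(":", "", rank)      # "3:" -> "3"
             state[rank] = $3        # later lines overwrite earlier ones
         }
         END {
             for (r in state) print r, state[r]
         }' "$1" | sort -n
}

# Example call (hypothetical file name):
# last_state_per_rank abyss.log
```

For the log above, all four ranks would be listed with state 2, which suggests they all entered the same step rather than one rank lagging behind.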

Assembly abyss • 2.4k views

If I compile ABySS with maxk=96, it does work, as suggested in an old Google Groups post. A more recent post suggests that this is due to limitations of some processors/compilers. I tried both the Intel compilers and gcc with the larger maxk, but no luck.
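A sketch of such a rebuild, assuming the same paths as in the configure call in the question and changing only `--enable-maxk`. The `make` steps are the standard autotools ones and are an assumption, not quoted from the thread:

```shell
# Same options as the original configure call; only maxk is lowered to 96.
./configure \
  --prefix=/apps/unit/MikheyevU/sasha/abyss-gcc \
  --enable-maxk=96 \
  --with-mpi=/apps/free/openmpi.gcc/1.8.6 \
  --with-boost=/apps/free/boost/1.57.0/include/boost
make
make install
```

Note that a build with maxk=96 cannot be used for assemblies with k larger than 96.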
