I recently installed ABySS version 1.5.2.
The installation did not produce any errors and executables were correctly produced.
Running it on a small dataset takes forever.
With version 1.3.7, the same dataset runs on 8 processors in about ten minutes. With version 1.5.2, it is still running after 15 hours with the same parameters, including k=71. I also tried different k-mer sizes, with the same problem.
I installed it using the following commands:
./configure --with-mpi=${LIB}/lib/openmpi/1.6.1 --with-boost=${LIB}/lib/boost/1.55/include CPPFLAGS=-I${LIB}/lib/sparsehash/2.0.2/include --enable-maxk=128
The command I issued is below (right now I only have logs for k=191, with the k=256 installation):
abyss-pe k=191 \
l=1 \
verbose=-v \
aligner=map \
b=1000000 \
p=0.95 \
s=500 \
np=8 \
n=10 \
name='Medicago_truncatula_k191_b1000000_p0.95_s500' lib='lib1' \
lib1='${PATH}/10B23_S36_L001_clean_1.fastq.gz ${PATH}/10B23_S36_L001_clean_2.fastq.gz' \
Medicago_truncatula_k191_b1000000_p0.95_s500-1.fa \
>Medicago_truncatula_k191_b1000000_p0.95_s500.std \
2>Medicago_truncatula_k191_b1000000_p0.95_s500.err
Here is stderr:
--------------------------------------------------------------------------
[[21301,1],1]: A high-performance Open MPI point-to-point messaging module
was unable to find any relevant network interfaces:
Module: OpenFabrics (openib)
Host: tocai
Another transport will be used instead, although this may result in
lower performance.
--------------------------------------------------------------------------
[tocai:32056] 7 more processes have sent help message help-mpi-btl-base.txt / btl:no-nics
[tocai:32056] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
`/projects/igats/DNA-seq/2015/2015-02_Medicago_truncatula-Calderini/lanes/sequences/trimmed/10B23_S36_L001_clean_2.fastq.gz': discarded 234529 reads shorter than 191 bases
`/projects/igats/DNA-seq/2015/2015-02_Medicago_truncatula-Calderini/lanes/sequences/trimmed/10B23_S36_L001_clean_2.fastq.gz': discarded 9783 reads containing non-ACGT characters
`/projects/igats/DNA-seq/2015/2015-02_Medicago_truncatula-Calderini/lanes/sequences/trimmed/10B23_S36_L001_clean_1.fastq.gz': discarded 197966 reads shorter than 191 bases
Here is stdout:
/iga/stratocluster/packages/lib/openmpi/1.6.1/bin/mpirun -np 8 ABYSS-P -k191 -q3 -b1000000 --coverage-hist=coverage.hist -s Medicago_truncatula_k191_b1000000_p0.95_s500-bubbles.fa -o Medicago_truncatula_k191_b1000000_p0.95_s500-1.fa /projects/igats/DNA-seq/2015/2015-02_Medicago_truncatula-Calderini/lanes/sequences/trimmed/10B23_S36_L001_clean_1.fastq.gz /projects/igats/DNA-seq/2015/2015-02_Medicago_truncatula-Calderini/lanes/sequences/trimmed/10B23_S36_L001_clean_2.fastq.gz
ABySS 1.5.2
ABYSS-P -k191 -q3 -b1000000 --coverage-hist=coverage.hist -s Medicago_truncatula_k191_b1000000_p0.95_s500-bubbles.fa -o Medicago_truncatula_k191_b1000000_p0.95_s500-1.fa /projects/igats/DNA-seq/2015/2015-02_Medicago_truncatula-Calderini/lanes/sequences/trimmed/10B23_S36_L001_clean_1.fastq.gz /projects/igats/DNA-seq/2015/2015-02_Medicago_truncatula-Calderini/lanes/sequences/trimmed/10B23_S36_L001_clean_2.fastq.gz
Running on 8 processors
4: Running on host tocai
0: Running on host tocai
2: Running on host tocai
3: Running on host tocai
6: Running on host tocai
7: Running on host tocai
5: Running on host tocai
1: Running on host tocai
0: Reading `/projects/igats/DNA-seq/2015/2015-02_Medicago_truncatula-Calderini/lanes/sequences/trimmed/10B23_S36_L001_clean_1.fastq.gz'...
1: Reading `/projects/igats/DNA-seq/2015/2015-02_Medicago_truncatula-Calderini/lanes/sequences/trimmed/10B23_S36_L001_clean_2.fastq.gz'...
0: Loaded 2077319 k-mer.
4: Loaded 2039873 k-mer.
2: Loaded 2063957 k-mer.
6: Loaded 2062170 k-mer.
7: Loaded 2062877 k-mer.
3: Loaded 2017986 k-mer.
5: Loaded 2058918 k-mer.
1: Loaded 2057399 k-mer.
Loaded 16440499 k-mer. At least 1.18 GB of RAM is required.
Minimum k-mer coverage is 20
Using a coverage threshold of 16...
The median k-mer coverage is 252
The reconstruction is 103086
The k-mer coverage threshold is 15.8745
Setting parameter e (erode) to 16
Setting parameter E (erodeStrand) to 1
Setting parameter c (coverage) to 15.8745
Finding adjacenct k-mer...
I believe I can ignore the warning in standard error. I always got it with ABySS 1.3.7 too, and it is probably due to my hardware infrastructure.
Standard output always gets stuck at "Finding adjacenct k-mer..." after the coverage.hist file is computed. Memory usage is limited, while all 8 CPUs run at 100%.
Any hint on what might be the problem?
Thanks in advance,
Simone
Thanks a lot! It worked!
But now I have another problem at later stages while running the following command:
I am trying
abyss-bwamem
and it works!
Hmm. Someone else reported that error recently too. It looks like it is caused by abyss-map trying to use the 'popcnt' instruction, which isn't available on all processors. No fix is available at the moment. The best workaround is to use a different aligner (as you have already figured out!)
https://github.com/bcgsc/abyss/blob/master/Common/BitUtil.h#L29
This line is meant to check whether the CPU has a popcnt instruction. It doesn't seem to work on your system. Can you please report the output of
The Xeon E7320 does not have SSE4, and I believe it does not support the popcnt instruction.
http://www.cpu-upgrade.com/CPUs/Intel/Xeon/E7320.html
ABySS checks the CPUID instruction to see whether the popcnt instruction is supported. https://en.wikipedia.org/wiki/SSE4#POPCNT_and_LZCNT
My best guess is that possibly the compiler is magically inserting a popcnt instruction during optimization, though I would find that very surprising. Did you compile ABySS and are you running ABySS on the same machine? Which compiler did you use? What configure options did you use?
Cheers,
Shaun
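One way to test the guess above, that the compiler emitted popcnt on its own, is to disassemble the installed binary and look for the mnemonic. This is only a minimal sketch: it assumes GNU binutils objdump is installed and abyss-map is on the PATH, and count_popcnt is a hypothetical helper, not part of ABySS.

```shell
# Hypothetical helper (not part of ABySS): count disassembly lines read
# from stdin that use the popcnt mnemonic. In AT&T syntax objdump may
# print a size suffix, e.g. popcntl or popcntq, so those are accepted too.
count_popcnt() {
    # grep -c prints the number of matching lines; it exits non-zero
    # when the count is 0, so "|| true" keeps the function successful.
    grep -cE '[[:space:]]popcnt[lqw]?[[:space:]]' || true
}

# Usage (assumes objdump and abyss-map are available on this machine):
# objdump -d "$(command -v abyss-map)" | count_popcnt
```

A nonzero count only shows that the binary contains popcnt instructions somewhere. It does not by itself show that they are reached without the CPUID check, so it narrows the question rather than answering it.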
Dear Shaun,
I compiled and ran ABySS on different machines.
The compiling machine is here:
which looks very similar to the machine I used to run ABySS.
Compiler:
Configure command:
Ah, mystery solved. This bug is fixed in the master branch of ABySS but has not been released. I should have checked this earlier. Sorry for the confusion. You can download an unreleased tarball of ABySS from GitHub here:
https://github.com/bcgsc/abyss/archive/master.tar.gz
No, sorry. This fix is in 1.5.2. Just as a sanity check, can you report the output of `abyss-map --version`?
abyss-map (ABySS) 1.5.2
Written by Shaun Jackman.
Copyright 2014 Canada's Michael Smith Genome Sciences Centre
Thanks,
Simone
Please report
grep -m1 flags /proc/cpuinfo
Sorry for the slow progress on this issue. I can't replicate it, so it's tricky to troubleshoot.
Thanks for your help, I really appreciate it. No hurry!
There is no popcnt instruction, as I expected, so the question remains why ABySS is attempting to use an instruction that's not available.
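For anyone who wants to repeat the /proc/cpuinfo check on both the build machine and the run machine, here is a small sketch that turns the flags line into a yes/no answer. has_cpu_flag is a hypothetical helper written for this thread, not something ABySS provides, and it assumes a Linux x86 system where /proc/cpuinfo has a "flags" line.

```shell
# Hypothetical helper (not part of ABySS): report whether a CPU flag
# appears in a "flags : ..." line as printed by /proc/cpuinfo.
has_cpu_flag() {
    flag=$1
    flags_line=$2
    # Drop everything up to the first colon, then compare whole words
    # only, so that "sse" does not match "sse2" or "ssse3".
    for f in ${flags_line#*:}; do
        if [ "$f" = "$flag" ]; then
            echo yes
            return 0
        fi
    done
    echo no
    return 1
}

# Usage on Linux (assumes /proc/cpuinfo exists and has a flags line):
# has_cpu_flag popcnt "$(grep -m1 flags /proc/cpuinfo)"
```

If the answer is "yes" where ABySS was compiled and "no" where it runs, that would be consistent with the build picking up popcnt from the compiling machine, though it would not prove it.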