I recently installed ABySS version 1.5.2.
The installation did not produce any errors, and the executables were built correctly.
However, running it on a small dataset takes forever.
With version 1.3.7, the same dataset runs on 8 processors in about ten minutes. With version 1.5.2 and the same parameters, including k=71, it is still running after 15 hours. I also tried different k-mer sizes and hit the same problem.
I configured the installation with the following command:
./configure --with-mpi=${LIB}/lib/openmpi/1.6.1 --with-boost=${LIB}/lib/boost/1.55/include CPPFLAGS=-I${LIB}/lib/sparsehash/2.0.2/include --enable-maxk=128
The command I issued is below (right now I only have logs for k=191, from the installation built for k up to 256):
abyss-pe k=191 l=1 verbose=-v aligner=map b=1000000 p=0.95 s=500 np=8 n=10 name='Medicago_truncatula_k191_b1000000_p0.95_s500' lib='lib1' lib1='${PATH}/10B23_S36_L001_clean_1.fastq.gz ${PATH}/10B23_S36_L001_clean_2.fastq.gz' Medicago_truncatula_k191_b1000000_p0.95_s500-1.fa >Medicago_truncatula_k191_b1000000_p0.95_s500.std 2>Medicago_truncatula_k191_b1000000_p0.95_s500.err
Here is the stderr:
--------------------------------------------------------------------------
[[21301,1],1]: A high-performance Open MPI point-to-point messaging module
was unable to find any relevant network interfaces:
Module: OpenFabrics (openib)
Host: tocai
Another transport will be used instead, although this may result in
lower performance.
--------------------------------------------------------------------------
[tocai:32056] 7 more processes have sent help message help-mpi-btl-base.txt / btl:no-nics
[tocai:32056] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
`/projects/igats/DNA-seq/2015/2015-02_Medicago_truncatula-Calderini/lanes/sequences/trimmed/10B23_S36_L001_clean_2.fastq.gz': discarded 234529 reads shorter than 191 bases
`/projects/igats/DNA-seq/2015/2015-02_Medicago_truncatula-Calderini/lanes/sequences/trimmed/10B23_S36_L001_clean_2.fastq.gz': discarded 9783 reads containing non-ACGT characters
`/projects/igats/DNA-seq/2015/2015-02_Medicago_truncatula-Calderini/lanes/sequences/trimmed/10B23_S36_L001_clean_1.fastq.gz': discarded 197966 reads shorter than 191 bases
Here is the stdout:
/iga/stratocluster/packages/lib/openmpi/1.6.1/bin/mpirun -np 8 ABYSS-P -k191 -q3 -b1000000 --coverage-hist=coverage.hist -s Medicago_truncatula_k191_b1000000_p0.95_s500-bubbles.fa -o Medicago_truncatula_k191_b1000000_p0.95_s500-1.fa /projects/igats/DNA-seq/2015/2015-02_Medicago_truncatula-Calderini/lanes/sequences/trimmed/10B23_S36_L001_clean_1.fastq.gz /projects/igats/DNA-seq/2015/2015-02_Medicago_truncatula-Calderini/lanes/sequences/trimmed/10B23_S36_L001_clean_2.fastq.gz
ABySS 1.5.2
ABYSS-P -k191 -q3 -b1000000 --coverage-hist=coverage.hist -s Medicago_truncatula_k191_b1000000_p0.95_s500-bubbles.fa -o Medicago_truncatula_k191_b1000000_p0.95_s500-1.fa /projects/igats/DNA-seq/2015/2015-02_Medicago_truncatula-Calderini/lanes/sequences/trimmed/10B23_S36_L001_clean_1.fastq.gz /projects/igats/DNA-seq/2015/2015-02_Medicago_truncatula-Calderini/lanes/sequences/trimmed/10B23_S36_L001_clean_2.fastq.gz
Running on 8 processors
4: Running on host tocai
0: Running on host tocai
2: Running on host tocai
3: Running on host tocai
6: Running on host tocai
7: Running on host tocai
5: Running on host tocai
1: Running on host tocai
0: Reading `/projects/igats/DNA-seq/2015/2015-02_Medicago_truncatula-Calderini/lanes/sequences/trimmed/10B23_S36_L001_clean_1.fastq.gz'...
1: Reading `/projects/igats/DNA-seq/2015/2015-02_Medicago_truncatula-Calderini/lanes/sequences/trimmed/10B23_S36_L001_clean_2.fastq.gz'...
0: Loaded 2077319 k-mer.
4: Loaded 2039873 k-mer.
2: Loaded 2063957 k-mer.
6: Loaded 2062170 k-mer.
7: Loaded 2062877 k-mer.
3: Loaded 2017986 k-mer.
5: Loaded 2058918 k-mer.
1: Loaded 2057399 k-mer.
Loaded 16440499 k-mer. At least 1.18 GB of RAM is required.
Minimum k-mer coverage is 20
Using a coverage threshold of 16...
The median k-mer coverage is 252
The reconstruction is 103086
The k-mer coverage threshold is 15.8745
Setting parameter e (erode) to 16
Setting parameter E (erodeStrand) to 1
Setting parameter c (coverage) to 15.8745
Finding adjacenct k-mer...
I believe I can ignore the warning in the standard error. I always got it with ABySS 1.3.7 as well, and it is probably due to my hardware infrastructure.
Standard output always gets stuck at "Finding adjacenct k-mer..." after the coverage.hist file has been computed. Memory usage stays low while all 8 CPUs run at 100%.
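As a sanity check on the stdout above, here is a minimal Python sketch that parses the "Loaded ... k-mer" lines, confirms that the per-process counts add up to the reported total, and measures how evenly the k-mers are spread over the 8 processes. The helper names (per_rank_kmers, reported_total) are made up for illustration, and the log format is assumed to be exactly what is pasted above. The last check only notes that the reported threshold equals the square root of the reported median coverage; whether ABySS actually derives it that way is an assumption, not something verified in the source.

```python
import math
import re

# Lines copied verbatim from the stdout log in this post.
LOG = """\
0: Loaded 2077319 k-mer.
4: Loaded 2039873 k-mer.
2: Loaded 2063957 k-mer.
6: Loaded 2062170 k-mer.
7: Loaded 2062877 k-mer.
3: Loaded 2017986 k-mer.
5: Loaded 2058918 k-mer.
1: Loaded 2057399 k-mer.
Loaded 16440499 k-mer. At least 1.18 GB of RAM is required.
The median k-mer coverage is 252
The k-mer coverage threshold is 15.8745
"""


def per_rank_kmers(log):
    """Return {rank: k-mer count} from the 'N: Loaded M k-mer.' lines."""
    pairs = re.findall(r"^(\d+): Loaded (\d+) k-mer\.", log, re.M)
    return {int(rank): int(count) for rank, count in pairs}


def reported_total(log):
    """Return the total from the un-prefixed 'Loaded M k-mer.' line."""
    match = re.search(r"^Loaded (\d+) k-mer\.", log, re.M)
    return int(match.group(1))


counts = per_rank_kmers(LOG)
total = sum(counts.values())
# Ratio between the most and least loaded process (1.0 = perfectly even).
imbalance = max(counts.values()) / min(counts.values())
median_cov = int(re.search(r"median k-mer coverage is (\d+)", LOG).group(1))

print("processes:", len(counts))
print("sum of per-process counts:", total)
print("reported total:", reported_total(LOG))
print("max/min load ratio: %.4f" % imbalance)
print("sqrt(median coverage): %.4f" % math.sqrt(median_cov))
```

On this log the counts are consistent and the load is nearly even across processes, so the k-mer loading step itself does not look like the source of the hang.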
Any hints on what the problem might be?
Thanks in advance,
Simone