This clearly looks like a file system issue during the DSK k-mer counting step (see the traces and the series of HDF5 errors). Something is going wrong with HDF5 and needs to be investigated: it could be an issue in HDF5 itself, or in the way discosnp uses HDF5.
It doesn't look like a full disk, because there is plenty of free space (disk_current_dir : 157396.8 => 157 GB) compared with the amount of data to write to the output HDF5 file (kmers_nb_solid : 2143612212).
Note however the following traces:
max_file_nb : 32768
nb_partitions : 880
The first line tells how many files can be open at the same time. This number is used to compute "nb_partitions". Since "max_file_nb" is huge here (a more typical value is 1024), "nb_partitions" is huge as well, and I think we have never tried such high values.
Currently, in order to try to understand the issue, I would suggest two ideas:
- try to limit the "max_file_nb" value. Since it is a value set by the operating system, you may need administrator rights to change it system-wide, but lowering it for the current shell session (to 1024 for instance) is normally possible. I think that the "ulimit" shell command does the job.
- try to limit the disk usage by using the -max-disk parameter of the dbgh5 command. I'm not sure that DiscoSnp++ knows this option, so you should first try running dbgh5 directly, with something like:
/home/cmb-02/sn1/tkitapci/software/DiscoSNP++-2.2.0-Source/build//ext/gatb-core/bin/dbgh5 -in buffalo_fof.txt_removemeplease -out /staging/sn1/tkitapci/NOHA/buffalo_variant_call/Buffalo_k_31_c_auto -kmer-size 31 -abundance-min auto -abundance-max 2147483647 -solidity-kind one -max-disk 50000
With the second solution, you should get a lower value for "nb_partitions" and potentially a higher value for "nb_passes". If dbgh5 succeeds with this parameter, we will still have to understand the actual issue.
Can you tell me whether either of the two suggestions works, and provide the output as you did before?
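For the first suggestion, here is a minimal sketch of how the limit could be lowered for a single run. It assumes that "max_file_nb" reflects the per-process open-file limit reported by "ulimit -n", which I have not verified for your system, and it assumes a bash-like shell. The limit is changed inside a subshell only, so your login shell keeps its original value.

```shell
# Show the current soft and hard limits on open files.
ulimit -S -n
ulimit -H -n

# Lower the soft limit to 1024 inside a subshell (command substitution),
# then read it back to confirm that the change took effect there.
lowered=$(ulimit -S -n 1024 && ulimit -S -n)
echo "soft limit inside the subshell: $lowered"

# To run the counting step under the lowered limit, wrap the command
# in a subshell in the same way (replace the placeholder with your
# actual dbgh5 or DiscoSnp++ command line):
#   ( ulimit -S -n 1024 && <your command here> )
```

If the assumption holds, the traces of the wrapped run should then show "max_file_nb : 1024" and a much smaller "nb_partitions".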