Question: Errors running KisSplice during de Bruijn graph construction
9 months ago by
erica10
erica10 wrote:

Hi,

I've been trying to run kissplice on my institution's cluster. I run the following script:

kissplice -t 6 -s 1 -k 51 -v -o kiss_results3 --experimental --max-memory 60000 -r ............

and get a '179971 Bus error', suggesting a memory issue. I've also tried running without the --max-memory flag (so that memory is unlimited), but get this error:

Problem with /nesi/nobackup/nesi00431/miniconda_envs/kisspl/libexec/kissplice/ks_debruijn4

Here is the log output for the second script (I've removed path and file info with xxx):

This is KisSplice, version 2.4.0-p1 ~ The command line was:
xxxx/miniconda_envs/kisspl/bin/kissplice -t 6 -s 1 -k 51 -v -o kiss_results3 --experimental -r xxxx
Using the read files: xxxx
Results will be stored in: /scale_wlg_nobackup/filesets/nobackup/nesi00431/kiss_results3
Summary log file will be saved in: /scale_wlg_nobackup/filesets/nobackup/nesi00431/kiss_results3/kissplice_log_summary_21-47-53_17-12-2018_231354

[21:47:53 17/12/2018] --> Building de Bruijn graph...
Graph will be written in kiss_results3/graph_F22merged_trimmedQ20_R1_F22merged_trimmedQ20_R2_F24merged_trimmedQ20_R1_F24merged_trimmedQ20_R2_F25merged_trimmedQ20_R1_F25merged_trimmedQ20_R2_F26merged_trimmedQ20_R1_F26merged_trimmedQ20_R2_F29mergek51.[edges/nodes]

We can successfully run the program with 2 data files, suggesting it's a memory issue. I have a large-ish dataset of 14 data files in total, of about 600 MB each.

Any help would be greatly appreciated.

kissplice rna-seq snp • 331 views
ADD COMMENTlink modified 8 months ago by leandro.ishi.lima90 • written 9 months ago by erica10

Dear Erica,

to check if it is a memory error, could you please paste the output of the following commands.

To retrieve the amount of memory in your machine:

free -h

To get the maximum resident set size used by a command:

/usr/bin/time --verbose <your_command>

Where <your_command> is your KisSplice command. After <your_command> finishes executing, time --verbose will output some log information that might help us debug. As a concrete example, we need the information that looks like this:

Command being timed: "ls"
User time (seconds): 0.00
System time (seconds): 0.00
Percent of CPU this job got: 0%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.00
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 2540
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 119
Voluntary context switches: 1
Involuntary context switches: 4
Swaps: 0
File system inputs: 0
File system outputs: 0
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
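
If the report is saved to a file or mixed in with other output, the peak-memory line can be pulled out with a small helper. This is only a sketch: `peak_rss` is a hypothetical name, not part of KisSplice or GNU time, and it assumes the report uses the `Maximum resident set size (kbytes)` label shown above.

```shell
# Hypothetical helper: read a `time --verbose` report on stdin and
# print the peak resident set size in kB and in MiB.
peak_rss() {
    awk -F': ' '/Maximum resident set size \(kbytes\)/ {
        # $2 is the value after the "label: " separator, in kB
        printf "%s kB = %.2f MiB\n", $2, $2 / 1024
    }'
}

# Example using the value from the sample report above
printf 'Maximum resident set size (kbytes): 2540\n' | peak_rss
# prints: 2540 kB = 2.48 MiB
```

For a real run, the report can be fed in the same way, for example `peak_rss < time_report.txt`, where `time_report.txt` is a hypothetical file holding the output of `/usr/bin/time --verbose`.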

Thanks for the report!

Kind regards.

ADD REPLYlink written 9 months ago by leandro.ishi.lima90

Many thanks for the response.

We're running this on a national computing cluster, with fairly substantial memory available. The most we've tried allocating thus far was 105 GB of memory and 36 cores. This ran but stopped without reporting an obvious error. However, I see this error buried in the output:

    Looping through branching kmer n° 276197100 / 276236278
    Looping through branching kme**Problem with /nesi/nobackup/nesi00431/miniconda_envs/kisspl/libexec/kissplice/ks_debruijn4**r n° 276197400 / 276236278
    Looping through branching kmer n° 276197700 / 276236278
    Looping through branching kmer n° 276198000 / 276236278
    -------------------nodes construction time Wallclock  1370.73 s

Other runs have exited with this same error, but without proceeding into the analysis.
Here's the time report for that run:
    Command exited with non-zero status 247
    User time (seconds): 12789.24
    System time (seconds): 535.01
    Percent of CPU this job got: 100%
    Elapsed (wall clock) time (h:mm:ss or m:ss): 3:41:58
    Average shared text size (kbytes): 0
    Average unshared data size (kbytes): 0
    Average stack size (kbytes): 0
    Average total size (kbytes): 0
    Maximum resident set size (kbytes): 49623228
    Average resident set size (kbytes): 0
    Major (requiring I/O) page faults: 0
    Minor (reclaiming a frame) page faults: 111321692
    Voluntary context switches: 1853865
    Involuntary context switches: 13397
    Swaps: 0
    File system inputs: 0
    File system outputs: 0
    Socket messages sent: 0
    Socket messages received: 0
    Signals delivered: 0
    Page size (bytes): 4096
    Exit status: 247

Any insights into the problem would be greatly appreciated.

ADD REPLYlink modified 8 months ago by genomax71k • written 9 months ago by erica10

Thanks again for the quick response. Happy to say we were able to run KisSplice to completion by installing it on a different server. It seems likely that something in the configuration of the previous server interfered with memory allocation at some stage, despite our requesting sufficient memory in the script.

Thanks again.

ADD REPLYlink written 8 months ago by erica10

Hello,

happy to hear it worked out for you!

Kind regards.

ADD REPLYlink written 8 months ago by leandro.ishi.lima90
8 months ago by
leandro.ishi.lima90 wrote:

Dear Erica,

The log shows that KisSplice failed while building the de Bruijn graph. Such errors are usually memory-related, but since this is the first time we have seen this specific error, we can't be sure. Your output also shows that KisSplice used 47.3 GB of RAM (a maximum resident set size of 49623228 kB) before crashing. A failure with exit code 247 is new to us, and it is hard to debug because we can't reproduce the issue. Are you using some kind of container or environment? Although the cluster has fairly substantial memory available, is it possible that KisSplice is exceeding the maximum memory allowed for your process? I found a user who got the same exit code when running GATK in a Docker container (https://gatkforums.broadinstitute.org/gatk/discussion/10420/getbayesianhetcoverage-exits-with-code-247), so I am wondering if this could also be your problem. Would it be possible for you to run KisSplice with ~150 GB of available RAM to check whether this is the issue?
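
One way to see whether a per-process or per-container limit is in play is to print the limits visible from inside the job. The following is a minimal sketch, assuming a Linux system; `show_memory_limits` is a hypothetical helper name, and the cgroup paths are the common default mount points, which may differ on your cluster (a job can also sit in a nested cgroup, listed in /proc/self/cgroup).

```shell
# Hypothetical helper: print the memory limits visible to the current shell.
show_memory_limits() {
    # Per-process virtual memory limit (kB, or "unlimited")
    echo "ulimit -v: $(ulimit -v)"

    # cgroup limits, assuming default mount points:
    # cgroup v2 uses memory.max, cgroup v1 uses memory.limit_in_bytes
    for f in /sys/fs/cgroup/memory.max \
             /sys/fs/cgroup/memory/memory.limit_in_bytes; do
        if [ -r "$f" ]; then
            echo "$f: $(cat "$f")"
        fi
    done
}

show_memory_limits
```

`ulimit -v` reports kilobytes and the cgroup files report bytes (or `max` when no limit is set under cgroup v2), so either value can be compared with the 49623228 kB peak from the time report.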

Kind regards.

ADD COMMENTlink written 8 months ago by leandro.ishi.lima90