Question: Is Pindel slow for everyone, or should I review my command?
Macspider (2.5k), Vienna - BOKU, wrote 20 months ago:

Hi all,

It has been almost a month since I started five pindel2vcf runs to convert the output of Pindel, which itself took more than a month to finish. I am running it on whole-genome results, so the amount of data is considerable. However, I have not found it stated anywhere that the program is not meant for whole-genome analyses.

My command was:

time { pindel -f reference.fa --config-file filename.config --output-prefix whatever --chromosome ALL --number_of_threads 12 --max_range_index 4 --report_inversions --report_duplications --report_long_insertions --report_breakpoints --report_close_mapped_reads --min_inversion_size 50 &> STDERR/pindel.stderr; } &> TIME/pindel.time & disown

Apart from the 12 threads (couldn't help it), can the speed be improved by adding an option that I am not aware of? Am I wrong to use it for whole-genome analyses?

My pindel2vcf command was:

time { pindel2vcf -p pindel_output_file -r reference.fa -R name_and_version -d date -v FINAL/deletions.vcf -mc 10 -he 0.2 -ho 0.8 --both_strands_supported --min_supporting_reads 4 --max_supporting_reads 50 &> STDERR/deletions.vcf.stderr; } &> TIME/deletions.vcf.time & disown

Can this be improved as well?

modified 20 months ago • written 20 months ago by Macspider (2.5k)

How large is the Pindel output file? Does your CPU support at least 12 parallel threads? Did you have free RAM at all times? If you had processed the chromosomes individually, you could have run pindel2vcf on the output files in parallel. I've never used Pindel, so I can't really comment on your command-line arguments.
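As a rough illustration of the per-chromosome idea, the sketch below prints one pindel command for each sequence named in a faidx index, so that each can be submitted as a separate job. It is untested against a real Pindel installation; `make_pindel_cmds`, the `out_<chromosome>` prefix, the default of 4 threads, and the presence of `reference.fa.fai` are all assumptions made for the example. The option names are the ones already used in the question.

```shell
#!/bin/sh
# Sketch only: print one pindel command per sequence listed in a faidx index.
# Assumes an index such as reference.fa.fai exists (column 1 = sequence name).
# The out_<chrom> prefix and the thread count are illustrative placeholders.
make_pindel_cmds() {
    fai="$1"            # path to the .fai index
    threads="${2:-4}"   # threads per job (hypothetical default)
    cut -f1 "$fai" | while IFS= read -r chrom; do
        # Same options as in the question, but restricted to one chromosome.
        printf 'pindel -f reference.fa --config-file filename.config --output-prefix out_%s --chromosome %s --number_of_threads %s\n' \
            "$chrom" "$chrom" "$threads"
    done
}
```

For example, `make_pindel_cmds reference.fa.fai 4 > pindel_jobs.txt` writes the commands to a file, and each line can then be handed to the cluster scheduler. The per-chromosome outputs could be converted with pindel2vcf in the same one-job-per-file fashion.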

modified 20 months ago • written 20 months ago by 5heikki (7.6k)

Resources are not a problem: I'm working on a fairly large cluster with many cores and plenty of memory always available. I think the problem lies with Pindel itself.

written 20 months ago by Macspider (2.5k)

I am having the same problem with Pindel. How did you sort it out?

written 8 months ago by chevivien (70)

I pre-selected the reads that could plausibly support an event. For example, I knew I was looking for a rearrangement on one scaffold, so I pre-selected the reads mapping to that scaffold, plus the pairs with one mate on it and the other mate on a different scaffold.

However, since the authors claim that Pindel can be used for whole-genome rearrangement studies, what I did was only a workaround. You can't always know a priori what you're looking for and where.
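The pre-selection described above could look roughly like the following filter. This is a minimal sketch, not the exact procedure used: `select_scaffold_reads` is a hypothetical helper, and it assumes SAM text on standard input, where column 3 (RNAME) is the scaffold a read maps to and column 7 (RNEXT) is the scaffold of its mate.

```shell
#!/bin/sh
# Sketch only: filter SAM text on stdin, keeping header lines plus any read
# that maps to the target scaffold (column 3, RNAME) or whose mate does
# (column 7, RNEXT). Pairs lying entirely on other scaffolds are dropped.
select_scaffold_reads() {
    awk -v s="$1" 'BEGIN { FS = "\t" } /^@/ || $3 == s || $7 == s'
}
```

With samtools available, one way to use it is `samtools view -h input.bam | select_scaffold_reads scaffold_name | samtools view -b -o subset.bam -`, where `input.bam`, `scaffold_name`, and `subset.bam` are placeholders. The resulting BAM would still need to be sorted and indexed before being listed in the Pindel config file.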

written 8 months ago by Macspider (2.5k)