How to speed up SNP calling using bcftools
2.7 years ago

Hi, I am calling SNPs with the bcftools pipeline below:

bcftools mpileup --threads 20 -a AD,DP,SP -Ou -f ${REF} ${BAM}/*.bam | bcftools call --threads 20 -f GQ,GP -mO z -o ./output.vcf.gz

But the --threads option had no effect, and the calling took about two weeks to produce a 71G VCF file.

According to the bcftools documentation, --threads only applies to compression of the output file when the output type is b or z. I changed the output type to -Ob, but multithreading still had no effect.

So, how can I speed up this calling process?

Thank you


So, how can I speed up this calling process?

Parallelize by region using --regions-file: run one mpileup | call job per region, then concatenate the per-region results.
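A minimal sketch of what region-level parallelism could look like, assuming the reference has a faidx index (`ref.fa.fai`) and that contig names are safe to use as file names. For simplicity it passes one contig per job with `-r` rather than a `--regions-file`; a per-chunk regions file given via `-R` would slot in the same way. `REF`, `BAM_DIR`, `OUT_DIR`, `JOBS` and the function names are placeholders, not anything from the original post, and the `bcftools call -f GQ,GP` option is copied from the question as-is.

```shell
#!/usr/bin/env bash
# Sketch only: one calling job per contig, run in parallel, then concatenated.
# REF, BAM_DIR, OUT_DIR and JOBS are hypothetical placeholders.
REF=${REF:-ref.fa}
BAM_DIR=${BAM_DIR:-bams}
OUT_DIR=${OUT_DIR:-calls}
JOBS=${JOBS:-20}

# Contig names, one per line, from a faidx index (column 1 of ref.fa.fai).
list_contigs() {
    cut -f1 "$1"
}

# Per-contig output paths, in reference order, for bcftools concat.
list_vcfs() {
    awk -F'\t' -v d="$OUT_DIR" '{ print d "/" $1 ".vcf.gz" }' "$1"
}

# Call variants for one region; same options as the command in the question.
call_region() {
    bcftools mpileup -r "$1" -a AD,DP,SP -Ou -f "$REF" "$BAM_DIR"/*.bam |
        bcftools call -f GQ,GP -m -Oz -o "$OUT_DIR/$1.vcf.gz"
}

run_all() {
    mkdir -p "$OUT_DIR"
    export REF BAM_DIR OUT_DIR
    export -f call_region
    # Run up to JOBS regions at the same time.
    list_contigs "$REF.fai" | xargs -P "$JOBS" -I{} bash -c 'call_region "$1"' _ {}
    # Stitch the per-contig files back together in reference order.
    list_vcfs "$REF.fai" > "$OUT_DIR/files.txt"
    bcftools concat -f "$OUT_DIR/files.txt" -Oz -o output.vcf.gz
}

# run_all   # uncomment to launch the jobs
```

Splitting by whole contig is the simplest choice, but the longest chromosome then bounds the total run time; smaller fixed-size windows would balance the load better at the cost of more files to concatenate.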


Thank you for the advice. Can I parallelize by individual and then merge the vcf.gz files together?


Can I parallelize by individual

That is a job for GATK HaplotypeCaller in GVCF mode.
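A rough sketch of the per-individual route, assuming GATK4 is installed as `gatk`, each BAM holds a single sample with read groups set, and the reference has its `.fai` and `.dict` files. `REF`, `BAM_DIR`, `GVCF_DIR` and the function names are placeholders for illustration only.

```shell
#!/usr/bin/env bash
# Sketch only: per-sample GVCFs, then joint genotyping.
# REF, BAM_DIR and GVCF_DIR are hypothetical placeholders.
REF=${REF:-ref.fa}
BAM_DIR=${BAM_DIR:-bams}
GVCF_DIR=${GVCF_DIR:-gvcfs}

# Sample label taken from the BAM file name (an assumption about naming).
sample_name() {
    basename "$1" .bam
}

# One HaplotypeCaller run per BAM; these runs are independent of each other,
# so they can be submitted as separate jobs.
call_sample() {
    gatk HaplotypeCaller -R "$REF" -I "$1" -ERC GVCF \
        -O "$GVCF_DIR/$(sample_name "$1").g.vcf.gz"
}

# Combine the per-sample GVCFs and genotype them jointly.
joint_genotype() {
    local args=""
    for g in "$GVCF_DIR"/*.g.vcf.gz; do
        args="$args --variant $g"
    done
    # $args is left unquoted on purpose so it splits into separate arguments
    # (this assumes the paths contain no spaces).
    gatk CombineGVCFs -R "$REF" $args -O cohort.g.vcf.gz
    gatk GenotypeGVCFs -R "$REF" -V cohort.g.vcf.gz -O output.vcf.gz
}
```

Usage would be something like running `call_sample "$BAM_DIR/sampleA.bam"` once per BAM (in parallel), followed by a single `joint_genotype` call once all GVCFs exist.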


The --threads argument to bcftools call is useless IMO: it only sets the number of threads used to compress the output, so it has an effect only when the output is compressed, and it does not parallelize the calling itself.
