2.6 years ago
Chenfei Zheng
Hi, I was calling SNPs with the bcftools pipeline below:
bcftools mpileup --threads 20 -a AD,DP,SP -Ou -f ${REF} ${BAM}/*.bam | bcftools call --threads 20 -f GQ,GP -mO z -o ./output.vcf.gz
But the option --threads didn't work, and this calling has taken about two weeks to output a 71G vcf file.
According to the bcftools documentation, --threads only applies to compression of the output file when the output type is b or z. I've changed the output type to -Ob, but multiple threads still didn't work either.
So, how can I speed up this calling process?
Thank you
Parallelize by regions using --regions-file.
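A minimal sketch of that idea, splitting by contig with -r instead of writing one --regions-file per chunk. It assumes ${REF}.fai exists (from samtools faidx), that the BAMs are indexed, and that contig names are safe to use in file names. JOBS, out/ and the helper function names are illustrative, not part of bcftools.

```shell
# Assumed layout: ${REF}.fai exists, BAMs in ${BAM}/ are indexed.
# JOBS, out/ and the function names are illustrative choices.

# Contig names are the first column of the FASTA index.
list_contigs() {
    cut -f1 "$1"
}

# Per-contig output paths, in the same order as the index.
list_outputs() {
    list_contigs "$1" | awk '{print "out/" $1 ".vcf.gz"}'
}

run_all() {
    mkdir -p out
    export REF BAM
    # One mpileup|call pipeline per contig, at most JOBS at a time.
    list_contigs "${REF}.fai" |
        xargs -P "${JOBS:-20}" -I{} sh -c '
            bcftools mpileup -r "$1" -a AD,DP,SP -Ou -f "$REF" "$BAM"/*.bam |
            bcftools call -f GQ,GP -m -Oz -o "out/$1.vcf.gz"' _ {}
    # Stitch the pieces back together in reference order.
    list_outputs "${REF}.fai" > out/files.txt
    bcftools concat -f out/files.txt -Oz -o output.vcf.gz
}

# Run only when both variables are set, so loading the functions alone does nothing.
if [ -n "${REF:-}" ] && [ -n "${BAM:-}" ]; then
    run_all
fi
```

Each contig is an independent process here, so the speed-up comes from running jobs side by side rather than from --threads. If a few contigs dominate the genome, they could be split further into windows (chr:start-end) in the same way.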
Thank you for the advice. Can I parallelize by individual and then merge the vcf.gz files together?
it's a job for gatk HaplotypeCaller in GVCF mode.
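A rough sketch of what that per-individual workflow might look like with GATK4. It assumes the reference has .fai and .dict files, that each BAM is indexed and carries read groups, and that one BAM corresponds to one sample. The gvcf/ directory, the file naming and the function names are illustrative.

```shell
# Map a BAM path to its per-sample GVCF path (naming is an illustrative choice).
gvcf_path() {
    echo "gvcf/$(basename "$1" .bam).g.vcf.gz"
}

run_gvcf() {
    mkdir -p gvcf
    # Step 1: per-sample calling in GVCF mode. Samples are independent,
    # so on a cluster each iteration could be submitted as its own job.
    for bam in "$BAM"/*.bam; do
        gatk HaplotypeCaller -R "$REF" -I "$bam" -O "$(gvcf_path "$bam")" -ERC GVCF
    done
    # Step 2: collect one -V argument per sample and combine the GVCFs.
    set --
    for bam in "$BAM"/*.bam; do
        set -- "$@" -V "$(gvcf_path "$bam")"
    done
    gatk CombineGVCFs -R "$REF" "$@" -O gvcf/cohort.g.vcf.gz
    # Step 3: joint genotyping across all samples.
    gatk GenotypeGVCFs -R "$REF" -V gvcf/cohort.g.vcf.gz -O output.vcf.gz
}

# Run only when both variables are set, so loading the functions alone does nothing.
if [ -n "${REF:-}" ] && [ -n "${BAM:-}" ]; then
    run_gvcf
fi
```

The loop in step 1 runs serially as written; the point is only that each sample's GVCF can be produced separately and joint-genotyped afterwards.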
The --threads argument to bcftools call is useless IMO - it only sets the number of threads used for compression when the output is set to be compressed; it is not used across the board.