Hello,
I am trying to run a variant call using bcftools on a yeast (haploid) DNAseq .bam file. Using the following command I am able to generate a list of variants, but scanning through the .bam file on IGV, I can see variants that appear to be of high quality (100+ read coverage, good cigar scores on individual reads in the read track, etc.) that are not being called using this command.
bcftools mpileup -Ou -f sacCer3.fa -q 10 -d 250 myfile.bam | bcftools call -Ou -mv --ploidy 1 | bcftools filter -i 'DP>=50' | bcftools norm -f sacCer3.fa -Oz -o myfile.vcf.gz
Reading other posts, I have loosened the quality criteria (-q 10) and these variants are still not being called.
Is there a way to interrogate the .bam file to ensure that it is meeting calling criteria at a particular position for bcftools? Any other suggestions?
bcftools 1.19
Thanks, Ben
Thank you for your response. Here is the adjusted code I ran:
It appears the variant is now being called with the revised code - thank you! However, the issue I am having is that I am trying to compare two bam files to identify unique variants. When I run the following command to compare variants:
many of the sites on this list appear to be calling differences where there don't appear to be any (as viewed in IGV). It seems this is because one of the vcf files will generate two calls. As an example, for position chrI:1469, calling file 1:
whereas calling file 2:
Given many more reads support the reference (DP4=99,71,0,1), why does bcftools generate this potential variant? Is there a way to screen these lower quality calls out before running my comparison?
Thank you
this is another question.
you have a bias with the forward/reverse strand