Variant position showed up in mpileup but not in VCF call
1
1
Entering edit mode
3.3 years ago
leemhmark ▴ 10

Hello,

I'm doing consensus calling using bcftools. The command I'm using is:

bcftools mpileup --max-depth 0 --ff DUP -f [fa file] [deduped bam file] | bcftools call -mv -Ov -o [variants.vcf]

I looked at the intermediate mpileup output created using bcftools mpileup --max-depth 0 --ff DUP -f [fa file] [deduped bam file] > [mpileup.txt] and there's a variant found at position 1703:

gb-cds|KM069571.1|Influenza_virus|HA|A|H3|A/Singapore/C2012.669/2012|   1703    .   T   C,A,<*> 0   .   DP=40;I16=13,21,1,1,1043,39031,72,2592,1220,45432,86,3700,802,19468,30,650;
QS=0.927565,0.0362173,0.0362173,0;
VDB=0.02;
SGB=-0.453602;
RPB=0.132353;
MQB=0.220588;
MQSB=0.993571;
BQB=0.558824;
MQ0F=0  PL  0,69,255,69,255,255,102,255,255,255

But in the resulting .VCF file I don't see this variant showing up anywhere. Is it because the C and A variants are too infrequent or low quality? If so how do I interpret all the fields in the mpileup entry?

Thanks!

snp next-gen gene • 891 views
ADD COMMENT
0
Entering edit mode
3.3 years ago

Well, there are 3 variants found at the same position. What are the individual allelic depths? If needed, view the region in IGV, as this can usually make it easier to understand why a call is not made

ADD COMMENT
0
Entering edit mode

So I checked the region in IGV and the allelic depths are: Total count: 1263 A: 5 reads (0%, 1+, 4-) C: 19 reads (2%, 4+, 15-) T: 1239 reads (98%, 148+, 1091-)

Is 2% considered too low a depth for bcftools to consider it as a variant? I tried to look for this cutoff depth threshold used by bcftools and didn't have any luck finding this information.

ADD REPLY
0
Entering edit mode

Yes, that is very low - BCFtools is best at calling germline variants, i.e., at frequencies of 50% or 100%. You can still call them - take a look at the varFilter executable that can be used with BCFtools and that I use here: https://github.com/kevinblighe/ClinicalGradeDNAseq/blob/master/AnalysisMasterVersion1.sh#L354-L375

If these are cancer samples, then you should obviously use a somatic variant caller.

ADD REPLY

Login before adding your answer.

Traffic: 1945 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6