Variant calling lost variant
0
0
Entering edit mode
18 months ago

Hi everyone! I'm carrying out variant calling using bcftools. First, I created .mpileup file using bcftools mpileup, then run bcftools calls for variant calling. In some positions .mpileup file contains information about both alleles of diploid heterozygous (e.g. A,T while reference only T), but when I run bcftools calls this information erased and only one allele remained.

Also, I used IGV-browser, it showed that it is heterozygous, but final .vcf file does not contain both alleles, only one of them. See the attachments. Who knows why it is happening, I tried to use most arguments of bcftools call, but the result is the same. By the way, it only happened in a bunch of positions, most positions worked fine as expected.

Picture 1: Interested position circled in blue

Picture 2: dac.calls.vcf contains only G, but must contain both A and G

bcftools samtools variant-calling alignment • 1.1k views
ADD COMMENT
0
Entering edit mode

you want to have a VCF with ALT alleles A and T ? but T looks like a small sequencing error to me....

ADD REPLY
0
Entering edit mode

no, I want to have A and G (gray, not highlighted). REF is G and here I want to have both A, G.

ADD REPLY
0
Entering edit mode

I still don't understand. The G is the REF allele. That's the 4th column of your VCF. It cannot be in the ALT column.

ADD REPLY
0
Entering edit mode

This is a weird task, I want to write both A and G in this column, and bcftools allows you to do this, just not to add -v (variant-only) flag. In most cases it worked fine, but some cases like this don't show both alleles in ALT column. May be something in INFO column, I can not understand

ADD REPLY
1
Entering edit mode

you can't have G in the ALT column if G is the REF allele. Your problem sounds like a XY problem: https://xyproblem.info/

ADD REPLY
0
Entering edit mode

Ok, thank you! Now I see I was mistaken, I thought about it because it is even called VARIANT calling, but G here is not a variant. May be you know how to collect both G and A? I think it is not a variant calling at all, but I think manipulating .mpileup we can get it?

ADD REPLY
1
Entering edit mode

You need to parse the VCF file and do the combination yourself. Pay attention to the REF and ALT (not "ALT" not "VARIANT"!) columns, along with Genotype GT.

Here we'll have REF=G, ALT=A (alternate allele) and GT=0/1 indicating one allele is REF and the other is ALT. If it was homozygous it'd be GT=1/1. If we had REF=G and our alingment showing A/T as the two alleles then ALT=A,T and GT=1/2.

While it's maybe not the best format on the planet, to put it mildly, VCF is a standard and bcftools is simply following that.

ADD REPLY

Login before adding your answer.

Traffic: 2544 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6