Question: bcftools call transforms the mpileup -3aag variant into a TAAGAAGA>TAAGA variant
3.4 years ago
Michi940 wrote:

Hi, from a previous run of my sample, I know that at the position in question there is a TAAG>T variant (deletion of AAG), which is also clearly visible in IGV. 

A samtools mpileup confirms this:

1    225685621    T    10    ,,,-3aag,-3aag,-3aag,-3aag,,-3aag,,-3aag    CFFG7GGWGG    34    <<<<<<<>>><<<<<><>><>>>><<><<<>>><    CFHHJJJHFIIEJJGJJJJJJIJJJFCFFEJJHD
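For reference, the indel tokens in the base column above can be tallied with a short script. This is only a minimal Python sketch: the function name `count_pileup_indels` is hypothetical, and it handles just the `+N<seq>` / `-N<seq>` tokens and the `^` read-start marker, not every detail of the pileup format.

```python
import re

def count_pileup_indels(bases: str) -> dict:
    """Tally indel tokens such as '-3aag' or '+2AT' in an mpileup base column."""
    counts = {}
    i = 0
    while i < len(bases):
        ch = bases[i]
        if ch == "^":
            # '^' marks a read start and is followed by one mapping-quality character
            i += 2
            continue
        if ch in "+-":
            m = re.match(r"\d+", bases[i + 1:])
            if m is None:
                # not an indel token after all; skip the character
                i += 1
                continue
            n = int(m.group())
            start = i + 1 + len(m.group())
            seq = bases[start:start + n]
            # upper-case the sequence so both strands are counted together
            key = f"{ch}{n}{seq.upper()}"
            counts[key] = counts.get(key, 0) + 1
            i = start + n
            continue
        i += 1
    return counts

# Base column of the first sample in the pileup line above
bases = ",,,-3aag,-3aag,-3aag,-3aag,,-3aag,,-3aag"
print(count_pileup_indels(bases))  # {'-3AAG': 6}
```

So 6 of the 10 reads in that column carry the 3-base deletion of AAG following the T.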

But here is my issue: bcftools call (or the bcftools view command) turns this variant into the following record, which seems to disagree with the outputs above:

1    225685621    .    TAAGAAGA    TAAGA    174    .    INDEL;IDV=7;IMF=0.636364;DP=11;VDB=0.0633369;SGB=-0.616816;MQ0F=0;ICB=1;HOB=0.5;AC=1;AN=2;DP4=0,2,0,6;MQ=60    GT:PL:DP:DV:DP4    0/1:208,0,87:8:6:0,2,0,6


Thus, I'd like to understand why this is happening, and how I could prevent that behaviour. Is it possible that this is a bug?

Thanks for any help,



I was able to solve the problem by normalizing the calls, piping the bcftools call output to:

bcftools norm -f genome.fa -

But I still do not understand the representation given above.




written 3.4 years ago by Michi940
3.4 years ago
Devon Ryan
Freiburg, Germany
Devon Ryan wrote:

The outputs are in complete agreement: they both say that AAG is deleted, they just represent that in different ways.

written 3.4 years ago by Devon Ryan

Ha, I was really hoping that was the answer, but I cannot see it. 

If you remove TAAGA from TAAGAAGA, the difference is AGA, ergo the deletion. I assume my argument is wrong, so could you please elaborate? Thank you.

written 3.4 years ago by Michi940

TAAGA isn't what's removed; it's the result of the change (i.e., we start with TAAGAAGA, something happens, and we then see TAAGA). Deleting AAG after the leading T of TAAGAAGA leaves exactly TAAGA. VCF files represent the states observed (the REF and ALT alleles), not a reference followed by the change.

written 3.4 years ago by Devon Ryan
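
To make the equivalence discussed in this thread concrete, here is a minimal Python sketch. The helper names `apply_deletion` and `trim_alleles` are hypothetical, and the trimming shown is only the allele-trimming part of normalization: it assumes no left-alignment against the reference genome is needed, which `bcftools norm -f genome.fa` would additionally perform.

```python
def apply_deletion(ref_context: str, n_deleted: int) -> str:
    """Apply an mpileup-style '-N' deletion to a reference context.

    ref_context starts at the anchor base (the pileup position); the
    n_deleted bases immediately after the anchor are removed.
    """
    return ref_context[0] + ref_context[1 + n_deleted:]


def trim_alleles(pos: int, ref: str, alt: str):
    """Trim shared bases from REF/ALT, keeping at least one base in each."""
    # Trim the shared suffix first
    while len(ref) > 1 and len(alt) > 1 and ref[-1] == alt[-1]:
        ref, alt = ref[:-1], alt[:-1]
    # Then trim the shared prefix, shifting the position accordingly
    while len(ref) > 1 and len(alt) > 1 and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        pos += 1
    return pos, ref, alt


# The REF allele from the VCF record doubles as the reference context here
ref, alt = "TAAGAAGA", "TAAGA"

# The pileup's '-3aag' applied to that context yields the VCF ALT allele
print(apply_deletion(ref, 3) == alt)         # True

# Trimming the padded record recovers the compact representation
print(trim_alleles(225685621, ref, alt))     # (225685621, 'TAAG', 'T')
```

Both the padded record (TAAGAAGA>TAAGA) and the compact one (TAAG>T) therefore describe the same 3-base deletion at the same position; they differ only in how much shared trailing sequence is kept.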
