Question: VCF Indel line structure: what is actually reported?
Macspider (Vienna - BOKU) wrote, 2.0 years ago:

Hi folks,

I have been digging deep into variant calling this year, and I stumbled on a weird case in my VCF file. INDEL lines usually report the reference position and reference allele of the nucleotide immediately before the indel.

Example: if I have an AAG insertion at position 5 of my scaffold, I would expect a VCF line like:

Chrom       Pos Tag Ref Alt 
Scaffold    4   .   G   GAAG     ...(etc)

What happens in my file is:

Chrom       Pos Tag Ref Alt 
Scaffold    4   .   TAG TAGAAG     ...(etc)

Note that in the second case the "TA" before the "G" is also included. I checked, and these bases are part of the reference and of 95% of the reads that map there, the same reads that support the subsequent indel.

What is happening? Why does bcftools call also include those bases in the reference and alternative alleles of the indel?
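One way to check whether the extra bases change the meaning of the record is to apply both forms to the same stretch of reference and compare the results. This is only a minimal sketch: the reference string `CCCTAGCC` is made up (only the `TAG` at positions 4-6 comes from the record above), and `apply_variant` is a hypothetical helper, not part of bcftools.

```python
def apply_variant(reference, pos, ref, alt):
    """Return the reference with REF at 1-based POS replaced by ALT."""
    start = pos - 1  # VCF positions are 1-based
    # Sanity check: REF must match the reference at that position.
    if reference[start:start + len(ref)] != ref:
        raise ValueError("REF allele does not match the reference")
    return reference[:start] + alt + reference[start + len(ref):]


reference = "CCCTAGCC"  # hypothetical sequence; TAG sits at positions 4-6

# The record as written by the caller, with the extra leading bases.
padded = apply_variant(reference, 4, "TAG", "TAGAAG")
# The same insertion written with a single anchor base (the G at position 6).
anchored = apply_variant(reference, 6, "G", "GAAG")

print(padded == anchored)  # True
```

Under these assumptions both records produce the same sequence, so the leading "TA" acts as extra padding and does not describe a different variant. Note that the single-anchor form then sits at position 6 rather than 4, because the "TA" occupies positions 4 and 5.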


Is this position multiallelic (with at least a SNP)? Which version of VCF?

Reply written 2.0 years ago by Santosh Anand

VCF 4.2; the position is not multiallelic.

Reply written 2.0 years ago by Macspider
Pierre Lindenbaum (France/Nantes/Institut du Thorax - INSERM UMR1087) wrote, 2.0 years ago:

It depends on your version of samtools/bcftools, but AFAIK indels are sometimes poorly reported. There are some tools to left-align the variants:
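As a rough illustration of what such normalization does to a padded record, here is a minimal sketch that trims the bases shared by REF and ALT. `trim_alleles` is a hypothetical helper written for this example. It only looks at the two alleles, whereas true left-alignment also needs the reference sequence so the indel can be shifted through repeats.

```python
def trim_alleles(pos, ref, alt):
    """Trim bases shared by REF and ALT, keeping at least one base in each."""
    # Drop shared trailing bases first; this moves the event to the left.
    while len(ref) > 1 and len(alt) > 1 and ref[-1] == alt[-1]:
        ref, alt = ref[:-1], alt[:-1]
    # Then drop shared leading bases, advancing POS for each base removed.
    while len(ref) > 1 and len(alt) > 1 and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        pos += 1
    return pos, ref, alt


# The record from the question: Scaffold 4 . TAG TAGAAG
print(trim_alleles(4, "TAG", "TAGAAG"))  # (4, 'T', 'TAGA')
```

The trimmed record (position 4, T to TAGA) describes the same haplotype as the original one: inserting "AGA" after the T gives T-AGA-AG, which is the same TAGAAG as before. A real normalizer, such as `bcftools norm` run with a reference FASTA, additionally checks REF against the reference and extends alleles to the left when needed.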


I'm writing my own tool, hence this question, haha. Thank you for the links! Will check.

Reply written 2.0 years ago by Macspider