How Are Indels Coded in VCF Files?
11.8 years ago
Charlotta ▴ 10

Hello,

I have a question about how indels are coded in VCF files. For instance, I have the following file:

CHROM  POS       ID  REF  ALT  QUAL  FILTER
8      18078835  .   TTA  T    46    PASS
8      18078836  .   TA   T    138   PASS

How should I interpret this, given that the two positions are adjacent (835 and 836)?

Also, how should I interpret QUAL? (I thought 100 was the maximum value...)

Many thanks

Christelle

indel vcf • 5.1k views
11.8 years ago

The first is a deletion of TA and the second is a deletion of A (chr8:g.18078836_18078837del and chr8:g.18078837del). Possible explanations are that they are heterozygous and on different alleles, or that they come from different samples (I cannot tell from the data you show). One of them could also be due to bad alignment.
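To make the coordinates concrete, here is a minimal Python sketch of how the deleted bases and their positions follow from POS, REF and ALT. It assumes the usual left-anchored representation, where ALT is a prefix of REF; the function names `describe_deletion` and `hgvs_like` are made up for illustration and do not come from any VCF library.

```python
def describe_deletion(pos, ref, alt):
    """Return (deleted_bases, first_deleted_pos, last_deleted_pos).

    Assumes a simple left-anchored deletion: ALT is a prefix of REF,
    and POS is the 1-based position of the first REF base.
    """
    if not (len(ref) > len(alt) and ref.startswith(alt)):
        raise ValueError("not a simple left-anchored deletion")
    deleted = ref[len(alt):]        # bases present in REF but absent in ALT
    start = pos + len(alt)          # first deleted position
    end = pos + len(ref) - 1        # last deleted position
    return deleted, start, end


def hgvs_like(chrom, start, end):
    """Format a deletion as an HGVS-style genomic description (illustrative only)."""
    if start == end:
        return f"chr{chrom}:g.{start}del"
    return f"chr{chrom}:g.{start}_{end}del"


# The two records from the question
records = [("8", 18078835, "TTA", "T"), ("8", 18078836, "TA", "T")]
for chrom, pos, ref, alt in records:
    deleted, start, end = describe_deletion(pos, ref, alt)
    print(deleted, hgvs_like(chrom, start, end))
```

Both records delete the A at 18078837, and the first one also removes the T at 18078836, so the two calls overlap.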

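QUAL is Phred-scaled: QUAL = -10 * log10 P(call in ALT is wrong). It has no upper bound at 100, so it can very well be above 100. For example, QUAL = 138 corresponds to an error probability of 10^-13.8.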
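As a quick check of the Phred formula, here is a small Python sketch that converts between QUAL and the error probability. The helper names are hypothetical, chosen just for this example.

```python
import math


def qual_to_error_prob(qual):
    """Phred-scaled QUAL -> probability that the ALT call is wrong."""
    return 10 ** (-qual / 10)


def error_prob_to_qual(p):
    """Probability that the ALT call is wrong -> Phred-scaled QUAL."""
    return -10 * math.log10(p)


# The two QUAL values from the question
for qual in (46, 138):
    print(qual, f"{qual_to_error_prob(qual):.1e}")
```

QUAL 46 gives an error probability of roughly 2.5e-05 and QUAL 138 roughly 1.6e-14, so the scale simply keeps growing as the caller becomes more confident.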


Thank you for your answer. In fact, these are the genotypes of a single individual:

8  18078835  .  TTA  T  46   PASS  0|0
8  18078836  .  TA   T  138  PASS  0|0

Does that mean the individual is TTA | TTA (and position 18078837 is A)?

But then is this individual, with the following genotypes at the same two sites,

1|0

1|0

also TTA | TTA?

Christelle

11.8 years ago

I would examine the reads themselves, for example in IGV. There is probably some kind of indel there, but it might not be described accurately in the VCF. Some of the reads may be misaligned, causing your software to think that they indicate the presence of another allele. Likely the higher-quality variant is real and the other is not.
