VCF: How To Indicate Reference Allele Not Found?
10.4 years ago
Nick Stoler ▴ 70

In VCF, the ALT column is supposed to be where you show which variants you found. Or, as implied by the spec and every example file I've seen, the ALT column lists all the non-reference alleles you found.

But if I'm sequencing a sample and want to use VCF to store its variant calls, I'd like to be able to explicitly say "I did find the REF allele here" and "I did not find the REF allele there."

tl;dr:
1) Is it valid to put the REF allele in the ALT column, and
2) regardless of validity, do common tools explode if they encounter this situation?

vcf variant-calling • 3.8k views
10.4 years ago

The VCF spec says: http://samtools.github.io/hts-specs/VCFv4.2.pdf

If there are no alternative alleles, then the missing value ('.') should be used
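
To make the quoted rule concrete, here is a minimal Python sketch that reads the ALT column of one tab-separated VCF data line and treats '.' as "no alternate alleles". The function name `alt_alleles` and both records are made up for illustration; they are not taken from the spec or from any real file.

```python
def alt_alleles(vcf_line):
    """Return the ALT alleles of one VCF data line, or [] if ALT is '.'."""
    fields = vcf_line.rstrip("\n").split("\t")
    # Fixed columns: CHROM POS ID REF ALT QUAL FILTER INFO
    alt = fields[4]
    return [] if alt == "." else alt.split(",")


# Hypothetical records, invented for this example.
ref_only = "chr1\t100\t.\tA\t.\t50\tPASS\tDP=30"
snp = "chr1\t200\t.\tG\tT\t50\tPASS\tDP=30"

print(alt_alleles(ref_only))
print(alt_alleles(snp))
```

Note that a '.' in ALT only says that no alternate allele is reported at the site; it does not by itself say whether the REF allele was observed in the sample.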


I noted that, though I wasn't sure how definitively that says "no REF alleles."

So then what is the appropriate format for storing this information? It seems like quite an Achilles' heel not to be able to list the alleles in your sample unambiguously. (I could use a hack like an INFO column tag or something, but that seems... hacky.)

Also, any thoughts on tl;dr #2?

10.2 years ago
Adam ★ 1.0k

Why not use AF=1.0 in the INFO field? Or define a new INFO flag?
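
A rough Python sketch of both options, assuming a custom flag named `NOREF` (a hypothetical ID chosen for this example, not a reserved or standard tag). The header line follows the `##INFO=<ID=...,Number=...,Type=...,Description="...">` layout from the spec, where a `Flag` has `Number=0`; the parser `parse_info` is likewise just an illustration.

```python
# Header line a file would need in order to declare the hypothetical NOREF flag.
NOREF_HEADER = (
    '##INFO=<ID=NOREF,Number=0,Type=Flag,'
    'Description="Reference allele not observed in this sample">'
)


def parse_info(info):
    """Parse an INFO column into a dict; flags (no '=') map to True."""
    parsed = {}
    if info == ".":
        return parsed
    for item in info.split(";"):
        key, sep, value = item.partition("=")
        parsed[key] = value if sep else True
    return parsed


# Hypothetical INFO column carrying both AF=1.0 and the custom flag.
info = parse_info("DP=30;AF=1.0;NOREF")
print(float(info["AF"]) == 1.0)
print(info.get("NOREF", False))
```

Whether downstream tools pay attention to a custom flag is a separate question; most will simply carry an unknown INFO tag through without interpreting it.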


That would make sense.

If you are working with bacteria or any other haploid organism, you are looking at fixed alleles, I guess.

If you are looking at diploid or polyploid beasts, then you want homozygous SNPs? AF=1 works, and so does RO=0, which means "I did not find the REF allele here."
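
As a sketch of the RO check, assuming the caller writes RO as the count of reads supporting the reference allele (freebayes output uses the tag this way; other callers may not emit it at all). The function name `ref_allele_seen` and the INFO strings are invented for illustration.

```python
def ref_allele_seen(info):
    """Return True/False based on RO, or None if the record has no RO tag."""
    tags = dict(item.split("=", 1) for item in info.split(";") if "=" in item)
    if "RO" not in tags:
        return None  # nothing to go on; do not guess
    return int(tags["RO"]) > 0


# Hypothetical INFO columns.
print(ref_allele_seen("DP=30;RO=0;AO=30;AF=1"))
print(ref_allele_seen("DP=30;RO=14;AO=16;AF=0.5"))
print(ref_allele_seen("DP=30"))
```

Returning None when RO is absent keeps "REF not observed" distinct from "the caller did not report it", which is the ambiguity the original question is about.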
