Question: VCF: How To Indicate Reference Allele Not Found?
5.6 years ago
Nick Stoler
Penn State
Nick Stoler wrote:

In VCF, the ALT column is supposed to be where you show what variants you found. Or, as implied in the spec and all the example files I've seen, the ALT column shows all the non-reference variants you found.

But if I'm sequencing a sample and want to use VCF to store its variant calls, I'd like to be able to explicitly say "I did find the REF allele here" and "I did not find the REF allele there."

1) Is it valid to put the REF allele in the ALT column, and
2) regardless of validity, do common tools explode if they encounter this situation?

5.6 years ago
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum wrote:

The VCF spec says:

If there are no alternative alleles, then the missing value ('.') should be used
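
To make the quoted rule concrete, here is a minimal Python sketch of how a reader of the file might interpret the ALT column, treating '.' as "no alternative alleles". The function name and the two example records are made up for illustration; they do not come from any tool or from the spec.

```python
def alt_alleles(vcf_line):
    """Return the ALT alleles of one VCF data line as a list ('.' means none)."""
    fields = vcf_line.rstrip("\n").split("\t")
    alt = fields[4]  # fixed columns: CHROM POS ID REF ALT QUAL FILTER INFO
    return [] if alt == "." else alt.split(",")


# Made-up records, for illustration only.
no_alt = "chr1\t100\t.\tA\t.\t50\tPASS\t."
one_alt = "chr1\t200\t.\tG\tT\t50\tPASS\t."

print(alt_alleles(no_alt))
print(alt_alleles(one_alt))
```

Note that an empty ALT list only says that no non-reference allele was called; by itself it does not state whether the REF allele was actually observed.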

I noted that, though I wasn't sure how definitively that says "no REF alleles."

So then what is the appropriate format for storing this information? It seems like quite an Achilles' heel to not be able to list the alleles in your sample unambiguously. (I could use a hack like an INFO column tag or something, but that seems... hacky.)

Also, any thoughts on tl;dr #2?

5.5 years ago
United States
Adam wrote:

Why not use AF=1.0 in the INFO field? Or define a new INFO flag?
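
A rough sketch of the "new INFO flag" idea, in Python. The flag name NOREF and its description are hypothetical (nothing in the spec or in any tool defines them); the only fixed part is the general shape of a `##INFO` header line for a flag, which uses `Number=0` and `Type=Flag`.

```python
# Hypothetical flag ID and description, chosen only for this example.
FLAG_ID = "NOREF"
HEADER_LINE = (
    '##INFO=<ID=NOREF,Number=0,Type=Flag,'
    'Description="Reference allele not observed in this sample">'
)


def add_info_flag(vcf_line, flag=FLAG_ID):
    """Append a flag to the INFO column (8th column) of one VCF data line."""
    fields = vcf_line.rstrip("\n").split("\t")
    info = fields[7]
    # '.' marks an empty INFO column, so the flag replaces it.
    fields[7] = flag if info in (".", "") else info + ";" + flag
    return "\t".join(fields)


# Made-up record, for illustration only.
record = "chr1\t100\t.\tA\tG\t50\tPASS\tAF=1.0"
print(HEADER_LINE)
print(add_info_flag(record))
```

The header line would have to be added to the file's meta-information section for the flag to be declared; other tools will not know what the flag means unless they are told.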

that would make sense...

If you are working with bacteria or any haploid organism, you are looking at fixed alleles, I guess.

If you are looking at diploid or polyploid beasts, then you want homozygous SNPs? AF=1 works... so does RO=0 (the reference allele observation count, as reported by FreeBayes), which means "I did not find the REF allele here."
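
The suggestions in this thread can be combined into a small check. The Python sketch below is only one possible reading, under these assumptions: the file has at most one sample column, RO and AF (if present) are in the INFO column, and a sample genotype (GT) containing allele index 0 counts as "REF found". The function names and example records are made up for illustration.

```python
import re


def parse_info(info):
    """Parse a VCF INFO string into a dict; bare flags map to True."""
    parsed = {}
    if info == ".":
        return parsed
    for item in info.split(";"):
        key, sep, value = item.partition("=")
        parsed[key] = value if sep else True
    return parsed


def ref_allele_found(vcf_line):
    """Return True/False if the record indicates REF was seen, else None."""
    fields = vcf_line.rstrip("\n").split("\t")
    info = parse_info(fields[7])

    # 1) Genotype of the first sample, if any: index 0 is the REF allele.
    if len(fields) > 9:
        sample = dict(zip(fields[8].split(":"), fields[9].split(":")))
        gt = sample.get("GT")
        if gt:
            alleles = re.split(r"[/|]", gt)
            if "." not in alleles:
                return "0" in alleles

    # 2) RO = reference allele observation count (a FreeBayes-style tag).
    if "RO" in info:
        return int(info["RO"]) > 0

    # 3) AF = ALT allele frequencies; if they sum to 1, nothing is left for REF.
    if "AF" in info:
        return sum(float(x) for x in info["AF"].split(",")) < 1.0

    return None  # the record does not say either way


# Made-up records, for illustration only.
hom_alt = "chr1\t100\t.\tA\tG\t50\tPASS\tAF=1.0\tGT\t1/1"
het = "chr1\t200\t.\tA\tG\t50\tPASS\tAF=0.5\tGT\t0/1"
ro_zero = "chr1\t300\t.\tC\tT\t50\tPASS\tRO=0;AO=12"

for line in (hom_alt, het, ro_zero):
    print(ref_allele_found(line))
```

The order of the checks is a design choice, not a convention: a genotype call is a caller's conclusion while RO is a raw count, so someone who cares about observations rather than calls may prefer to test RO first.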

