Why does VEP annotate non-variants (homozygous reference)?
3.2 years ago
magnolia ▴ 20

Hi,

In my VCF, even though genotype is 0/0, VEP annotates it.

If I provide the sample location in TSV format, it doesn't annotate at all.

How can I stop annotating non-variant locations when I provide VCF?

vep ensembl SNP • 1.1k views

"In my VCF, even though genotype is 0/0, VEP annotates it"

I don't think VEP looks at sample genotypes when annotating a locus.

"If I provide the sample location in TSV format, it doesn't annotate at all"

That's odd. Can you show us the VEP commands? If you're running command-line VEP, you're building these commands yourself; if you're using Web VEP, it should show you the equivalent command in the results.

"How can I stop annotating non-variant locations when I provide VCF?"

Maybe filter the VCF to exclude all HOM-REF loci before annotating with VEP?
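
Something like the rough sketch below would do that pre-filtering. It assumes an uncompressed VCF read as text lines, with GT as the first FORMAT key (which the VCF specification requires when GT is present). The function names `has_alt_allele` and `drop_hom_ref` are made up for illustration, and the two records are toy data, not the poster's file.

```python
def has_alt_allele(sample_field):
    """Return True if the sample's GT contains at least one non-reference allele."""
    gt = sample_field.split(":")[0]            # GT is assumed to be the first FORMAT key
    alleles = gt.replace("|", "/").split("/")  # treat phased and unphased calls alike
    # "0" is the reference allele; "." (or an empty string) is a missing call
    return any(a not in ("0", ".", "") for a in alleles)


def drop_hom_ref(vcf_lines):
    """Yield header lines plus records where any sample carries an ALT allele."""
    for line in vcf_lines:
        if line.startswith("#"):
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        samples = fields[9:]                   # sample columns follow FORMAT
        if any(has_alt_allele(s) for s in samples):
            yield line


# Toy input: one HOM-REF record and one heterozygous record (made-up data)
example = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tSAMPLE1\n",
    "1\t45795027\t.\tC\tG\t.\tPASS\t.\tGT\t0/0\n",
    "1\t45795100\t.\tA\tT\t.\tPASS\t.\tGT\t0/1\n",
]
kept = list(drop_hom_ref(example))
```

Here `kept` holds the two header lines and only the heterozygous record, so the filtered lines can be written out and passed to VEP instead of the original file.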


Thank you for the replies.

I understand if VEP doesn't care about genotype. I suppose it would otherwise get too complicated with multiple alternative alleles. I can also filter out HOM-REF variants before running VEP, like you said.

I ran my test again with sample data. When the data is HomZ, VEP doesn't annotate it. When it's HetZ, it does.

My VEP command:

vep -i $path2file --offline --fork 8 -o $path2out --tab --cache --dir_cache $path2cache --symbol

HomZ test sample as TSV

1   45795027    45795027    C/C

HetZ test sample as TSV

1   45795027    45795027    C/G

See how REF = ALT in the TSV you supply? In a VCF, REF would differ from ALT, and only the GT would be 0/0. I think that's why VEP annotates the VCF but not the TSV.
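
To make the difference concrete, here is a small sketch. It assumes the fourth TSV column is an allele string with the reference allele first, and that a VCF record contributes only its REF and ALT columns. Both helper names are hypothetical, and this only illustrates the reasoning in this thread, not VEP's own code.

```python
def vcf_to_allele_string(ref, alt):
    """Build a 'REF/ALT' allele string from VCF columns 4 and 5; GT is never consulted."""
    return ref + "/" + "/".join(alt.split(","))


def describes_change(allele_string):
    """Return True if any allele after the first differs from the reference allele."""
    ref, *alts = allele_string.split("/")
    return any(a != ref for a in alts)


tsv_hom = "C/C"                                # HomZ TSV line from the thread
tsv_het = "C/G"                                # HetZ TSV line from the thread
vcf_site = vcf_to_allele_string("C", "G")      # assumed VCF record: REF=C, ALT=G, any GT

results = {
    "tsv_hom": describes_change(tsv_hom),
    "tsv_het": describes_change(tsv_het),
    "vcf_site": describes_change(vcf_site),
}
```

Under these assumptions only the C/C line describes no change at the site, while the VCF record looks like C/G whatever the sample's GT is.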


But then how come VEP annotates 0/0 in the VCF I supply?

By the way, I temporarily solved my problem by using --individual. From the documentation:

"Consider only alternate alleles present in the genotypes of the specified individual(s)."

I'm still curious about VEP annotating 0/0 genotypes, though. It seems like the GT I provide doesn't matter at all.
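
As a rough illustration of what "alternate alleles present in the genotypes" means, the sketch below maps GT indices back to ALT alleles. The function name `alts_in_genotype` is hypothetical, and this is only a reading of the documentation sentence quoted above, not how VEP implements the option.

```python
def alts_in_genotype(alt, gt):
    """Return the ALT alleles that a genotype string actually carries, in first-seen order."""
    alt_alleles = alt.split(",")               # ALT column may list several alleles
    indices = gt.replace("|", "/").split("/")  # treat phased and unphased calls alike
    carried = []
    for idx in indices:
        # index 0 is REF; "." (missing) is skipped because it is not a digit
        if idx.isdigit() and int(idx) > 0:
            allele = alt_alleles[int(idx) - 1]
            if allele not in carried:
                carried.append(allele)
    return carried


hom_ref = alts_in_genotype("G", "0/0")         # no ALT allele carried
het = alts_in_genotype("G", "0/1")             # carries G
multi = alts_in_genotype("G,T", "2/2")         # carries only the second ALT
```

With a 0/0 genotype the list is empty, which matches the observation that restricting annotation to the individual's alleles leaves nothing to annotate at that site.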


AFAIK, a VCF record has different values for REF and ALT no matter what the GT is, which is not the case with your TSV. By default VEP annotates using site-level data: your TSV changes at the site level for HomRef samples (the allele string becomes C/C), whereas the VCF does not.

For example, what are the first 5 columns in your VCF for 1:45795027?
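
If it helps, a quick way to pull those columns out is sketched below. It assumes an uncompressed, tab-separated VCF read as text lines; `first_five_columns` is a made-up helper name and the record shown is toy data.

```python
def first_five_columns(vcf_lines, chrom, pos):
    """Return (CHROM, POS, ID, REF, ALT) tuples for every record at chrom:pos."""
    hits = []
    for line in vcf_lines:
        if line.startswith("#"):               # skip meta and header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if fields[0] == chrom and fields[1] == str(pos):
            hits.append(tuple(fields[:5]))
    return hits


# Toy record standing in for the real file (made-up data)
toy_vcf = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tSAMPLE1\n",
    "1\t45795027\t.\tC\tG\t.\tPASS\t.\tGT\t0/0\n",
]
site = first_five_columns(toy_vcf, "1", 45795027)
```

For a real file, the lines would come from `open(path)` on the poster's VCF rather than from the toy list.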


Thank you for the responses. Yes, VEP doesn't care about the genotypes in the VCF.

3.2 years ago
Emily 23k

The VEP doesn't read the genotype columns in the VCF. It only reads the position, ref and alt.
