Adding consequences field (INFO) when using --allow_non_variant in VEP
4.2 years ago
magnolia ▴ 20

Hi,

I'm annotating my VCF using VEP. By default, VEP's output contains only variant positions, so positions with no variation are not reported. I can keep non-variant positions by adding the --allow_non_variant flag, but I cannot get any INFO annotation for these positions. I want to have at least the HGNC gene name/symbol for those positions. Is that possible?

Thank you for your answers!

vep vcf annotation • 1.3k views

Do you mean you already have HGNC IDs in the INFO field that are being omitted by VEP or that you'd like VEP to add HGNC symbols to all positions including non-variant positions?


I would like to add HGNC symbols to all positions including non-variant positions. This is the most crucial one but if possible, whatever custom database I give to VEP, I want to see the matched INFO in every position.

4.2 years ago
Emily 23k

The INFO column is filled in with the consequences of the variant on known genes. If there is no variant, there is no consequence, so there is nothing to fill in. All --allow_non_variant does is keep the non-variant line in the VCF output.


Thank you for the explanation. I know that the whole point of the CSQ field is to report the consequences of the variant. I just need to add information to non-variant positions as well. So I'm guessing that's impossible with VEP?


Why do you need HGNC information at reference locations? The ideal workflow is to start by looking at just non-reference loci, so annotating reference loci will just bloat your VCF.


Because I also want to see what positions are covered in the VCF and which genes are in those positions.

For example, let's say I have a VCF file that contains 10 positions for BRCA1 but only one of them is a variant. Since I cannot keep every single gene's location in the genome in my mind, when I filter for BRCA1, I will get that 1 variant but I also want to see which other positions are covered for BRCA1 in the VCF.


Your VCF should not contain all-ref positions. Ideally, VCFs only contain positions that are altered in at least one sample in the VCF. How do you even have all-ref positions? Are you using a gVCF?


I could just generate synthetic data in, let's say, TSV/TXT, convert it to VCF, and then load it into VEP. That part, in my opinion, doesn't matter. Since VEP is a variant effect predictor, I guess there is no flexibility for this: I cannot get any information for a position that is not a variant.

But since you mentioned gVCF, what if I'm using that? Is there a workaround?


Not really. A gVCF has variant loci and non-variant "blocks". Your best bet is to use bcftools and a custom BED file with gene coordinates (or a regular GTF file) to do your own annotation.
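
To make the "do your own annotation" idea concrete, here is a minimal sketch in plain Python of what such a BED-based annotation does. It is not bcftools and not anything VEP provides; it only illustrates the overlap logic. The function names (load_bed, genes_at, annotate_vcf), the INFO tag name GENE, and the toy coordinates are all assumptions made up for the example, not real BRCA1 coordinates. It treats each VCF line as a single position and does not expand gVCF blocks.

```python
from collections import defaultdict


def load_bed(lines):
    """Parse BED lines (chrom, 0-based start, exclusive end, name) per chromosome."""
    genes = defaultdict(list)
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        chrom, start, end, name = line.rstrip("\n").split("\t")[:4]
        genes[chrom].append((int(start), int(end), name))
    return genes


def genes_at(genes, chrom, pos):
    """Return sorted gene names overlapping a 1-based VCF position."""
    zero_based = pos - 1  # BED is 0-based, half-open; VCF POS is 1-based
    return sorted({name for start, end, name in genes.get(chrom, [])
                   if start <= zero_based < end})


def annotate_vcf(vcf_lines, genes, tag="GENE"):
    """Append tag=<symbols> to INFO for every record, variant or not."""
    out = []
    for line in vcf_lines:
        line = line.rstrip("\n")
        if line.startswith("##"):
            out.append(line)
            continue
        if line.startswith("#CHROM"):
            # Declare the new INFO tag just before the column header line.
            out.append(f'##INFO=<ID={tag},Number=.,Type=String,'
                       'Description="Gene symbols overlapping this position">')
            out.append(line)
            continue
        fields = line.split("\t")
        hits = genes_at(genes, fields[0], int(fields[1]))
        if hits:
            entry = f"{tag}={','.join(hits)}"
            fields[7] = entry if fields[7] in (".", "") else f"{fields[7]};{entry}"
        out.append("\t".join(fields))
    return out


# Toy inputs: the interval below is hypothetical, not the real BRCA1 locus.
bed = ["chr17\t100\t200\tBRCA1"]
vcf = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr17\t150\t.\tA\t.\t.\t.\t.",        # non-variant position inside the gene
    "chr17\t180\t.\tG\tT\t.\t.\tDP=30",    # variant position inside the gene
    "chr17\t500\t.\tC\t.\t.\t.\t.",        # non-variant position outside any gene
]

annotated = annotate_vcf(vcf, load_bed(bed))
for row in annotated:
    print(row)
```

With an annotation like this in place, filtering on the gene tag would return every covered position in the gene, not only the variant ones. For real data, bcftools annotate with a tabix-indexed BED file is probably the more practical route than a hand-rolled script.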
