Using VEP annotation output as the input for a second VEP annotation
12 weeks ago
Arton • 0

I need to run two rounds of annotation with VEP. For specific reasons, I cannot merge the steps into one. The first annotation runs successfully. The second annotation also runs to completion, but the annotations from the first VEP run have no values in its output (the columns contain "-").

  • If I generate a VCF file instead of a TSV file, the data from the first annotation is present in the output. But then I run into a problem with the next step, which is filtering for "CANONICAL =YES" which seems to only work on tsv files. I've included my code below. I would appreciate your comments and suggestions.

raw.vcf --> VEP --> anno1.vcf --> VEP --> anno2.tsv

vep --input_file test.vcf \
     --fork 4 \
     --fasta ${GENOME_FASTA} \
     --show_ref_allele \
     --vcf \
     --pick --pick_order canonical,mane_select,mane_plus_clinical \
     --fields example1,example2,example3 \
     --assembly GRCh38 \
     --exclude_predicted \
     --force_overwrite \
     --offline \
     --everything \
     --coding_only \
     --cache --dir_cache $CACHE \
     --output_file test_anno1.vcf \
     --custom anno.vcf.gz,anno,vcf,exact,1,example1,example2,example3 

vep --input_file test_anno1.vcf \
     --fork 4 \
     --fasta ${GENOME_FASTA} \
     --show_ref_allele \
     --keep_csq \
     --vcf_info_field NEW \
     --canonical \
     --fields example1,example2,example3,example4,example5 \
     --tab \
     --assembly GRCh38 \
     --force_overwrite \
     --offline \
     --coding_only \
     --cache --dir_cache $CACHE \
     --output_file test_anno1_anno2.tsv \
     --custom anno2.vcf.gz,anno2,vcf,exact,1,example4,example5
VCF VEP Annotation • 500 views

filtering for "CANONICAL =YES" which seems to only work on tsv files

How did you come to this conclusion?


I just managed to use the filter on a VCF file and it worked. Now I have another issue: the filter on the second annotation does not remove any variants, even though the annotation is in the VCF file. I've attached the INFO field of one of my VCF records. The very last annotation is the one I need to filter on, and it's not working.

--filter "example5 < 3 or not example5"
CSQ=A|synonymous_variant|MICAL3|ENSG00000243156|Transcript|ENST00000441493|protein_coding|G||SNV|YES|NM_015241.3||0.0002437|0.0002663|0.0005631|0|0.0005418|0|0.000104|0.0004677|0.0002176|0.0002306|9.679e-05|0|0.0005895|0|0.001358|0|0|0.0001473|0.0004808|0.0008358|0.001358|gnomADg_EAS|chr22:17818224-17818224|0.000218971|0.000124131|0.00051087|0|0.000613987|5.78369e-05|0.000248385|0.000112324|0.00039497|0.000583506|chr22:17818224-17818224|0.000236957|9.65065e-05|0|0.000654193|0|0.00136134|0|0|0.000147306|0.00083647|0.000475737|||;NEW=A|synonymous_variant|MICAL3|ENSG00000243156|Transcript|ENST00000441493|protein_coding|G|||YES|||||||||||||||||||||||||||||||||||||||||||||||||||22:17818224-17818224|4
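A quick way to see which INFO key carries a given annotation is to split the INFO column on ";" and look at each key separately. The sketch below uses a shortened, hypothetical INFO string with the same two keys (CSQ and NEW) as the record above; the values are abbreviated for illustration and are not the real record.

```shell
# Shortened, hypothetical INFO string with the same two keys as the record above
info='CSQ=A|synonymous_variant|MICAL3|YES;NEW=A|synonymous_variant|MICAL3|YES|4'

# Put each INFO entry on its own line, then report the key, the number of
# '|'-separated fields it holds, and the value of its last field
summary=$(printf '%s\n' "$info" | tr ';' '\n' | awk '{
    eq  = index($0, "=")                      # position of the first "="
    key = substr($0, 1, eq - 1)               # INFO key, e.g. CSQ or NEW
    n   = split(substr($0, eq + 1), f, "|")   # fields of the annotation string
    print key ": " n " fields, last = " f[n]
}')
printf '%s\n' "$summary"
```

In this toy string the trailing value only appears under NEW, which mirrors the record above, where the last annotation sits in the NEW entry rather than in CSQ.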

What command are you passing the --filter flag to, and how did you come up with example5 as the field name for the last field?


The command is correct, because my pipeline works when I merge the annotations into one command; when I split it into two rounds of annotation, the filter stops working. I'm also able to filter on the first annotations (example1, example2 or example3), but filtering on the second annotations doesn't work. The field name I'm using is not the one I posted here; I renamed it for privacy purposes.

filter_vep -i INPUT.vcf --filter "example5 < 3 or not example5" --force_overwrite -O OUTPUT.vcf

I added --vcf_info_field NEW to the filter command and it worked.
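For anyone landing here later, the working command presumably looks something like the sketch below. By default filter_vep reads annotations from the CSQ INFO key, which would explain why filters on example1 to example3 worked while example5 (written under NEW by the second vep run) was not found. The file names are the placeholders from the earlier reply, and the shell variables and the availability check are only there so the snippet can be run safely as written.

```shell
# Placeholder file names, taken from the filter_vep command posted earlier in the thread
input_vcf=INPUT.vcf
output_vcf=OUTPUT.vcf
# INFO key holding the second round of annotations
# (set with --vcf_info_field NEW in the second vep run)
info_field=NEW
filter_expr='example5 < 3 or not example5'

# Only run if filter_vep is installed and the input file exists
if command -v filter_vep >/dev/null 2>&1 && [ -f "$input_vcf" ]; then
    filter_vep -i "$input_vcf" \
        --vcf_info_field "$info_field" \
        --filter "$filter_expr" \
        --force_overwrite \
        -o "$output_vcf"
else
    echo "filter_vep or $input_vcf not available; nothing to run" >&2
fi
```

Filtering on the first-round fields would still use the default key, so the two filters may need to be applied as separate filter_vep calls, one per INFO key.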


Good call. This is exactly where my brain went.


Arton why did you delete this post?
