Hello, I'm using VEP to annotate human WES data (GRCh37) and, as many of us know, it reports one row per transcript for each variant.
- Can the tool (or a script?) provide one row per variant, i.e. with all the transcripts in one cell rather than spread over many rows?
- Can we restrict the HGVS annotations to known proteins (NP ids) and known mRNAs (NM ids) only?
I tried using ANNOVAR, but for many variants, especially indels, its HGVS annotations do not follow the nomenclature.
Thank you
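For the one-row-per-variant question, a small post-processing script is one option. The sketch below is only an illustration: it assumes VEP's default tab-delimited output (metadata lines starting with `##`, then a header line starting with `#Uploaded_variation`), and the function name `collapse_by_variant`, the `|` separator and the transcript IDs in the usage example are made up for this example. VEP's VCF output (`--vcf`) is another route, since it keeps one line per variant with all transcript annotations packed into the CSQ INFO field.

```python
import csv

def collapse_by_variant(lines, key_cols=("#Uploaded_variation", "Location", "Allele"), sep="|"):
    """Merge per-transcript rows into one record per variant.

    `lines` is an iterable of text lines from a VEP tab-delimited output file.
    Values from the non-key columns are joined with `sep`, in input order.
    """
    # Drop the '##' metadata lines; the '#Uploaded_variation' line is the header.
    data = [ln for ln in lines if not ln.startswith("##")]
    reader = csv.DictReader(data, delimiter="\t", quoting=csv.QUOTE_NONE)

    merged = {}  # variant key -> {column: [value per transcript]}
    for row in reader:
        key = tuple(row[c] for c in key_cols)
        if key not in merged:
            merged[key] = {c: [] for c in reader.fieldnames if c not in key_cols}
        for col, values in merged[key].items():
            values.append(row[col])

    records = []
    for key, cols in merged.items():
        record = dict(zip(key_cols, key))
        record.update({col: sep.join(vals) for col, vals in cols.items()})
        records.append(record)
    return records


if __name__ == "__main__":
    demo = [
        "## made-up header line",
        "#Uploaded_variation\tLocation\tAllele\tFeature\tConsequence",
        "var1\t1:100\tA\tNM_000001.1\tmissense_variant",
        "var1\t1:100\tA\tNM_000002.1\tintron_variant",
        "var2\t2:200\tT\tNM_000003.1\tsynonymous_variant",
    ]
    for rec in collapse_by_variant(demo):
        print(rec)
```

Each returned record keeps the variant columns once and holds the transcript-level values as a single joined string per column, so it can be written back out as one spreadsheet row per variant.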
Thanks Ben. The current command that I am using is as follows
The output I get still includes non-coding RNA (NR_) annotations. Can these be excluded as well? Also, is there a way to annotate the zygosity of each variant in the output?
Thank you.
No problem, very happy to help.
You could do this using the --transcript_filter option, which uses notation and formatting similar to filter_vep.
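To illustrate the idea, a rule along the lines of `--transcript_filter "stable_id match NM_"` would be expected to keep only transcripts whose stable ID matches the pattern. The Python sketch below mimics that selection on a list of IDs; it assumes `match` behaves like an unanchored regular-expression search, and the function name `keep_transcripts` and the example IDs are made up for illustration.

```python
import re

def keep_transcripts(stable_ids, pattern="NM_"):
    """Return the IDs that match `pattern` anywhere in the string.

    This only imitates the assumed behaviour of a 'stable_id match <pattern>'
    rule; it is not VEP code.
    """
    rx = re.compile(pattern)
    return [sid for sid in stable_ids if rx.search(sid)]


if __name__ == "__main__":
    ids = ["NM_000001.1", "NR_000002.1", "XM_000003.1"]  # made-up IDs
    print(keep_transcripts(ids))           # NM_ only
    print(keep_transcripts(ids, "N[MR]_"))  # NM_ and NR_
```

Testing a pattern this way on a handful of IDs from your own output is a quick check before rerunning the full annotation.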
For adding the zygosity, you can use the --individual option, but this only works with VCF files containing individual genotype data: https://www.ensembl.org/info/docs/tools/vep/script/vep_options.html#output
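If you prefer to derive zygosity yourself from the VCF, the genotype (GT) string of each sample is enough. The following is a minimal sketch under that assumption; the function name `zygosity` and the returned labels are illustrative choices, not VEP's own output values.

```python
def zygosity(gt):
    """Classify a VCF GT string such as '0/1', '1|1' or './.'."""
    # Phased ('|') and unphased ('/') genotypes are treated the same here.
    alleles = gt.replace("|", "/").split("/")
    if "." in alleles:
        return "unknown"          # at least one allele is missing
    if len(alleles) == 1:
        # Haploid call, e.g. chrX in males.
        return "hemi_ref" if alleles[0] == "0" else "hemi_alt"
    if all(a == "0" for a in alleles):
        return "hom_ref"
    if len(set(alleles)) == 1:
        return "hom_alt"
    return "het"                  # includes 0/1 and multi-allelic calls like 1/2


if __name__ == "__main__":
    for gt in ["0/1", "1|1", "0/0", "./.", "1", "1/2"]:
        print(gt, zygosity(gt))
```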
Best wishes
Ben, Ensembl Helpdesk
Thanks Ben. I will give this a go.
Hi Ben, I've got VEP working to the desired output. Just one last question. I would like to include only those variants that are <1% in gnomad_NFE. Is there an option in vep or do I need to use filter_vep? Thank you.
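One way to apply such a cut-off after annotation is filter_vep, with a filter string along the lines of `gnomAD_NFE_AF < 0.01 or not gnomAD_NFE_AF`. The exact field name is an assumption here: it depends on the VEP version and on gnomAD frequencies having been added during annotation, so check the header of your output first. The same logic can be sketched in Python against the key=value pairs in the `Extra` column of the default output; the function names, the default field name and the choice to keep variants with no frequency are all illustrative.

```python
import re

def parse_extra(extra):
    """Turn 'KEY=value;KEY2=value2' from the Extra column into a dict."""
    return dict(item.split("=", 1) for item in extra.split(";") if "=" in item)

def is_rare(extra, field="gnomAD_NFE_AF", max_af=0.01, keep_missing=True):
    """True if the frequency in `field` is below `max_af`.

    Variants without a frequency are kept or dropped according to
    `keep_missing`. If several values are present (split on '&' or ','
    as a precaution), the highest one is used.
    """
    value = parse_extra(extra).get(field, "")
    freqs = [float(v) for v in re.split(r"[&,]", value) if v not in ("", "-")]
    if not freqs:
        return keep_missing
    return max(freqs) < max_af


if __name__ == "__main__":
    print(is_rare("IMPACT=MODERATE;gnomAD_NFE_AF=0.002"))
    print(is_rare("IMPACT=MODERATE;gnomAD_NFE_AF=0.05"))
    print(is_rare("IMPACT=LOW"))
```

Whether variants absent from gnomAD should be kept is a design choice; for rare-disease filtering they are usually retained, which is why `keep_missing` defaults to True in this sketch.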