Question

VEP tools output

0

6.7 years ago

NB ▴ 960

Hello, I'm using VEP to annotate human WES data (GRCh37) and, as many of us know, it writes one row per transcript prediction, so a single variant can span many rows.

Can the tool (or a script?) report one variant per row, i.e. with all the transcripts in one cell rather than spread over many rows?
Can we restrict the HGVS annotations to known protein (NP id) and known mRNA (NM id) only?

I tried using ANNOVAR, but for many variants, especially indels, its HGVS annotations do not follow the nomenclature.

Thank you

vep ensembl output • 4.2k views

ADD COMMENT • link updated 6.7 years ago by Ben Moore ★ 2.4k • written 6.7 years ago by NB ▴ 960

1

6.7 years ago

Ben Moore ★ 2.4k

Hi Nandini,

I'm glad to hear that. You will need to use filter_vep using the following guidelines: http://www.ensembl.org/info/docs/tools/vep/script/vep_filter.html#filter_write

Best wishes

Ben Ensembl Helpdesk

ADD COMMENT • link 6.7 years ago by Ben Moore ★ 2.4k

0

Thanks, Ben. I'm running into the following error:

-------------------- EXCEPTION --------------------
MSG: 
ERROR: Forked process(es) died: read-through of cross-process communication detected

STACK Bio::EnsEMBL::VEP::Runner::_forked_buffer_to_output /Software/ensembl-vep/modules/Bio/EnsEMBL/VEP/Runner.pm:554
STACK Bio::EnsEMBL::VEP::Runner::next_output_line /Software/ensembl-vep/modules/Bio/EnsEMBL/VEP/Runner.pm:361
STACK Bio::EnsEMBL::VEP::Runner::run /Software/ensembl-vep/modules/Bio/EnsEMBL/VEP/Runner.pm:202
STACK toplevel ./vep:222
Date (localtime)    = Wed Mar  7 10:27:19 2018
Ensembl API version = 91

The command being used is:

./vep --cache --dir_cache /Software/ensembl-vep/.vep  --stats_text S3.html --refseq --everything --individual all --transcript_filter "stable_id match N[M]_" --fork 4 -tab --custom /Software/ensembl-vep/.vep/score.bed.gz,score,bed,exact,0 --custom Software/ensembl-vep/.vep/HEX.bed.gz,HEX,bed,exact,0  --port 3337 -i S3.recode.vcf -o S3.txt

Any idea what might be going wrong? There is only one sample in this VCF, with approximately 5000 variants. Thanks.

ADD REPLY • link 6.7 years ago by NB ▴ 960

0

Hi Nandini,

My colleagues have said that they are currently helping you on GitHub: https://github.com/Ensembl/ensembl-vep/issues/150#issuecomment-371137459

For this error, it's best that they help you.

Best wishes

Ben Ensembl Helpdesk

ADD REPLY • link 6.7 years ago by Ben Moore ★ 2.4k

0

Yes, that's correct. Thanks, Ben.

ADD REPLY • link 6.7 years ago by NB ▴ 960

score 2 · Accepted Answer · 2018-02-28

2

6.7 years ago

Ben Moore ★ 2.4k

Hi Nandini,

Collapsing all transcripts for a variant into a single row is not possible using either the web interface or the standalone script. However, you can use a number of different filtering options, including --pick: http://www.ensembl.org/info/docs/tools/vep/script/vep_options.html#filt

or the filter_vep script to filter the VEP results according to your custom criteria: http://www.ensembl.org/info/docs/tools/vep/script/vep_filter.html#filter_run

Using the VEP script (http://www.ensembl.org/info/docs/tools/vep/script/index.html), you can specify the RefSeq transcript set, and exclude the predicted transcripts using the --refseq and --exclude_predicted options: http://www.ensembl.org/info/docs/tools/vep/script/vep_other.html#refseq

Including the --hgvs option will mean that the HGVS notation is returned in the context of the RefSeq transcripts (excluding the predicted transcripts).
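If one row per variant is still wanted, the per-transcript rows can be merged after VEP has run. The following is only a minimal post-processing sketch, not a VEP feature: it assumes `--tab` output whose header line starts with `#Uploaded_variation` and that `Location` and `Allele` columns are present. The function name `collapse_vep_rows` and the example identifiers are hypothetical.

```python
import csv
from collections import OrderedDict


def collapse_vep_rows(lines,
                      key_cols=("#Uploaded_variation", "Location", "Allele"),
                      sep=";"):
    """Merge per-transcript rows of VEP tab output into one record per variant."""
    # Drop '##' metadata and blank lines; the single '#'-prefixed line is the header.
    body = [ln.rstrip("\n") for ln in lines
            if ln.strip() and not ln.startswith("##")]
    reader = csv.DictReader(body, delimiter="\t", quoting=csv.QUOTE_NONE)

    merged = OrderedDict()  # variant key -> {column: [value per transcript]}
    for row in reader:
        key = tuple(row[c] for c in key_cols)
        cols = merged.setdefault(key, {col: [] for col in row})
        for col, val in row.items():
            cols[col].append(val)

    records = []
    for cols in merged.values():
        # Key columns are identical within a variant; all others are joined in
        # row order, so the i-th entry of every cell refers to the same transcript.
        records.append({col: vals[0] if col in key_cols else sep.join(vals)
                        for col, vals in cols.items()})
    return records


# Made-up rows, for illustration only.
example = [
    "## ENSEMBL VARIANT EFFECT PREDICTOR\n",
    "#Uploaded_variation\tLocation\tAllele\tFeature\tHGVSc\n",
    "var1\t1:100\tA\tNM_000001.1\tNM_000001.1:c.10G>A\n",
    "var1\t1:100\tA\tNM_000002.2\tNM_000002.2:c.25G>A\n",
    "var2\t2:200\tT\tNM_000003.3\tNM_000003.3:c.5C>T\n",
]
for rec in collapse_vep_rows(example):
    print(rec["#Uploaded_variation"], rec["Feature"])
```

Values are joined without de-duplication on purpose, so the positions in the `Feature`, `HGVSc` and other cells stay aligned with each other.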

I hope this helps.

Best wishes

Ben Ensembl Helpdesk

ADD COMMENT • link 6.7 years ago by Ben Moore ★ 2.4k

0

Thanks, Ben. The command that I am currently using is as follows:

./vep --cache --dir_cache /Software/ensembl-vep/.vep  --stats_text S39_Run3.html --refseq --hgvs --fork 4 -tab --custom /Software/ensembl-vep/.vep/score.bed.gz,score,bed,exact,0 --custom /Software/ensembl-vep/.vep/EX.bed.gz,EX,bed,exact,0 --pick_allele_gene --exclude_predicted --port 3337 -i S39_Run3.recode.vcf -o S39_Run3.txt

The output I get still includes non-coding RNA (NR_) annotations. Can these be excluded as well? Also, is there a way to annotate the zygosity of each variant in the output?

Thank you.

ADD REPLY • link 6.7 years ago by NB ▴ 960

2

No problem, very happy to help.

You could exclude the NR_ transcripts using the --transcript_filter option, which uses notation and formatting similar to that of filter_vep.
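For output that has already been written, the same effect can be approximated afterwards. This is a rough sketch, not part of VEP: it assumes `--tab` output with a `Feature` column holding the transcript ID, and it keeps only rows whose ID starts with `NM_`, mirroring the `stable_id match N[M]_` pattern that appears elsewhere in this thread. The function name `keep_nm_rows` and the sample IDs are hypothetical.

```python
import re

NM_PATTERN = re.compile(r"^NM_")


def keep_nm_rows(lines, feature_col="Feature"):
    """Yield header lines plus the data rows whose transcript ID starts with NM_."""
    col_idx = None
    for ln in lines:
        if ln.startswith("##"):          # VEP metadata lines pass through unchanged
            yield ln
            continue
        fields = ln.rstrip("\n").split("\t")
        if ln.startswith("#"):           # column header: locate the transcript column
            col_idx = fields.index(feature_col)
            yield ln
            continue
        if col_idx is None:
            raise ValueError("no header line found before the first data row")
        if NM_PATTERN.match(fields[col_idx]):
            yield ln


# Made-up rows, for illustration only.
sample = [
    "## ENSEMBL VARIANT EFFECT PREDICTOR\n",
    "#Uploaded_variation\tLocation\tAllele\tFeature\n",
    "var1\t1:100\tA\tNM_000001.1\n",
    "var1\t1:100\tA\tNR_000009.1\n",
    "var2\t2:200\tT\tXM_000007.2\n",
]
print("".join(keep_nm_rows(sample)), end="")
```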

For adding the zygosity, you can use the --individual option, but this only works with VCF files containing individual genotype data: https://www.ensembl.org/info/docs/tools/vep/script/vep_options.html#output
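To illustrate what "individual genotype data" means here: zygosity can only be derived when each sample column of the VCF carries a GT value such as `0/1`. A small sketch of that derivation follows; the function name and the labels `HET`, `HOM`, `HOMREF` and `MISSING` are illustrative choices, not necessarily the values VEP itself writes.

```python
import re


def zygosity(gt):
    """Classify a VCF GT string such as '0/1' (unphased) or '1|1' (phased)."""
    alleles = re.split(r"[/|]", gt)
    if "." in alleles:                   # any uncalled allele, e.g. './.'
        return "MISSING"
    if all(a == "0" for a in alleles):   # only reference alleles
        return "HOMREF"
    if len(set(alleles)) == 1:           # same non-reference allele on every copy
        return "HOM"
    return "HET"                         # mixed alleles, including '1/2'


for gt in ("0/1", "1|1", "0/0", "./."):
    print(gt, zygosity(gt))
```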

Best wishes

Ben Ensembl Helpdesk

ADD REPLY • link 6.7 years ago by Ben Moore ★ 2.4k

0

Thanks Ben. I will give this a go.

ADD REPLY • link 6.7 years ago by NB ▴ 960

0

Hi Ben, I've got VEP producing the desired output. Just one last question: I would like to keep only those variants with a frequency below 1% in gnomad_NFE. Is there an option in VEP, or do I need to use filter_vep?

Thank you.

ADD REPLY • link 6.7 years ago by NB ▴ 960
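The thread ends without an answer to the frequency question. filter_vep, which Ben points to earlier, is the documented tool for this kind of filtering; as an alternative, the tab output can be filtered afterwards. The following is only a sketch under stated assumptions: the frequency column is assumed to be called `gnomAD_NFE_AF` (the name depends on the VEP version and options, so check the header of your own output), each cell is assumed to hold a single number or `-` for "no data", and the function name `filter_rare` is hypothetical.

```python
def filter_rare(lines, af_col="gnomAD_NFE_AF", max_af=0.01, keep_missing=True):
    """Yield header lines plus the rows whose allele frequency is below max_af.

    Rows with no frequency ('-' or empty) are kept when keep_missing is True,
    since variants absent from the reference panel are usually rare.
    """
    idx = None
    for ln in lines:
        if ln.startswith("##"):          # metadata lines pass through unchanged
            yield ln
            continue
        fields = ln.rstrip("\n").split("\t")
        if ln.startswith("#"):           # column header: locate the AF column
            idx = fields.index(af_col)
            yield ln
            continue
        if idx is None:
            raise ValueError("no header line found before the first data row")
        value = fields[idx]
        if value in ("", "-"):
            if keep_missing:
                yield ln
        elif float(value) < max_af:
            yield ln


# Made-up rows, for illustration only.
rows = [
    "#Uploaded_variation\tLocation\tAllele\tgnomAD_NFE_AF\n",
    "var1\t1:100\tA\t0.0004\n",
    "var2\t2:200\tT\t0.25\n",
    "var3\t3:300\tG\t-\n",
]
print("".join(filter_rare(rows)), end="")
```

Whether variants without a gnomAD frequency should be kept is a study-design decision; `keep_missing=False` drops them.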