VEP is returning ClinVar fields blank (--custom annotation with VEP)
3 months ago

Hi

I am trying to use VEP to annotate a vcf file and also add ClinVar data to it:

vep --dir_cache $VEP_CACHEDIR \
  -i path/to/input/input.vcf --format vcf \
  -o path/to/output/output.vcf --vcf \
  --cache path/to/cachedir/ \
  --offline --assembly GRCh37 \
  --custom path/to/clinvar/clinvar.vcf.gz,ClinVar,vcf,exact,0,CLNSIG,CLNREVSTAT \
  --everything --force_overwrite --fork 16


The CLNSIG and CLNREVSTAT columns are added to the VCF file, but all of their cells are blank, even though more than 90% of my variants have previously been reported in ClinVar.

Is that the exact command you used? Or did you spell the word assembly correctly in your real command? If the incorrect spelling was used, VEP may have been annotating against GRCh38.

Thanks, @Emily_Ensembel for the comment. I have spelled it correctly in the real command. I will edit my post to make it correct as well.

Can you give an example of a line where you expect to get ClinVar annotation, please? And can you specify exactly which file you downloaded from ClinVar?

It is a population-level VCF file that contains about 40,000 variants. I downloaded https://ftp.ncbi.nlm.nih.gov/pub/clinvar/vcf_GRCh37/clinvar.vcf.gz and its index, https://ftp.ncbi.nlm.nih.gov/pub/clinvar/vcf_GRCh37/clinvar.vcf.gz.tbi

I realized that the ClinVar data are actually added to the CSQ field. However, the "vcfR2tidy" function (I need to load the file into R for downstream analysis) does not retrieve them correctly: it creates the ClinVar and ClinVar_CLNSIG columns but leaves their cells empty, even though the ClinVar annotations are present in the CSQ field.
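
One possible workaround, sketched here rather than taken from this thread, is to flatten the CSQ field into a tab-separated table before loading it into R. The sketch assumes VEP's usual CSQ layout: the ##INFO=<ID=CSQ,...> header line lists the subfield names after "Format: ", entries for different transcripts are separated by commas, and subfields within an entry by "|". The function name csq_to_tsv is hypothetical.

```shell
# csq_to_tsv (hypothetical helper): read a VEP-annotated VCF on stdin and
# print one tab-separated row per CSQ entry. The arguments are the CSQ
# subfield names to extract, e.g. ClinVar_CLNSIG.
csq_to_tsv() {
    awk -v want="$*" '
    BEGIN { FS = OFS = "\t"; nwant = split(want, wanted, " ") }
    # The CSQ header line lists the subfield names after "Format: ".
    /^##INFO=<ID=CSQ,/ {
        fmt = $0
        sub(/.*Format: /, "", fmt)
        sub(/".*/, "", fmt)
        n = split(fmt, names, "|")
        for (i = 1; i <= n; i++) idx[names[i]] = i
        next
    }
    # Emit a header row once the column header line is reached.
    /^#CHROM/ {
        line = "CHROM" OFS "POS" OFS "REF" OFS "ALT"
        for (j = 1; j <= nwant; j++) line = line OFS wanted[j]
        print line
        next
    }
    /^#/ { next }
    {
        # Locate the CSQ key inside the INFO column (column 8).
        m = split($8, info, ";")
        csq = ""
        for (k = 1; k <= m; k++)
            if (info[k] ~ /^CSQ=/) csq = substr(info[k], 5)
        if (csq == "") next
        # Each comma-separated entry is one annotation of the variant.
        ne = split(csq, entries, ",")
        for (e = 1; e <= ne; e++) {
            split(entries[e], vals, "|")
            line = $1 OFS $2 OFS $4 OFS $5
            for (j = 1; j <= nwant; j++) {
                f = wanted[j]
                line = line OFS ((f in idx) ? vals[idx[f]] : "")
            }
            print line
        }
    }'
}
```

For example, `csq_to_tsv ClinVar ClinVar_CLNSIG ClinVar_CLNREVSTAT < path/to/output/output.vcf > clinvar_fields.tsv` would write a table that R can read with read.delim. Subfield names that are not present in the CSQ header come out as empty columns.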

I also have two more questions and would appreciate your guidance.

1. I tried to download all the FASTA files (http://m.ensembl.org/info/data/ftp/index.html) so that I can run --everything offline. However, all the files there are based on the GRCh38 assembly, for example: http://ftp.ensembl.org/pub/release-104/fasta/homo_sapiens/pep/ Could you please tell me how I can download the GRCh37 FASTA files?

2. I need to use some plugins in offline mode, especially REVEL and MaxEntScan. However, I could not find an example script or details on how to add these two to my command.

You should get the option to install the FASTA file when you run INSTALL.pl.
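
As a rough sketch of a non-interactive run, the installer can be asked for the FASTA file of a given species and assembly. The wrapper name install_grch37_fasta is hypothetical, the command is assumed to be run from the ensembl-vep directory, and the flags (-a f for FASTA only, -s for species, -y for assembly, -c for the cache directory) should be checked against the installer documentation for your VEP release.

```shell
# install_grch37_fasta (hypothetical wrapper): ask the VEP installer for the
# GRCh37 human FASTA file and place it under the cache directory given as $1.
install_grch37_fasta() {
    perl INSTALL.pl -a f -s homo_sapiens -y GRCh37 -c "$1"
}
```

It would be called as `install_grch37_fasta path/to/cachedir/`. Using -a cf instead of -a f should also fetch the matching cache.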

Most of the pathogenicity scores are part of the dbNSFP plugin.

If I am not mistaken, REVEL and MaxEntScan are not part of dbNSFP and should be run separately.

You're right, you'll find the separate REVEL and MaxEntScan plugins on the plugins page.
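
A minimal sketch of how the two plugins might be added to an offline run is given below. It assumes that both plugin modules are installed in VEP's plugin directory and that their data files have already been downloaded and prepared as described in the header of each plugin module. The wrapper name annotate_with_plugins and all paths are hypothetical, and the exact argument syntax can differ between VEP releases, so it is worth checking each plugin's own documentation.

```shell
# annotate_with_plugins (hypothetical wrapper)
#   $1 input VCF          $2 output VCF
#   $3 prepared REVEL data file (bgzipped and tabix-indexed)
#   $4 directory holding the unpacked MaxEntScan scripts
annotate_with_plugins() {
    vep -i "$1" --format vcf -o "$2" --vcf \
        --cache --offline --assembly GRCh37 \
        --plugin "REVEL,$3" \
        --plugin "MaxEntScan,$4" \
        --force_overwrite
}
```

For example: `annotate_with_plugins path/to/input/input.vcf path/to/output/output.vcf path/to/revel/data.tsv.gz path/to/maxentscan/fordownload`. The plugin results are then added as extra subfields of CSQ, alongside the ClinVar fields.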