Question: VEP Not Giving Annotation With RefSeq Transcript
Vivek (Denmark) wrote, 6.1 years ago:

I'm trying to annotate variants with the Ensembl Variant Effect Predictor (VEP), and I'd prefer the annotation to be reported against RefSeq transcripts. VEP does not seem to do this, even when I explicitly request it in the options.

For example the variant

1    100672060    rs12021720    T    C

Clearly lies in an exon (the third block listed, 100672000-100672192) of this RefSeq transcript:

2    NM_001918    chr1    -    100652477    100715409    100661810    100715376    11    100652477,100671785,100672000,100676249,100680372,100681538,100684181,100696288,100700991,100706316,100715325,    100661978,100671857,100672192,100676327,100680539,100681755,100684303,100696470,100701067,100706440,100715409,    0    DBT    cmpl    cmpl    0,0,0,0,1,0,1,2,1,0,0,
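The overlap claimed above can be checked directly from the refGene row. The following is a minimal sketch, assuming the row is tab-separated in the UCSC refGene layout, where exon starts are 0-based and exon ends are end-exclusive, and that the variant position is a 1-based VCF coordinate. The function names are hypothetical helpers introduced only for illustration.

```python
# Sketch: find which exon block of a UCSC refGene row contains a 1-based position.
# Assumption: exonStarts are 0-based and exonEnds are end-exclusive (UCSC convention),
# so a 1-based position p lies in a block when start < p <= end.

REFGENE_ROW = (
    "2\tNM_001918\tchr1\t-\t100652477\t100715409\t100661810\t100715376\t11\t"
    "100652477,100671785,100672000,100676249,100680372,100681538,"
    "100684181,100696288,100700991,100706316,100715325,\t"
    "100661978,100671857,100672192,100676327,100680539,100681755,"
    "100684303,100696470,100701067,100706440,100715409,\t"
    "0\tDBT\tcmpl\tcmpl\t0,0,0,0,1,0,1,2,1,0,0,"
)


def exon_blocks(row):
    """Return (strand, [(start, end), ...]) parsed from a refGene row."""
    fields = row.split("\t")
    starts = [int(s) for s in fields[9].split(",") if s]
    ends = [int(e) for e in fields[10].split(",") if e]
    return fields[3], list(zip(starts, ends))


def find_exon(row, pos):
    """Return (rank in genomic order, rank in transcript order), both 1-based,
    for the exon block containing pos, or None if pos is not in any block."""
    strand, blocks = exon_blocks(row)
    for i, (start, end) in enumerate(blocks):
        if start < pos <= end:
            genomic_rank = i + 1
            # On the minus strand, transcript order is the reverse of genomic order.
            transcript_rank = genomic_rank if strand == "+" else len(blocks) - i
            return genomic_rank, transcript_rank
    return None


print(find_exon(REFGENE_ROW, 100672060))
```

Under these assumptions the position falls in the third block in genomic order. Because the transcript is on the minus strand, the same block would be counted as exon 9 of 11 from the transcript's 5' end.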

Yet VEP seems intent on giving me annotation only in terms of CCDS & Ensembl transcripts.

Uploaded Variation    Location    Allele    Gene    Feature    Feature type    Consequence    Position in cDNA    Position in CDS    Position in protein    Amino acid change    Codon change    Co-located Variation    Extra
1_100672060_T/C    1:100672060    C    CCDS767.1    CCDS767.1    Transcript    missense_variant    1150    1150    384    S/G    Agt/Ggt    -    -
1_100672060_T/C    1:100672060    C    ENSESTG00000014700    ENSESTT00000036776    Transcript    intron_variant    -    -    -    -    -    -    -
1_100672060_T/C    1:100672060    C    ENSESTG00000014716    ENSESTT00000036811    Transcript    missense_variant    160    109    37    S/G    Agt/Ggt    -    -
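One quick way to confirm that no RefSeq transcript appears in output like the rows above is to scan the Feature column for RefSeq-style accession prefixes. This is a rough sketch, assuming whitespace-separated fields with no embedded spaces, the Feature ID in the fifth column, and header lines that start with "#". The prefix list and function names are illustrative assumptions, not part of VEP.

```python
# Sketch: list the Feature IDs in VEP-style output and flag RefSeq-looking ones.
# Assumptions: fields are whitespace-separated, Feature is the 5th column,
# and header/comment lines start with "#".

REFSEQ_PREFIXES = ("NM_", "NR_", "XM_", "XR_")

VEP_ROWS = """\
1_100672060_T/C 1:100672060 C CCDS767.1 CCDS767.1 Transcript missense_variant 1150 1150 384 S/G Agt/Ggt - -
1_100672060_T/C 1:100672060 C ENSESTG00000014700 ENSESTT00000036776 Transcript intron_variant - - - - - - -
1_100672060_T/C 1:100672060 C ENSESTG00000014716 ENSESTT00000036811 Transcript missense_variant 160 109 37 S/G Agt/Ggt - -
"""


def feature_ids(text):
    """Return the Feature column of every non-empty, non-header line."""
    return [
        line.split()[4]
        for line in text.splitlines()
        if line.strip() and not line.startswith("#")
    ]


def refseq_features(text):
    """Return only the Feature IDs that start with a RefSeq-style prefix."""
    return [f for f in feature_ids(text) if f.startswith(REFSEQ_PREFIXES)]


print(feature_ids(VEP_ROWS))
print(refseq_features(VEP_ROWS))
```

For the three rows shown, the second list is empty, which matches the observation that only CCDS and Ensembl identifiers were reported.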

Any ideas on why this happens? I tried both the command line and online versions and I get the same result.

Command:

perl path_to_vep/vep/variant_effect_predictor.pl -i PathTo.vcf --offline --dir_cache Path_to_Vep/vep/ --fasta Path_to_Vep/vep/homo_sapiens/74/Homo_sapiens.GRCh37.74.dna.primary_assembly.fa -o vep_annotation.txt --hgvs --force_overwrite --sift b --polyphen b --maf_1kg --maf_esp --refseq --gmaf --protein --ccds
Tags: annotation, refseq, ensembl

Can you paste the command here?

written 6.1 years ago by Ashutosh Pandey

I added the command, but it will likely not be of help because VEP annotations depend on which version of the annotation database you download. I used the RefSeq-based database during installation. The results are reproducible with the online VEP tool.

written 6.1 years ago by Vivek

I thought you might be missing the "--refseq" flag. BTW, I found this post that may be relevant to your problem (http://lists.ensembl.org/pipermail/dev/2013-February/008400.html)

written 6.1 years ago by Ashutosh Pandey

Thanks for the link. I came across that post in my Google searches, and that's how I ended up including the --refseq flag. It still doesn't fix my issue, even though I'm using the proper database. I decided to post here since I've seen someone from Ensembl respond to these questions on this forum, but I'll likely have to e-mail the mailing list to get a solution. VEP is a pretty comprehensive tool that gives me most of the annotation information I need, apart from this issue.

modified 6.1 years ago • written 6.1 years ago by Vivek

If nobody replies you can try contacting Emily (Emily_Ensembl) about the same. She should be able to figure out what is going wrong.

written 6.1 years ago by Ashutosh Pandey

Clearly I have a reputation here. I'm going to pass this on to Will, who made and maintains the VEP, to see what he says.

written 6.1 years ago by Emily_Ensembl

EnsemblWill wrote, 6.1 years ago:

It looks like the Ensembl otherfeatures database (from which the RefSeq cache is built) is missing this transcript.

> mysql -h ensembldb.ensembl.org -u anonymous
...
mysql> use homo_sapiens_otherfeatures_74_37
mysql> select stable_id from transcript where stable_id like 'NM_00191%';
+-------------+
| stable_id   |
+-------------+
| NM_001910.3 |
| NM_001911.2 |
| NM_001912.4 |
| NM_001913.3 |
| NM_001914.3 |
| NM_001915.3 |
| NM_001916.3 |
| NM_001917.4 |
| NM_001919.3 |
+-------------+
9 rows in set (0.14 sec)

I'm not sure why this would be, but I can pass this on to our genebuilders who may be able to help.
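The gap in the query result can be spotted programmatically. The following is a small sketch that takes the stable IDs returned above, strips the version suffix, and lists which accessions in a numeric range are absent. The helper name and the assumption that every number in the range corresponds to a real accession are illustrative only; in practice some numbers may simply never have been assigned.

```python
# Sketch: report which accessions in a numeric range are absent from a query result.
# Assumption: accessions follow the pattern "<prefix><zero-padded number>.<version>".

QUERY_RESULT = [
    "NM_001910.3", "NM_001911.2", "NM_001912.4", "NM_001913.3", "NM_001914.3",
    "NM_001915.3", "NM_001916.3", "NM_001917.4", "NM_001919.3",
]


def missing_accessions(found, prefix, first, last, width=6):
    """Return unversioned accessions in [first, last] not present in found."""
    present = {stable_id.split(".")[0] for stable_id in found}
    expected = [f"{prefix}{n:0{width}d}" for n in range(first, last + 1)]
    return [acc for acc in expected if acc not in present]


print(missing_accessions(QUERY_RESULT, "NM_", 1910, 1919))
```

With the nine rows from the query, the only accession reported as absent in this range is NM_001918, the transcript in question.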

Laura (Cambridge UK) wrote, 6.1 years ago:

Are you using the refseq cache file?

You need

ftp://ftp.ensembl.org/pub/release-74/variation/VEP/homo_sapiens_refseq_vep_74.tar.gz

rather than

ftp://ftp.ensembl.org/pub/release-74/variation/VEP/homo_sapiens_vep_74.tar.gz


Yes, I'm using the right file. As I mentioned in the post, you can reproduce the same result with the online VEP tool by selecting the RefSeq option.

written 6.1 years ago by Vivek