Question: SIFT scores in the VEP output
2.2 years ago by
mostafarafiepour100 wrote:

Hi all,

I installed VEP (Variant Effect Predictor) using Anaconda, and it produced output without any errors.

The command I ran:

vep -i Final.vcf -gff data.gff.gz -fasta genomic.fna

My output:

#Uploaded_variation     Location                Allele  Gene            Feature         Feature_type    Consequence     cDNA_position   CDS_position    Protein_position        Amino_acids     Codons  Existing_variation      Extra

CM009840.1_10757615_A/G CM009840.1:10757615     G       102405271       XM_006078640.2  Transcript      missense_variant        639     607     203     T/A     Act/Gct -       IMPACT=MODERATE;STRAND=1;SOURCE=data.gff.gz
CM009840.1_10757615_A/G CM009840.1:10757615     G       102405271       XM_025278176.1  Transcript      missense_variant        639     607     203     T/A     Act/Gct -       IMPACT=MODERATE;STRAND=1;SOURCE=data.gff.gz
CM009840.1_10757615_A/G CM009840.1:10757615     G       102405271       XM_025278183.1  Transcript      missense_variant        639     607     203     T/A     Act/Gct -       IMPACT=MODERATE;STRAND=1;SOURCE=data.gff.gz

However, VEP did not calculate SIFT for me. Shouldn't the SIFT value normally be displayed in the last column?

Given the command I ran and the resulting output, how can I get the SIFT value for each missense_variant?

Best regards,

Mostafa

snp missense.variant • 872 views
ADD COMMENTlink modified 2.2 years ago by ATpoint44k • written 2.2 years ago by mostafarafiepour100

Please read the manual. You have to set a flag to tell VEP to output SIFT scores.

ADD REPLYlink written 2.2 years ago by ATpoint44k

Many thanks for your reply.

I ran the command as follows, but I got the same output.

vep -i Final.vcf -gff data.gff.gz -fasta genomic.fna --sift b
ADD REPLYlink written 2.2 years ago by mostafarafiepour100

Any error or warning messages? What is the output of

vep -i Final.vcf -gff data.gff.gz -fasta genomic.fna --sift b | grep -v '^#' | head
ADD REPLYlink written 2.2 years ago by ATpoint44k
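
The `grep -v '^#' | head` part of the command above drops the `#`-prefixed header lines and shows the first ten data rows. Below is a minimal sketch on made-up input; the function name `show_first_rows` and the sample lines are illustrative assumptions, not real VEP output.

```shell
# Hypothetical helper wrapping the filter from the comment above:
# drop lines starting with '#', then keep the first ten remaining rows.
show_first_rows() {
  grep -v '^#' | head
}

# Illustrative stand-in for VEP output: two '#' lines, then two data rows.
printf '## ENSEMBL VARIANT EFFECT PREDICTOR\n#Uploaded_variation\tLocation\nvar1\tchr1:100\nvar2\tchr1:200\n' \
  | show_first_rows
# prints only the var1 and var2 rows
```

Note that VEP writes its results to a file by default (`variant_effect_output.txt`), so the pipe probably only receives rows when the output is sent to standard output, for example with `-o STDOUT`.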

There are no errors, only warnings:

WARNING: Parent entries with the following IDs were not found or skipped due to invalid types: rna27858, rna27857
WARNING: Parent entries with the following IDs were not found or skipped due to invalid types: rna40648
WARNING: Parent entries with the following IDs were not found or skipped due to invalid types: rna46030, rna46031
WARNING: Parent entries with the following IDs were not found or skipped due to invalid types: rna47129, rna47130
WARNING: Parent entries with the following IDs were not found or skipped due to invalid types: rna50084
ADD REPLYlink modified 2.2 years ago • written 2.2 years ago by mostafarafiepour100

Sorry, I did not understand. What does this command do?

ADD REPLYlink written 2.2 years ago by mostafarafiepour100

I ran this command, but it again gives the same output and does not calculate SIFT for any missense_variant.

ADD REPLYlink written 2.2 years ago by mostafarafiepour100

Please do not paste text as screenshots. Copy the text and either paste it with code formatting or put it in a GitHub gist and link to it here.

ADD REPLYlink written 2.2 years ago by _r_am32k
2.2 years ago by
ATpoint44k wrote:

As recently indicated in your repost of this question, you are working on the buffalo genome (Bubalus bubalis). You should have stated this right away, because it is not in the list of current Ensembl genomes. It is therefore no surprise that SIFT is not calculated, as the annotation for this species does not exist at Ensembl (correct me if I am wrong). I am not very familiar with SIFT, but there are two things you have to do. First, use the correct command line: in your new question, you are again missing the --sift parameter. Second, as annotations are missing at Ensembl, you have to find out whether it is possible to obtain custom SIFT scores for your species and use them with VEP. I suggest you contact the Ensembl Helpdesk with this problem, providing as much information as you can in your email.

I will close your new question to keep updates focused in this thread. Please edit the question here to include the species information. There is no point in opening multiple questions on the exact same topic. It only spreads information across different threads and potentially annoys users who have already helped.
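
To check whether a run actually produced SIFT predictions, count the missense rows whose Extra column carries a `SIFT=` key. The sketch below assumes VEP's default tab-delimited layout (Consequence in column 7, Extra as the last column); the function name `sift_coverage` is hypothetical, and the sample row is copied from the output in the question.

```shell
# Hypothetical helper: summarise SIFT coverage in VEP tab-delimited output.
# Reads VEP output on stdin and prints:
#   "<missense rows> <missense rows with a SIFT= key in Extra>"
sift_coverage() {
  awk -F'\t' '
    /^#/ { next }                            # skip header and meta lines
    $7 ~ /missense_variant/ {                # column 7 is Consequence
      total++
      if ($NF ~ /(^|;)SIFT=/) with_sift++    # Extra is the last column
    }
    END { printf "%d %d\n", total, with_sift }
  '
}

# One row from the question: a missense variant with no SIFT key in Extra.
printf 'CM009840.1_10757615_A/G\tCM009840.1:10757615\tG\t102405271\tXM_006078640.2\tTranscript\tmissense_variant\t639\t607\t203\tT/A\tAct/Gct\t-\tIMPACT=MODERATE;STRAND=1;SOURCE=data.gff.gz\n' \
  | sift_coverage
# prints: 1 0
```

A second number of 0 means that no missense row received a SIFT prediction, which matches the situation described in this thread.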

ADD COMMENTlink modified 2.2 years ago • written 2.2 years ago by ATpoint44k