Question: How to get input for provean from vcf files annotated by snpeff?
6.1 years ago by shl198 (United States) wrote:

Hi all,

I called variants using GATK and annotated the results using snpEff. Since the organism is the Chinese hamster, there is not much information available. I want to use PROVEAN to predict the effects of the variants. The problem is that, for organisms other than human or mouse, PROVEAN takes a protein (amino acid) sequence and amino acid variants as input, whereas what I have now are variants at the genomic level. Does anyone know of a tool that can convert genomic VCF files into input files for PROVEAN? Thanks.

snpeff provean vcf • 2.7k views
modified 10 weeks ago by predeus • written 6.1 years ago by shl198

Did snpEff list the coding variants? You can use that to filter the VCF and keep only the relevant variants (as a first step).
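A minimal sketch of that filtering step, assuming the VCF carries snpEff's newer `ANN` INFO field (older snpEff releases write `EFF` instead, with a different layout) and that only simple missense changes are wanted. The function name `missense_changes` is hypothetical, and the single-letter output form (e.g. `A12V`) is an assumption about what PROVEAN's variation file expects, so check it against the PROVEAN documentation before relying on it.

```python
import re

# Three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# Simple substitutions only, e.g. p.Ala12Val.
HGVS_P_MISSENSE = re.compile(r"^p\.([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2})$")


def missense_changes(vcf_lines):
    """Yield (transcript_id, change) pairs, e.g. ("TX1", "A12V").

    Assumes the ANN sub-field order: Allele | Annotation | Impact |
    Gene_Name | Gene_ID | Feature_Type | Feature_ID | ... | HGVS.c | HGVS.p
    """
    for line in vcf_lines:
        if line.startswith("#"):
            continue  # skip header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:
            continue  # not a complete VCF record
        ann = next(
            (kv[4:] for kv in fields[7].split(";") if kv.startswith("ANN=")),
            None,
        )
        if ann is None:
            continue
        for entry in ann.split(","):  # one entry per allele/transcript pair
            parts = entry.split("|")
            if len(parts) < 11:
                continue
            if "missense_variant" not in parts[1].split("&"):
                continue
            match = HGVS_P_MISSENSE.match(parts[10])
            if match is None:
                continue
            ref, pos, alt = match.groups()
            if ref in THREE_TO_ONE and alt in THREE_TO_ONE:
                yield parts[6], THREE_TO_ONE[ref] + pos + THREE_TO_ONE[alt]
```

Grouping the yielded pairs by transcript ID gives one list of amino acid changes per protein. Insertions, deletions and frameshifts are deliberately skipped here and would need their own handling.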

written 6.1 years ago by _r_am32k

Thanks. Yes, there are coding variants, but PROVEAN also needs the whole protein sequence. After building the database with snpEff, there is only one .bin file. Do you know how to get the whole protein sequence? The only way I can think of is to use the genome annotation file, but that would be a bit tricky.

written 6.1 years ago by shl198

Standalone PROVEAN? I've only used the web-based PROVEAN. If you know the start codon location, translation shouldn't be a huge problem. I can't recall any tool that gives you protein mutations from nucleotide changes, sorry :-(
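As a rough illustration of the translation step, here is a small sketch that assumes the coding sequence has already been extracted, spliced, and oriented 5' to 3' starting at the start codon (for minus-strand genes that means reverse-complementing first), and that the standard genetic code applies. The helper names `translate_cds` and `matches_reference` are hypothetical.

```python
# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate_cds(cds):
    """Translate a spliced CDS, stopping at the first stop codon.

    Codons containing anything other than A/C/G/T become 'X'.
    """
    cds = cds.upper()
    protein = []
    for i in range(0, len(cds) - 2, 3):
        residue = CODON_TABLE.get(cds[i:i + 3], "X")
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


def matches_reference(protein, change):
    """Check that a change such as 'A2V' names the residue actually
    found at that 1-based position in the translated protein."""
    ref, pos = change[0], int(change[1:-1])
    return 0 < pos <= len(protein) and protein[pos - 1] == ref
```

The `matches_reference` check is a cheap sanity test: if the reference residue in a variant does not match the translated sequence, the transcript, strand, or reading frame is probably wrong.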

written 6.1 years ago by _r_am32k

Hi @shl198, I have the same issue. Did you find a solution?

written 4 months ago by kmkdesilva90

Do not add answers if you're not answering the top level question. Use Comments instead.

written 4 months ago by _r_am32k
10 weeks ago by predeus wrote:

Unfortunately, it's not easy, because it requires mutation effect predictors that run from the command line, and most of those are old and work poorly. I've spent some time trying to get PROVEAN to work, for example, but without any luck.

One tool that works brilliantly (and is well supported) is SIFT4G, a re-implementation of the older SIFT. You can get all the instructions here:

You'll need to work out the calibration (cutoffs) since databases change all the time.

written 10 weeks ago by predeus