Question: Batch Search Effect Of A Snp On Protein (Conserved) Domain
9.5 years ago by
Stephan150 wrote:

I have a list of nonsynonymous SNVs.

I would like to batch search them all to see whether any of the variants fall in a conserved domain, motif, or active site. None of the databases I have found accept a position or mutation position as input.

For each variant I have the chromosomal position, gene name, and amino acid change in the gene, like this:



Is there a way to batch search a list of mutations against a protein domain database?

domain annotation protein • 3.3k views
ADD COMMENTlink modified 9.5 years ago by Andrea_Bio2.6k • written 9.5 years ago by Stephan150
9.5 years ago by
Copenhagen, Denmark
Lars Juhl Jensen11k wrote:

I am not aware of any resource that can do that analysis for you.

What I would do is:

  1. Use the Ensembl BioMart to fetch the amino acid sequence for the protein product of each of your Ensembl transcripts.
  2. Map protein domains onto these by submitting the sequences to InterPro, Pfam, and/or SMART.
  3. Write a small script to check if the amino acid changes caused by the SNVs fall inside predicted protein domains.
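
Step 3 can be sketched in Python as below. This is a minimal sketch, not a finished tool: it assumes you have already collected domain coordinates (1-based, inclusive, in protein residue numbering) from InterPro, Pfam, or SMART into a dictionary keyed by protein ID, and that each amino acid change contains the residue number (for example `R175H` or `p.Arg175His`). The names `Domain`, `residue_position`, `domains_hit`, and `annotate`, and the example protein and domain, are hypothetical and made up for illustration.

```python
import re
from typing import NamedTuple


class Domain(NamedTuple):
    """A predicted domain in protein coordinates (1-based, inclusive)."""
    name: str
    start: int
    end: int


def residue_position(aa_change: str) -> int:
    """Extract the residue number from a change such as 'R175H' or 'p.Arg175His'."""
    match = re.search(r"\d+", aa_change)
    if match is None:
        raise ValueError(f"no residue position found in {aa_change!r}")
    return int(match.group())


def domains_hit(position: int, domains: list) -> list:
    """Return the domains whose start..end range contains the position."""
    return [d for d in domains if d.start <= position <= d.end]


def annotate(variants: list, domains_by_protein: dict) -> dict:
    """Map each (protein_id, aa_change) pair to the names of the domains it falls in."""
    results = {}
    for protein_id, aa_change in variants:
        position = residue_position(aa_change)
        hits = domains_hit(position, domains_by_protein.get(protein_id, []))
        results[(protein_id, aa_change)] = [d.name for d in hits]
    return results


# Made-up example data: one protein with a single predicted domain.
example_domains = {"PROT1": [Domain("example_domain", 50, 300)]}
example_variants = [("PROT1", "R175H"), ("PROT1", "p.Gly12Asp")]

for key, names in annotate(example_variants, example_domains).items():
    print(key, names if names else "outside predicted domains")
```

Proteins without any domain entry simply get an empty list, so variants in unannotated proteins are reported as outside predicted domains rather than raising an error.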
ADD COMMENTlink written 9.5 years ago by Lars Juhl Jensen11k

I don't like programming :P, but thanks for the idea.

ADD REPLYlink written 9.4 years ago by Stephan150
9.5 years ago by
Bioinfosm620 wrote:

Here are some tools that I have used for batch querying:

  • PolyPhen
  • PMut
  • SIFT

Relevant discussion thread: Algorithms Predicting Effects Of Snps / Aa Substitution On Protein

ADD COMMENTlink modified 10 months ago by RamRS27k • written 9.5 years ago by Bioinfosm620

Please add links for those tools.

ADD REPLYlink written 9.4 years ago by Egon Willighagen5.2k
9.5 years ago by
London, UK
Giovanni M Dall'Olio27k wrote:

There are some tools available, but I do not know which one is currently the state of the art.

You can read Ng and Henikoff 2006 for a review, and Burke DF et al. 2007 for an example of applying different tools. When we wrote the computational paragraph for the open collaborative paper on interpreting post-GWAS results, we collected some tools for the analysis you are asking about: have a look at it.

ADD COMMENTlink written 9.5 years ago by Giovanni M Dall'Olio27k

Thanks, nice article, lots of tools to play with.

ADD REPLYlink written 9.4 years ago by Stephan150
9.5 years ago by
Andrea_Bio2.6k wrote:

You can see if the SNP is pathogenic or benign to protein structure using PolyPhen. To do large batch queries you will need a local installation, which is huge, and preparing the databases takes a long time. I've never used SIFT, but I have heard you can get precomputed SIFT scores, so you might be able to do that.

But PolyPhen and SIFT apply to the overall structure. To see if a SNP affects a particular protein domain, I think you might have to do what Lars suggested and write a script. A slight variation on his excellent idea is to use the Ensembl Perl API: create a sequence slice for each SNP, get the transcripts that overlap this slice, then get the domains for each transcript and see if a domain overlaps your SNP. I don't know the API well enough to say whether you can get the protein domains overlapping the SNP directly. But anyhow, you get the point, I'm sure.
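
As a rough illustration of the "get the domains, then check the overlap" part of this idea, here is a Python sketch that swaps the Ensembl Perl API for the Ensembl REST API, so it is not the approach described above word for word. It assumes the `overlap/translation/:id` endpoint with `type=protein_feature` returns a JSON list whose entries carry `start` and `end` in protein coordinates; check the REST documentation before relying on those field names. The helper names `fetch_protein_features` and `features_at` are hypothetical, and the sketch starts from a protein (translation) ID and residue number rather than from a genomic slice.

```python
import json
import urllib.request

ENSEMBL_REST = "https://rest.ensembl.org"


def fetch_protein_features(translation_id: str) -> list:
    """Fetch protein features (domains, motifs) for one Ensembl translation ID.

    Assumption: the endpoint returns a JSON list of feature dictionaries.
    This function needs network access and is not called at import time.
    """
    url = f"{ENSEMBL_REST}/overlap/translation/{translation_id}?type=protein_feature"
    request = urllib.request.Request(url, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def features_at(features: list, position: int) -> list:
    """Keep the features whose start..end range contains the residue position.

    Assumption: each feature has 'start' and 'end' keys in protein coordinates.
    """
    return [f for f in features if int(f["start"]) <= position <= int(f["end"])]
```

With a real translation ID in hand, `features_at(fetch_protein_features(translation_id), 175)` would list the features covering residue 175. For a long variant list, fetch each protein's features once and reuse them, since the public server rate-limits requests.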

ADD COMMENTlink written 9.5 years ago by Andrea_Bio2.6k