Batch Search Effect Of A SNP On Protein (Conserved) Domain
10.5 years ago
Stephan ▴ 150

I have a list of nonsynonymous SNVs.

I would like to batch search them all to see whether any of the variants falls in a conserved domain, motif, or active site. None of the databases I have found accepts a position or mutation position as input.

For each variant I have the chromosomal position, gene name, and amino acid change in the gene, like this:

chr5:112227650,112227650:T/A

ZRSR1:ENST00000391338:exon1:c.T314A:p.L105Q,

Is there a way to batch search a list of mutations against a protein domain database?

protein domain annotation • 3.6k views
10.5 years ago

I am not aware of any resource that can do that analysis for you.

What I would do is:

  1. Use the Ensembl BioMart to fetch the amino acid sequence for the protein product of each of your Ensembl transcripts.
  2. Map protein domains onto these by submitting the sequences to InterPro, Pfam, and/or SMART.
  3. Write a small script to check if the amino acid changes caused by the SNVs fall inside predicted protein domains.
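Step 3 could be sketched roughly as follows in Python. This is only an illustration under stated assumptions: the annotation format is taken from the example in the question, the function names (parse_annotation, domains_hit) are made up here, and the domain coordinates are hypothetical placeholders, not real InterPro, Pfam, or SMART results. In practice the domain table would be filled from the output of step 2.

```python
import re

# Protein-level change at the end of an annotation such as
# "ZRSR1:ENST00000391338:exon1:c.T314A:p.L105Q"
PROTEIN_CHANGE = re.compile(r"p\.([A-Z*])(\d+)([A-Z*])")


def parse_annotation(annotation):
    """Return (gene, transcript, ref_aa, position, alt_aa) for one annotation string."""
    fields = annotation.strip().rstrip(",").split(":")
    gene, transcript = fields[0], fields[1]
    match = PROTEIN_CHANGE.fullmatch(fields[-1])
    if match is None:
        raise ValueError(f"no protein change found in {annotation!r}")
    ref_aa, position, alt_aa = match.groups()
    return gene, transcript, ref_aa, int(position), alt_aa


def domains_hit(annotation, domains_by_transcript):
    """List domains whose 1-based, inclusive residue range contains the changed residue."""
    _, transcript, _, position, _ = parse_annotation(annotation)
    return [
        name
        for name, start, end in domains_by_transcript.get(transcript, [])
        if start <= position <= end
    ]


# Hypothetical domain coordinates, for illustration only (not real predictions)
example_domains = {
    "ENST00000391338": [
        ("example_domain_A", 90, 160),
        ("example_domain_B", 200, 260),
    ]
}

hits = domains_hit("ZRSR1:ENST00000391338:exon1:c.T314A:p.L105Q,", example_domains)
print(hits)
```

Looping domains_hit over every line of the variant list gives the batch search; variants on transcripts with no entry in the domain table simply return an empty list.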

I don't like programming :P, but thanks for the idea.

10.5 years ago
Bioinfosm ▴ 620

Here are some tools that I have used for batch querying:

  • PolyPhen
  • PMut
  • SIFT

Relevant discussion thread: Algorithms Predicting Effects Of Snps / Aa Substitution On Protein


Please add links for those tools.

10.5 years ago

Several tools are available, but I do not know which one is currently the state of the art.

For a review, see Ng and Henikoff 2006 (http://www.ncbi.nlm.nih.gov/pubmed/16824020); for an example of applying different tools, see Burke DF et al. 2007. When we wrote the computational paragraph for the open collaborative paper on interpreting post-GWAS results, we collected some tools for the kind of analysis you are asking about: have a look at it.


Thanks, nice article, lots of tools to play with.

10.5 years ago
Andrea_Bio ★ 2.7k

You can predict whether the SNP is pathogenic or benign to protein structure using PolyPhen. For large batch queries you will need a local installation, which is huge, and preparing the databases takes a long time. I've never used SIFT, but I have heard you can get precomputed SIFT scores, so you might be able to use those.

But PolyPhen and SIFT apply to the overall structure. To see if a SNP affects a particular protein domain, I think you might have to do what Lars suggested and write a script. A slight variation on his excellent idea is to use the Ensembl Perl API: create a sequence slice for each SNP, get the transcripts that overlap this slice, then get the domains for each transcript and see if any domain overlaps your SNP. I don't know the API well enough to know if you can get the protein domains overlapping the SNP directly. But anyhow, you get the point, I'm sure.
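The slice, transcripts, domains logic can be illustrated without the Ensembl API itself. The Python sketch below is not the Perl API: it works on in-memory records, the class and function names are invented for illustration, and all transcript and domain coordinates are hypothetical. It also assumes each domain has already been projected to a single genomic interval, whereas a real domain spanning introns would map to several blocks.

```python
from dataclasses import dataclass, field


@dataclass
class Domain:
    name: str
    start: int  # genomic start, 1-based inclusive
    end: int    # genomic end, 1-based inclusive


@dataclass
class Transcript:
    stable_id: str
    chrom: str
    start: int
    end: int
    domains: list = field(default_factory=list)


def parse_location(text):
    """Split a string like 'chr5:112227650,112227650:T/A' into (chrom, start, end)."""
    chrom, span, _alleles = text.split(":")
    start, end = (int(value) for value in span.split(","))
    return chrom, start, end


def overlaps(start_a, end_a, start_b, end_b):
    """True if two inclusive intervals share at least one position."""
    return start_a <= end_b and start_b <= end_a


def domains_at_snp(location, transcripts):
    """Return (transcript ID, domain name) pairs whose domain interval covers the SNP."""
    chrom, start, end = parse_location(location)
    found = []
    for transcript in transcripts:
        if transcript.chrom != chrom:
            continue
        if not overlaps(start, end, transcript.start, transcript.end):
            continue
        for domain in transcript.domains:
            if overlaps(start, end, domain.start, domain.end):
                found.append((transcript.stable_id, domain.name))
    return found


# Hypothetical coordinates, for illustration only
transcripts = [
    Transcript("ENST00000391338", "chr5", 112227000, 112229000,
               [Domain("example_domain", 112227600, 112227800)]),
]

result = domains_at_snp("chr5:112227650,112227650:T/A", transcripts)
print(result)
```

Compared with checking residue numbers, this variant starts from the genomic position alone, so it does not depend on the amino acid change having been annotated already.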
