Question

dbSNP: Mappings to Protein Sequence?

4

13.5 years ago

Chris ★ 1.6k

Hey,

We are trying to get a local subset of dbSNP running on our group's servers. Since we are only interested in nsSNPs, we specifically need the mappings of rs# to protein sequence, i.e. the concrete RefSeq identifier, the sequence position and the mutant residue. According to the dbSNP handbook from NCBI, the organism-specific SNPContigLocusId tables are the ones of major interest, and indeed they have everything we need. However, those tables exist for only 14 of the 100 organisms overall. Does that mean that no such mappings to protein sequences exist for the vast majority of organisms? If so, why? Or could this information be stored somewhere else in the huge space of dbSNP tables?

Thanks for sharing any insights, Chris

dbsnp mapping protein snp • 3.5k views

ADD COMMENT • link updated 8.1 years ago by khulood445592 • 0 • written 13.5 years ago by Chris ★ 1.6k

1

Are you interested in SNPs from all organisms, or only in a subset? Such mappings are available in various nsSNP annotation databases for human; I'm not sure about other organisms.

ADD REPLY • link 13.5 years ago by Khader Shameer 18k

0

I'm interested in nsSNPs from all organisms that show up in dbSNP. Human is among the 14 organisms that have the mappings. Thanks, Chris

ADD REPLY • link 13.5 years ago by Chris ★ 1.6k

0

Hi Chris,

How is your mapping from nsSNPs to protein sequences going? I am working on a similar project right now. Did you find out why the mapping from nsSNPs to protein sequences is so limited?

ADD REPLY • link 9.6 years ago by ajingnk ▴ 130

score 1 · Answer 1 · 2010-11-05

1

13.5 years ago

Jan Kosinski ★ 1.6k

In my group, a server has just been developed that does more or less what you want (if I understood your question correctly ;-).

http://www.biocomputing.it/picmi/

You can try the Nucleotide input option; see Help for a description of the input.

However, in the output you get the sequence position and the mutant residue relative to an Ensembl transcript, not to RefSeq. Ensembl transcripts do have links to RefSeq, but I don't know how to retrieve them automatically for high-throughput input.

Give it a try, and contact the authors if you need more.

ADD COMMENT • link 13.5 years ago by Jan Kosinski ★ 1.6k

0

Thanks Jan, I'll give it a try. However, I'd really like to know why dbSNP only has these mappings for 14 organisms. There must be a reason for that. Chris

ADD REPLY • link 13.5 years ago by Chris ★ 1.6k

score 0 · Answer 2 · 2012-01-17

0

12.3 years ago

User 6318 • 0

Hi, Chris! In my group, we are currently trying to build a human protein variant database generated from nsSNPs. We need to store the amino acid sequences of both the protein variant and the original protein. But in the SNPContigLocusId tables I can only find protein_acc, the residue for the SNP allele and the position, not the protein sequence. Where can I find and download all human protein variant sequences mapped from nsSNPs?

ADD COMMENT • link 12.3 years ago by User 6318 • 0

0

Hi, the fields protein_acc and protein_ver are pointers to RefSeq. To get the corresponding sequences, go to the RefSeq FTP server and download the FASTA file [1] that contains all human sequences. This file normally does not contain every sequence that is referenced in dbSNP. In those cases you have to fetch the missing sequences from NCBI one by one, e.g. by using Entrez.

[1] ftp://ftp.ncbi.nih.gov/refseq/H_sapiens/H_sapiens/protein/protein.fa.gz
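
Once the sequences are local, building a variant sequence from protein_acc, protein_ver, position and residue could look roughly like the sketch below. It is only an illustration under several assumptions: the FASTA headers are assumed to start with the versioned accession (protein_acc.protein_ver), the accession NP_000000.1 and the toy sequence are made up, and the helper names read_fasta and apply_substitution are hypothetical. Whether the position in your table is 0-based or 1-based should be checked against the dbSNP documentation; the one_based flag is there for that reason.

```python
import io


def read_fasta(handle):
    """Parse FASTA text into a dict {first header token: sequence}."""
    sequences = {}
    name = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                sequences[name] = "".join(chunks)
            # Assumption: the first token of the header is "accession.version".
            name = line[1:].split()[0]
            chunks = []
        else:
            chunks.append(line)
    if name is not None:
        sequences[name] = "".join(chunks)
    return sequences


def apply_substitution(sequence, position, residue, one_based=True):
    """Return a copy of sequence with the residue at position replaced."""
    index = position - 1 if one_based else position
    if not 0 <= index < len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    return sequence[:index] + residue + sequence[index + 1:]


# Toy stand-in for the downloaded file; accession and sequence are invented.
toy_fasta = io.StringIO(">NP_000000.1 hypothetical protein\nMKTAYIAK\nQRQISFVK\n")
proteins = read_fasta(toy_fasta)

# Hypothetical table row: protein_acc=NP_000000, protein_ver=1, position 3, residue A.
variant = apply_substitution(proteins["NP_000000.1"], 3, "A")
print(variant)
```

For the real file, the handle would be something like gzip.open("protein.fa.gz", "rt") from the standard library. If the headers in your download use a different layout, the line that extracts the name needs adjusting.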

ADD REPLY • link 12.3 years ago by Chris ★ 1.6k

score 0 · Answer 3 · 2016-03-26

0

8.1 years ago

khulood445592 • 0

Hi, I have a bioinformatics question. I have a gene, IL8, with the mutation TGC>TGG. How can I find it if the mutation is in codon 36?
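
The codon arithmetic behind this question can be sketched in a few lines of Python. This is only a sketch under stated assumptions: codons are numbered from 1 starting at the first codon of the coding sequence, the standard genetic code applies, and the helper names codon_span and describe_change are hypothetical. It does not check what the actual IL8 sequence contains at codon 36.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codon_span(codon_number):
    """1-based coding-sequence coordinates (first, last) of a 1-based codon."""
    return 3 * (codon_number - 1) + 1, 3 * codon_number


def describe_change(ref_codon, alt_codon, codon_number):
    """Return the protein-level change and the changed coding-sequence positions."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    first, _ = codon_span(codon_number)
    changed = [
        first + offset
        for offset, (ref, alt) in enumerate(zip(ref_codon.upper(), alt_codon.upper()))
        if ref != alt
    ]
    return f"{ref_aa}{codon_number}{alt_aa}", changed


protein_change, cds_positions = describe_change("TGC", "TGG", 36)
print(codon_span(36), protein_change, cds_positions)
```

Under these assumptions, codon 36 covers coding-sequence positions 106 to 108, and TGC>TGG changes the third base of that codon, turning a cysteine codon into a tryptophan codon. A change described this way can then be compared with the protein-level annotation of the SNPs listed for the gene.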

ADD COMMENT • link 8.1 years ago by khulood445592 • 0