Question

Does UniProt have complete SNP information (from dbSNP)?

10.8 years ago

ajingnk ▴ 130

The new UniProt website is fabulous, with a comprehensive collection of links to different resources. UniProt itself could be a very good resource for mining biological information.

I have a few questions regarding the SNP information in UniProt.

SNP information can appear in two different sections: the "Pathology and Biotech" section and the "Sequence" section. Does it come from the disease SNP annotation in dbSNP?
I want to pick out the disease SNPs among the natural variants. In the UniProt XML file (for example http://www.uniprot.org/uniprot/Q96CV9.xml), are all "feature" entries of type "sequence variant" that carry an "evidence" attribute disease related? For example: <feature type="sequence variant" description="In GLC1E." id="VAR_021537" evidence="16 17 18">
How does UniProt map between dbSNP and UniProt? Just by sequence alignment of the mRNA? Is the mapping complete and correct, or is there some heuristic behind it?

Thanks

snp • 3.4k views

ADD COMMENT • link updated 3.5 years ago by Ram 45k • written 10.8 years ago by ajingnk ▴ 130

Thanks for liking the redesign!

ADD REPLY • link 10.8 years ago by me ▴ 760

Ram · Accepted Answer · 2014-10-03

10.8 years ago

me ▴ 760

The disease kind is explained in http://onlinelibrary.wiley.com/doi/10.1002/humu.22594/full; you can read a bit more on the other kind here.
This is a bit easier using RDF and SPARQL (http://www.uniprot.org/uniprot/Q96CV9.rdf) than with the XML format. See this example to get all natural variants with an xref that are involved in a disease entry.
As for the mapping question, I hope one of my other colleagues comes round to answer that ;)
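
For anyone who prefers to stay with the XML, here is a minimal Python sketch of pulling the "sequence variant" features out of an entry and looking at their description and evidence attributes. It runs on a small inline fragment rather than the real file: the first variant line is the one quoted in the question, while the second variant (VAR_000000) and the helix feature are made up purely for illustration. The "description starts with In" check is only a text heuristic assumed here, not an official UniProt rule for deciding whether a variant is disease related.

```python
import xml.etree.ElementTree as ET

# Namespace used by UniProt XML entries.
NS = {"up": "http://uniprot.org/uniprot"}

# Tiny illustrative fragment. Only the first <feature> comes from the question;
# VAR_000000 and the helix feature are invented placeholders.
SAMPLE_XML = """<uniprot xmlns="http://uniprot.org/uniprot">
  <entry>
    <accession>Q96CV9</accession>
    <feature type="sequence variant" description="In GLC1E." id="VAR_021537" evidence="16 17 18"/>
    <feature type="sequence variant" id="VAR_000000"/>
    <feature type="helix"/>
  </entry>
</uniprot>"""


def sequence_variants(xml_text):
    """Return one dict per 'sequence variant' feature found in the XML text."""
    root = ET.fromstring(xml_text)
    variants = []
    for feat in root.iterfind(".//up:feature[@type='sequence variant']", NS):
        description = feat.get("description", "")
        variants.append({
            "id": feat.get("id"),
            "description": description,
            # The evidence attribute is a space-separated list of keys.
            "evidence": feat.get("evidence", "").split(),
            # Assumed heuristic: an "In ..." note hints at a disease or phenotype.
            "has_in_note": description.startswith("In "),
        })
    return variants


for variant in sequence_variants(SAMPLE_XML):
    print(variant["id"], variant["evidence"], variant["has_in_note"])
```

To try it on a real entry, the same function can be fed the text downloaded from the XML URL in the question. Whether an evidence attribute alone implies disease involvement is exactly what the question asks, so the sketch reports both fields and leaves that judgement open.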

ADD COMMENT • link updated 3.5 years ago by Ram 45k • written 10.8 years ago by me ▴ 760

Thanks! The SPARQL query on UniProt seems to be very powerful. Why do you need to define PREFIX rdfs and skos here? Is there any tutorial for SPARQL for UniProt?

ADD REPLY • link 10.7 years ago by ajingnk ▴ 130

Those two PREFIXes are just used a lot in different parts of the UniProt RDF model. This is a general intro to SPARQL using the UniProt/NCBI taxonomy data, but no tutorial currently exists :(
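
To illustrate why the PREFIX lines are needed at all: a prefix is just shorthand for a long IRI, and a query engine expands each prefixed name before matching. A rough Python sketch of that expansion follows; the rdfs and skos namespace IRIs are the standard W3C ones, while the function name expand is hypothetical and only for illustration.

```python
# Standard W3C namespace IRIs for the two prefixes discussed above.
PREFIXES = {
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
}


def expand(prefixed_name, prefixes=PREFIXES):
    """Expand a prefixed name such as 'rdfs:comment' into its full IRI."""
    prefix, _, local_name = prefixed_name.partition(":")
    if prefix not in prefixes:
        # Mirrors what happens when a query uses a prefix without declaring it.
        raise KeyError(f"undeclared prefix: {prefix}")
    return prefixes[prefix] + local_name


print(expand("rdfs:comment"))
print(expand("skos:related"))
```

Without the PREFIX declarations, a query would have to spell out each of these full IRIs wherever the term is used.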

ADD REPLY • link 10.6 years ago by me ▴ 760