Question: Does UniProt have complete SNP information (from dbSNP)?
4.5 years ago by
United States
ajingnk120 wrote:

The new UniProt website is fabulous, with a comprehensive collection of links to other resources. UniProt itself could be a very good resource for mining biological information.

I have a few questions regarding the SNP information in UniProt.

1. SNP information can appear in two different sections: the Pathology and Biotech section and the Sequence section. Does it come from the disease SNP annotation in dbSNP?

2. I want to tell disease SNPs apart from other natural variants. In a UniProt XML file (like: ), are all variant "feature" entries that have an "evidence" attribute disease related? For example, the following line:

<feature type="sequence variant" description="In GLC1E." id="VAR_021537" evidence="16 17 18">

3. How does UniProt map between dbSNP and UniProt? Just by sequence alignment of the mRNA? Is the mapping complete and correct, or is there some heuristic behind it?
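
To make question 2 concrete, here is a minimal sketch of how the variant "feature" entries could be listed from an entry's XML with Python's standard library. The namespace URI is an assumption (check the root element of your own file), and apart from the one feature line quoted above, the sample entries and their ids are hypothetical placeholders. It only shows which variants carry an "evidence" attribute; it does not decide whether that attribute implies a disease association.

```python
import xml.etree.ElementTree as ET

# Assumed namespace of UniProt entry XML; verify against the root element of a real file.
NS = {"up": "http://uniprot.org/uniprot"}

# Hand-written stand-in for an entry. Only the first feature is taken from the
# question; the other two are hypothetical placeholders for illustration.
SAMPLE = """<uniprot xmlns="http://uniprot.org/uniprot">
  <entry>
    <feature type="sequence variant" description="In GLC1E." id="VAR_021537" evidence="16 17 18"/>
    <feature type="sequence variant" description="(hypothetical variant)" id="VAR_HYPOTHETICAL"/>
    <feature type="chain" description="(hypothetical chain)" id="PRO_HYPOTHETICAL"/>
  </entry>
</uniprot>"""


def sequence_variants(xml_text):
    """Return id, description and evidence keys of every 'sequence variant' feature."""
    root = ET.fromstring(xml_text)
    rows = []
    for feat in root.iterfind(".//up:feature[@type='sequence variant']", NS):
        rows.append({
            "id": feat.get("id"),
            "description": feat.get("description", ""),
            # The evidence attribute is a space-separated list of keys; it may be absent.
            "evidence": feat.get("evidence", "").split(),
        })
    return rows


for row in sequence_variants(SAMPLE):
    print(row["id"], row["description"], row["evidence"])
```

Listing the description next to the evidence keys makes it easy to inspect by eye whether the two actually go together in a given entry.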




Thanks for liking the redesign!

4.5 years ago by
me690 wrote:

1. The disease kind is explained in , and you can read a bit more on the other kind here.

2. This is a bit easier with RDF and SPARQL than with the XML format. See this example to get all natural variants that have an xref and are involved in a disease entry.

3. Hope one of my other colleagues comes round to answer that ;)
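
For point 2, a rough sketch of what such a query might look like when sent from Python. This is not the linked example itself: the endpoint URL and the class and property names (`up:Natural_Variant_Annotation`, `up:annotation`, `rdfs:seeAlso`, `skos:related`) are assumptions based on the UniProt core vocabulary and should be checked against the current RDF schema before relying on the results.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumed public SPARQL endpoint for UniProt.
ENDPOINT = "https://sparql.uniprot.org/sparql"

# Natural variant annotations that have a cross-reference and are related to a disease.
# Class and property names are assumptions; verify them against the UniProt RDF schema.
QUERY = """\
PREFIX up: <http://purl.uniprot.org/core/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

SELECT ?protein ?variant ?xref ?disease
WHERE {
  ?protein a up:Protein ;
           up:annotation ?variant .
  ?variant a up:Natural_Variant_Annotation ;
           rdfs:seeAlso ?xref ;
           skos:related ?disease .
}
LIMIT 20
"""


def build_request(query, endpoint=ENDPOINT):
    """Build a GET request that follows the standard SPARQL protocol."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


def run(query):
    """Send the query and return the result bindings (needs network access)."""
    with urlopen(build_request(query)) as resp:
        return json.load(resp)["results"]["bindings"]


if __name__ == "__main__":
    for binding in run(QUERY):
        print(binding["variant"]["value"], binding["disease"]["value"])
```

The `LIMIT 20` is only there to keep a first test run small; drop it once the query returns what you expect.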

Thanks! The SPARQL query on UniProt seems to be very powerful. Why do you need to define PREFIX rdfs and skos here? Is there any tutorial for SPARQL for UniProt?

Those two PREFIXes are just used a lot in different parts of the UniProt RDF model. This is a general intro to SPARQL using the UniProt/NCBI taxonomy data, but no UniProt-specific tutorial currently exists :(
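
To illustrate what a PREFIX declaration does: it is only shorthand, so a name like `rdfs:comment` expands to a full IRI before the query is evaluated. A small sketch of that expansion follows. The `rdfs` and `skos` namespace IRIs are the standard W3C ones; the `up` entry is assumed to be the UniProt core namespace, and `expand` is a hypothetical helper, not part of any library.

```python
# Prefix table equivalent to the PREFIX lines at the top of a SPARQL query.
PREFIXES = {
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "up": "http://purl.uniprot.org/core/",  # assumed UniProt core namespace
}


def expand(name, prefixes=PREFIXES):
    """Expand a prefixed name such as 'rdfs:comment' into its full IRI."""
    prefix, sep, local = name.partition(":")
    if not sep or prefix not in prefixes:
        raise KeyError(f"no PREFIX declared for {name!r}")
    return prefixes[prefix] + local


print(expand("rdfs:comment"))    # http://www.w3.org/2000/01/rdf-schema#comment
print(expand("skos:prefLabel"))  # http://www.w3.org/2004/02/skos/core#prefLabel
```

Without the PREFIX lines the query would have to spell out each full IRI in angle brackets, which is why they are declared even for a short query.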

