Relationship Between Ensembl And dbSNP Builds, And dbSNP Builds In General
13.9 years ago
Andrea_Bio ★ 2.8k

Venerable bioinformaticians,

Please can you help me.

How do I find out which version of dbSNP is used by a particular version of Ensembl?

For example, I am looking at an older version of Ensembl (58) and I want to know which dbSNP build it imported variations from. All of the SNPs I can see in e!58 say they are from dbSNP 130, but I would like to know where I can find the mapping/association definitively.

I have also been unable to find a list of the release dates of the different dbSNP builds. I found this page but nothing else: http://www.ncbi.nlm.nih.gov/books/NBK44478/

Also, on this page http://www.ncbi.nlm.nih.gov/SNP/snp_summary.cgi, what does it mean that Bos taurus is using current build 131? It also says it has 2,446,173 new SNPs when it has only 2,210,641 in total. I presume some of those submissions are additional data for known SNPs, then? I don't understand how the cow data can have new submissions yet still be on an older build number. Wouldn't it be rebuilt (whatever that means) to include the new submissions? I would really appreciate an explanation of how the organism build number relates to the database build number, and then how the number of new submissions relates to that.

Given the dbSNP version information from Ensembl, I would then like to query the relevant build of dbSNP using eutils, but I don't know exactly what the limits on this page mean:

http://www.ncbi.nlm.nih.gov/snp

The obvious candidates are "create build id" and "update build id", but I don't know the difference between the two. From the names, I presume one is the build in which the SNP was created and the other is the build in which it was updated :) But I tried this query and got no results: "bos taurus"[Organism] AND ("132"[CBID] OR "131"[CBID])
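For anyone who wants to run the same search through eutils instead of the web form, here is a minimal sketch of how the request could be put together. It only builds the esearch URL and does not contact NCBI. The helper name `build_snp_search_url` and its parameters are made up for illustration, and the `[Organism]` and `[CBID]` field tags are simply the ones from the query above, not checked against the current dbSNP index.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Public NCBI E-utilities search endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_snp_search_url(organism, create_builds, retmax=20):
    """Return an esearch URL for SNPs of `organism` created in any of `create_builds`.

    Hypothetical helper: the name and arguments are illustrative only.
    """
    # [CBID] is the "create build id" field tag taken from the query above.
    build_clause = " OR ".join(f'"{build}"[CBID]' for build in create_builds)
    term = f'"{organism}"[Organism] AND ({build_clause})'
    params = {"db": "snp", "term": term, "retmax": retmax}
    return f"{ESEARCH_URL}?{urlencode(params)}"


url = build_snp_search_url("bos taurus", [132, 131])
print(url)

# Decode the URL again to confirm the search term survives URL-encoding unchanged.
decoded = parse_qs(urlsplit(url).query)
print(decoded["term"][0])
```

Swapping the list of builds or the organism name changes only the `term` parameter, so the same helper can be reused to compare what each build returns.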

My main concern is that I don't understand the different build numbers for different organisms.

Thanks a lot for your help.

dbsnp • 4.5k views
13.9 years ago

You can get the mapping between SNPs and dbSNP versions in Ensembl from the Ensembl MySQL database:

mysql> use bos_taurus_variation_58_4g;
mysql> select V.name,S.name,S.version from variation as V, source as S where V.source_id=S.source_id  limit 10;
+-----------+-------+---------+
| name      | name  | version |
+-----------+-------+---------+
| rs8193041 | dbSNP |     130 |
| rs8193042 | dbSNP |     130 |
| rs8193043 | dbSNP |     130 |
| rs8193044 | dbSNP |     130 |
| rs8193045 | dbSNP |     130 |
| rs8193046 | dbSNP |     130 |
| rs8193047 | dbSNP |     130 |
| rs8193048 | dbSNP |     130 |
| rs8193049 | dbSNP |     130 |
| rs8193050 | dbSNP |     130 |
+-----------+-------+---------+
10 rows in set (0.16 sec)

mysql> use canis_familiaris_variation_48_2f;
mysql> select V.name,S.name,S.version from variation as V, source as S where V.source_id=S.source_id  limit 10;
+-----------+-------+---------+
| name      | name  | version |
+-----------+-------+---------+
| rs8281491 | dbSNP |     126 |
| rs8281492 | dbSNP |     126 |
| rs8281493 | dbSNP |     126 |
| rs8281494 | dbSNP |     126 |
| rs8281495 | dbSNP |     126 |
| rs8281496 | dbSNP |     126 |
| rs8281497 | dbSNP |     126 |
| rs8281498 | dbSNP |     126 |
| rs8281499 | dbSNP |     126 |
| rs8281500 | dbSNP |     126 |
+-----------+-------+---------+
10 rows in set (0.34 sec)
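
The queries above return one row per SNP, so the same version is repeated on every line. As a rough sketch, the same join can be grouped so that each source/version pair appears once, with a count. This runs against a toy in-memory SQLite database rather than the real Ensembl MySQL server: the table layout is an assumption limited to the columns used in the queries above, and the three rows are copied from the Bos taurus output.

```python
import sqlite3

# Toy in-memory stand-in for an Ensembl variation database (assumption: only
# the columns used in the queries above are modelled here).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE source (source_id INTEGER PRIMARY KEY, name TEXT, version INTEGER);
    CREATE TABLE variation (variation_id INTEGER PRIMARY KEY, source_id INTEGER, name TEXT);
    INSERT INTO source VALUES (1, 'dbSNP', 130);
    INSERT INTO variation (source_id, name) VALUES
        (1, 'rs8193041'), (1, 'rs8193042'), (1, 'rs8193043');
""")

# One row per source and version, with the number of variations it contributed.
rows = conn.execute("""
    SELECT s.name, s.version, COUNT(*) AS n_variations
    FROM variation AS v
    JOIN source AS s ON v.source_id = s.source_id
    GROUP BY s.name, s.version
    ORDER BY s.name, s.version
""").fetchall()

for name, version, n_variations in rows:
    print(name, version, n_variations)
```

If only the version is needed, selecting name and version from the source table alone should be enough, since both columns come from that table in the queries above.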
13.9 years ago

I know it's not straightforward, but this is the only relatively quick way I've found to look up the dbSNP version used by a particular Ensembl version. There is always a "Biomart" link at the top right of the Ensembl web page, so follow it, select the "ensembl variation" database, and you will see the dbSNP version in the dataset combo box. Note that it depends on the organism: Ensembl 58 uses dbSNP 130 for Bos taurus and human, but dbSNP 126 for Canis familiaris.

Although Ensembl does try to keep an archive of its different versions, I've been told in the past that dbSNP does not. I don't know whether that policy has changed, but to address all your queries I should say that the only practical way I've found to query past versions of dbSNP is through Ensembl's archive.

I couldn't find an obvious release note mapping either, and went straight to Biomart too for the answer.

13.9 years ago
Andrea_Bio ★ 2.8k

Thanks for your answers, but what do you think about the build numbers?

It says here http://www.ncbi.nlm.nih.gov/SNP/snp_summary.cgi that Bos taurus is using current build 131. It also says it has 2,446,173 new SNPs when it has only 2,210,641 in total for cow. How can this be? I presume some of these 'new' SNPs must be corroborating submissions for known SNPs. Is this correct?
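To put a number on the discrepancy, here is the arithmetic on the two figures quoted above. It is only a sketch of the sizes involved. The reading that several submissions can collapse into one reference SNP is an assumption about what the summary page counts, not something the page excerpt confirms.

```python
# Figures quoted above from the dbSNP summary page for Bos taurus.
new_submissions = 2_446_173
total_snps = 2_210_641

# How many more "new" entries there are than SNPs in total.
excess = new_submissions - total_snps

# Average number of "new" entries per SNP, if every one mapped to a known SNP
# (an assumption, see above).
ratio = new_submissions / total_snps

print(f"excess: {excess:,}")
print(f"new entries per SNP: {ratio:.3f}")
```

So the 'new' figure exceeds the total by 235,532, which is about 1.1 'new' entries per SNP on average.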

Why is cow on build 131 when the latest build is 132? Why hasn't cow been rebuilt when it has millions of new submissions? Or does this mean that the cow database has been rebuilt, and this is the 131st time it has been rebuilt? Perhaps the oldest dbSNP 'division' has been rebuilt 132 times and there was nothing for cow in the first build.


edit

I don't think I am right. If you query dbSNP for all of the SNPs new in build 131 you get an empty set, and the same if you query dbSNP for all of the SNPs updated in build 131. So that seems to rule out my idea that the 'new' SNPs in build 131 are updates to existing SNPs.

So I have no idea what the build numbers mean.
