Human Variation Databases
8
10
10.7 years ago
Interact ▴ 100

Hello

I was hoping to find someone with some experience of human variation databases.

How long will it take for the SNPs discovered in the 1000 Genomes Project to make their way into public databases like dbSNP?

Which human variation database has the best coverage of SNPs? Will all of the SNPs in dbSNP be covered in HGMD and vice versa?

Which database is the 'best' if you are interested in investigating SNPs in a clinical context?

Are there any pros/cons of the different databases (dbSNP/HGVbase/HGMD)?

Thank you

snp sv database genome • 4.7k views
8
10.7 years ago

dbSNP build 132 includes data from the 1000 Genomes Project pilot 1, 2, and 3 studies (http://www.ncbi.nlm.nih.gov/mailman/pipermail/dbsnp-announce/2010q4/000097.html).

For the difference between dbSNP and HGMD, see this previous question: "What Is The Difference Between A Snp And An Entry In A Mutation Database?"

8
10.7 years ago

I would like to answer two specific aspects of your question:

Which database is the 'best' if you are interested in investigating SNPs in a clinical context?

If you are looking for SNPs with clinical relevance, you could check dbGaP and PharmGKB. dbGaP provides results from genome-wide association studies, whereas PharmGKB provides candidate-gene and genome-wide studies relevant to pharmacogenomic variants.

dbGaP is a difficult resource to explore because access to phenotypes, traits, and related data is restricted. You can get GWAS results (significant SNPs, P-values, odds ratios) via HuGE Navigator, but not the clinical or raw data.

Which human variation database has the best coverage of SNPs? Will all of the SNPs in dbSNP be covered in HGMD and vice versa?

I am not sure about the coverage, but you can check Ensembl Variation; the recent paper describing the Ensembl Variation resources explains its features in more detail. If you are interested in annotating SNPs, you may also try other variation-based annotation databases such as Varietas or SCANDB.
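One rough way to put a number on the "vice versa" part is to export the variant identifiers from each resource and compare them as sets. The sketch below is only an illustration: the function name `coverage_report` and the rsIDs are made up, and it assumes both resources can be reduced to comparable identifiers, which may not hold for every database.

```python
def coverage_report(ids_a, ids_b):
    """Compare two collections of variant identifiers (e.g. rsIDs)."""
    a, b = set(ids_a), set(ids_b)
    shared = a & b
    return {
        "shared": len(shared),
        "only_a": len(a - b),
        "only_b": len(b - a),
        # Fraction of each set that is also present in the other;
        # reported as 0.0 when a set is empty.
        "a_in_b": len(shared) / len(a) if a else 0.0,
        "b_in_a": len(shared) / len(b) if b else 0.0,
    }


# Made-up identifiers, for illustration only.
db_a = ["rs1", "rs2", "rs3", "rs4"]
db_b = ["rs3", "rs4", "rs5"]

report = coverage_report(db_a, db_b)
print(report)
```

The two fractions are deliberately reported separately, since one database can contain most of another without the reverse being true.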

0

Khader, I am adding some of the resources from this answer to the article at WikiGenes. I will cite this answer from Biostar.

0

@Giovanni: Are you directly editing the article or adding it to the discussion page?

7
10.7 years ago

Question: Which human variation database has the best coverage of SNPs?

Tough to answer, because it depends on what you mean by coverage. If you're looking across the whole human genome, then mine dbSNP. If you're interested in something more specific, say variants in CYP (P450) genes or mitochondrial genome differences, then you're best served by specialised databases.

The question about SNPs in a clinical setting is really hard, in my mind, because this area is evolving quite rapidly. Do you want the 200,000 or so SNPs that 23andMe, for example, adds to its chip because there is evidence for an association of some type? For those SNPs, is premature gray hair or detecting asparagus byproducts in urine really relevant? Do you want those SNPs that are in OMIM because they have been found in medical cases? Do you want those that are routinely tested for in terms of metabolic health of newborns or pre-pregnancy counseling? Or do you want to think about the loads of new variants found from sequencing cancer genomes, especially the SNPs that can "tag" a copy number variant? This is not a pool of SNPs but an amorphous cloud: the boundaries are not well defined.

I know this is not an answer, per se. I don't have one because I don't need such SNPs for my research. These are just some thoughts I'd pose to my colleagues before we tried to capture such information. Good luck!

4
10.7 years ago
Biomed 4.8k

The SNVs discovered in the 1000 Genomes Project are routinely added to the dbSNP database. As Pierre mentioned, dbSNP132 is the most recent version. Most of its SNPs are annotated with 1000 Genomes data, and many SNPs, mostly of lower frequency, were added to the database from 1000 Genomes data. I highly recommend that everyone use this dataset.

dbSNP has the best coverage in general, but if you are interested in a specific disease, there are a lot of locus-specific databases out there that have more variation data on a specific region or disease.

Clinical context is a hard one, but I recommend using dbSNP as a first-pass filter and then digging into locus-specific databases. HGMD is very good (it has disease annotations and literature links) but is not error-free, so be careful with it as well.
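As a rough illustration of a first-pass filter, the sketch below splits variant records into those that already carry a dbSNP rsID and those that do not, so the second group can be followed up in locus-specific databases. It assumes the input is tab-separated VCF text whose ID column (the third column, with "." meaning no identifier) has already been filled in from dbSNP. The function name `split_by_dbsnp_id` and the example records are hypothetical.

```python
def split_by_dbsnp_id(vcf_lines):
    """Split VCF data lines into (known, novel) based on the ID column."""
    known, novel = [], []
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header/meta lines
        fields = line.rstrip("\n").split("\t")
        # VCF column 3 (index 2) is ID; several IDs are separated by ";".
        ids = fields[2].split(";")
        if any(i.startswith("rs") for i in ids):
            known.append(fields)
        else:
            novel.append(fields)
    return known, novel


# Made-up records, for illustration only.
example = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t1000\trs0000001\tA\tG\t50\tPASS\t.",
    "1\t2000\t.\tC\tT\t50\tPASS\t.",
]

known, novel = split_by_dbsnp_id(example)
print(len(known), "with an rsID;", len(novel), "without")
```

Having an rsID only says the variant has been seen before, not that it is benign or pathogenic, so both groups still need the manual follow-up described above.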

I strongly discourage the use of dbSNP130 and below for any clinical correlation unless you are looking for very common SNPs or SNPs with poor validation and frequency metadata (one-off submissions from a single sequence, etc.).

0

Any guess about the false-positive rate in HGMD?

0

I don't know of any published numbers, but there are certainly cases (although not too many) where a gene/variant is mentioned in a paper and so ends up in HGMD, but when you read the paper you see that it is a negative finding. So in my opinion HGMD is a useful tool for a first-pass survey, but its output needs a "human eye" before you draw solid conclusions.

1
10.7 years ago

EuroGentest may be a valuable and informative place to look. EuroGentest is a European initiative that is dealing with all aspects of genetic testing - Quality Management, Information Databases, Public Health, New Technologies and Education. Here is a direct link to the Information Databases page.

0
9.5 years ago

The post 'How to search disease-causing chromosomal structure variation?' has some useful resources listed, including an updated review article along the lines suggested by 'Giovanni':

Sneddon TP, Church DM. Online resources for genomic structural variation. Methods Mol Biol. 2012;838:273-89. PubMed PMID: 22228017.

0
8.1 years ago

I don't think you need to pick a single database.

For example, ANNOVAR is a popular tool for variant annotation. The basic report includes 1000 Genomes, ESP, and dbSNP annotations (as well as functional predictions for coding variants). It also allows you to search a number of additional databases (I usually use the basic report as well as annotations from the GWAS catalog).

There is also a web-based version of ANNOVAR (wANNOVAR), but I think the local installation may provide a greater range of functionality.

SeattleSNP is another popular tool (although I personally like ANNOVAR a little better).

If you are studying your own personal genome, you can also check out Promethease (although I wouldn't consider that a standard practice for scientific publications). It uses SNPedia for annotations.
