Question: Identifying SNPs Contributed By 1000Genomes Project
6.5 years ago by
Krisr430
United States
Krisr430 wrote:

Hello,

I was wondering if anyone knows how to identify and retrieve the list of SNPs that the 1000 Genomes Project contributed to the latest dbSNP build (132).

Thanks!

genome snp • 1.7k views
written 6.5 years ago by Krisr430

Possibly answered at this related question: http://biostar.stackexchange.com/questions/3432/1000g-and-dbsnp-build-132-in-ucsc-genome-browser.

written 6.5 years ago by Neilfws47k
6.5 years ago by
Thomas720
Copenhagen, DK
Thomas720 wrote:

Many people on Biostar recommend the ANNOVAR annotation program (http://www.openbioinformatics.org/annovar/).

With it you can also download a list of the SNPs identified by 1000G, even for different builds of dbSNP, with the command: annotate_variation.pl -downdb 1000g2010 humandb/

Best, Thomas

written 6.5 years ago by Thomas720

good point... Thanks. But still, I guess you can exclude the ones without the rs numbers?

written 6.5 years ago by Thomas720

The ANNOVAR tables for 1000genomes contain all of the 1000genomes data, not only the data that was submitted to dbSNP, which I think is what Krisr was asking for.

written 6.5 years ago by Jorge Amigo10k

Now that sounds like a good idea. I would still look for the data in the proper repository (i.e. dbSNP), just for quality/security concerns, but processing ANNOVAR's 1000genomes tables (you would have to decide which releases you are interested in) and filtering out the entries that have no rs code should definitely do the job.
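
A minimal sketch of that rs-code filter, in Python. It assumes the downloaded table is tab-delimited and that one column holds either an rs ID or a placeholder such as "."; the column position (id_col) is a guess that has to be checked against the actual file, and the demo rows are made up.

```python
import re

# An rs code is "rs" followed by digits, e.g. rs123.
RS_ID = re.compile(r"^rs\d+$")

def keep_rs_rows(lines, id_col=5, sep="\t"):
    """Yield the split fields of every row whose id_col field is an rs code.

    id_col is an assumption about the table layout, not a documented fact;
    inspect a few lines of the real file before relying on it.
    """
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        fields = line.split(sep)
        if len(fields) > id_col and RS_ID.match(fields[id_col]):
            yield fields

# Made-up rows for illustration only: chrom, pos, ref, alt, freq, id
demo = [
    "1\t1000\tG\tA\t0.14\trs111\n",
    "1\t2000\tC\tG\t0.02\t.\n",
]
kept = list(keep_rs_rows(demo))
print([row[5] for row in kept])
```

For a real file, pass an open file handle instead of the demo list, so the table is streamed line by line rather than loaded into memory.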

written 6.5 years ago by Jorge Amigo10k

Thanks everyone. I used the tables feature of the UCSC Genome Browser. There you can select all SNPs from dbSNP and keep only those that were submitted by 1000GENOMES, by specifying it as a filter on "submitter".

This gave me ~ 15 million SNPs contributed from 1000GENOMES, specifically

written 6.5 years ago by Krisr430
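
The submitter filter described in the comment above can also be applied offline to a downloaded copy of the dbSNP table. The Python sketch below assumes a tab-delimited dump whose first line is a header (possibly starting with "#") containing columns named "name" and "submitters", with the submitters stored as a comma-separated list; those column names and the demo rows are assumptions to verify against the real download.

```python
import csv
import io

def rows_from_submitter(handle, submitter="1000GENOMES",
                        id_column="name", sub_column="submitters"):
    """Yield (snp_id, submitter_set) for rows that list the given submitter.

    The column names are assumptions about the dump's header line.
    """
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    header = next(reader)
    header[0] = header[0].lstrip("#")  # header line may start with '#'
    id_idx = header.index(id_column)
    sub_idx = header.index(sub_column)
    for row in reader:
        # Split the comma-separated list and drop empty pieces
        # (the list may end with a trailing comma).
        submitters = {s for s in row[sub_idx].split(",") if s}
        if submitter in submitters:
            yield row[id_idx], submitters

# Made-up dump for illustration only; LAB_A and LAB_B are hypothetical handles.
demo = io.StringIO(
    "#name\tsubmitters\n"
    "rs1\t1000GENOMES,\n"
    "rs2\tLAB_A,1000GENOMES,\n"
    "rs3\tLAB_B,\n"
)
hits = list(rows_from_submitter(demo))
print([snp for snp, _ in hits])
```

Because the function is a generator, it reads the dump one row at a time, which helps with tables of this size.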

Sorry to revive an old thread, but what settings do you use in the UCSC Genome Browser to get the 15M SNPs? When I use the entire genome and the 1000G set, I get an overflow error.

written 3.6 years ago by goodcow10
6.5 years ago by
Jorge Amigo10k
Santiago de Compostela, Spain
Jorge Amigo10k wrote:

If you go to the dbSNP summary page, you will see a table describing the current data. It has a column with the numbers of new submissions, and if you follow the link to the new data for Homo sapiens you'll find all the submissions made for dbSNP132, sorted by submitter batch. There are four 1000genomes batches there, and if you follow their links you will be able to download those batch submissions directly:

Unfortunately, these numbers only mean that the SNPs WERE submitted by 1000genomes, not that no one else submitted them too, so if you are looking for SNPs that ONLY the 1000genomes project reported, I guess you'll have to cross-check all these results.
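
That cross-check boils down to a set comparison. A minimal Python sketch, assuming you have already built a mapping from each rs ID to the handles that submitted it (how that mapping is built depends on the files you downloaded; the handles LAB_A and LAB_B and the IDs below are made up):

```python
def only_from(submitters_by_snp, handle="1000GENOMES"):
    """Return the sorted IDs of SNPs whose sole submitter is `handle`."""
    return sorted(
        snp
        for snp, submitters in submitters_by_snp.items()
        if set(submitters) == {handle}
    )

# Made-up mapping for illustration only.
demo = {
    "rs1": ["1000GENOMES"],
    "rs2": ["1000GENOMES", "LAB_A"],  # also reported by someone else
    "rs3": ["LAB_B"],                 # never reported by 1000GENOMES
}
exclusive = only_from(demo)
print(exclusive)
```

Comparing sets rather than lists means duplicate batch entries from the same submitter do not affect the result.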

written 6.5 years ago by Jorge Amigo10k