Question: Which One Should I Use: HapMap, 1000 Genomes, or dbSNP?
2
7.1 years ago by
Madhan220
United States
Madhan220 wrote:

Hi,

I have been trying to prepare a SNP file for my NGS analysis. The file should contain the following information: rsID, chromosome, position, alleles, allele frequencies, counts, population, minor allele and MAF. The information should be based on hg19 (build 37.3) and should include population information.

  1. The HapMap releases (rel 27 and 28) are based on NCBI build 36 and dbSNP b126. Moreover, BioMart MartView only allows retrieval of rel 27. If I want to use rel 28, I'll have to parse the data from the FTP site and build a file of my own.

  2. The 1000 Genomes Project says "The 1000 genomes snp and short indel all get submitted to dbSNP and are available from version 132".

  3. The latest release of dbSNP (135) includes 1000 Genomes data annotated with population information.

So if I want the latest SNP information with the fields mentioned above, which database should I go for?
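
For reference, whichever source is chosen, the frequency fields listed above (allele frequency, minor allele, MAF) can be derived from per-population allele counts. The following is only a minimal sketch: the function name `summarize_alleles` and the example counts are hypothetical, and it assumes the minor allele is taken to be the second most frequent allele in the population.

```python
def summarize_alleles(counts):
    """Derive allele frequencies, minor allele and MAF for one population.

    `counts` maps allele -> number of observed chromosomes carrying it.
    Assumption: the minor allele is the second most frequent allele;
    a monomorphic site has no minor allele and a MAF of 0.0.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no observed alleles")

    frequencies = {allele: n / total for allele, n in counts.items()}

    # Sort by descending count, breaking ties alphabetically for determinism.
    ranked = sorted(counts, key=lambda allele: (-counts[allele], allele))
    if len(ranked) < 2:
        minor_allele, maf = None, 0.0
    else:
        minor_allele = ranked[1]
        maf = frequencies[minor_allele]

    return {
        "total": total,
        "frequencies": frequencies,
        "minor_allele": minor_allele,
        "maf": maf,
    }


# Hypothetical counts for a single population, for illustration only.
summary = summarize_alleles({"A": 150, "G": 50})
print(summary["minor_allele"], summary["maf"])
```

Running this once per population gives the population-specific rows of the table.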

hapmap genome dbsnp • 5.0k views
modified 5.2 years ago by Biostar ♦♦ 20 • written 7.1 years ago by Madhan220
1

If you want the most complete data set, 1000g phase I, definitely.

written 7.0 years ago by lh331k

Complete Genomics is going to release a fair number of publicly available genomes soon.

written 7.0 years ago by Zev.Kronenberg11k

Thanks, but 1000g has already been integrated into dbSNP 135; I just wasn't sure whether they annotated population info along with the SNP information. Currently, I am looking into their FTP site to see if I can retrieve that. Sure, I'll look at the Complete Genomics data once it is released!

written 7.0 years ago by Madhan220

If you are using Illumina Omni2.5 chips, HapMap data from those chips is now available (you may have to contact Illumina directly). You can always lift over coordinates from build 36 to 37 if necessary as well.
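
In practice a lift-over is done with a dedicated tool such as UCSC liftOver and a chain file for the build pair (build 36 is hg18, build 37 is hg19). To illustrate the idea only, here is a toy sketch of block-based coordinate remapping. The blocks in `TOY_BLOCKS` are invented for the example and do not come from any real chain file, and `lift_position` is a hypothetical helper name.

```python
from bisect import bisect_right

# Invented alignment blocks on one chromosome, sorted by old_start:
# (old_start, old_end, new_start), 0-based, half-open intervals.
# Real lift-overs read such blocks from a chain file instead.
TOY_BLOCKS = [
    (0, 1000, 0),
    (1200, 5000, 1150),
]


def lift_position(pos, blocks):
    """Map a position from the old build to the new build.

    Returns None when the position falls outside every aligned block,
    which is how unmappable positions are reported here.
    """
    starts = [block[0] for block in blocks]
    index = bisect_right(starts, pos) - 1  # last block starting at or before pos
    if index < 0:
        return None
    old_start, old_end, new_start = blocks[index]
    if pos >= old_end:
        return None  # pos lies in a gap between blocks
    return new_start + (pos - old_start)


print(lift_position(1300, TOY_BLOCKS))  # inside the second block
print(lift_position(1100, TOY_BLOCKS))  # in a gap, so not mappable
```

Positions that fall in gaps cannot be converted and are usually dropped or reviewed by hand.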

written 7.0 years ago by Dan Gaston7.1k

Thanks Dan. Sure, I'll also check with Illumina to see if they have the HapMap data.

written 7.0 years ago by Madhan220
3
7.0 years ago by
Laura1.7k
Cambridge UK
Laura1.7k wrote:

The low-coverage SNPs from the most recent 1000 Genomes release are available in dbSNP, though the allele frequencies they quote are actually from a previous release, 20101123.

For the most complete variant data set, I would suggest using the 40 million variants from 1000 Genomes, which are genotyped in 1094 individuals based on both low-coverage and exome sequencing and also include short indel and large deletion calls.

This can be found at ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20110521/
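
The release is distributed as VCF, where frequency annotations live in the semicolon-separated INFO column. As a rough sketch of pulling population-specific frequencies out of one record: this assumes the INFO keys follow a `<POP>_AF` pattern (for example `EUR_AF`) and that the site is biallelic; check the header of the actual file for the real key names. The function names and the example line below are made up for illustration.

```python
def parse_info(info_field):
    """Split a VCF INFO string into a dict; flag-only entries map to True."""
    parsed = {}
    for entry in info_field.split(";"):
        if not entry:
            continue
        key, sep, value = entry.partition("=")
        parsed[key] = value if sep else True
    return parsed


def population_frequencies(vcf_line, suffix="_AF"):
    """Extract basic site fields and per-population allele frequencies.

    Assumptions: tab-separated VCF data line, biallelic site, and
    population frequencies stored under INFO keys ending in `suffix`.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    chrom, pos, variant_id, ref, alt = fields[:5]
    info = parse_info(fields[7])
    pop_af = {
        key[: -len(suffix)]: float(value)
        for key, value in info.items()
        if key.endswith(suffix) and value is not True
    }
    return {
        "chrom": chrom,
        "pos": int(pos),
        "id": variant_id,
        "ref": ref,
        "alt": alt,
        "pop_af": pop_af,
    }


# Made-up record for illustration; not taken from the real release.
example = "1\t10583\trs_example\tG\tA\t100\tPASS\tAF=0.14;EUR_AF=0.21;AFR_AF=0.04"
print(population_frequencies(example))
```

Multi-allelic sites carry comma-separated values and would need extra handling.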

written 7.0 years ago by Laura1.7k

Thanks Laura,

We are focusing mostly on HapMap data alone to get the population-specific allele frequencies. I will look into your suggestions.

written 7.0 years ago by Madhan220

The 1000 Genomes data cover a much larger number of variants and a very similar number of individuals, so our allele frequencies should hopefully be more accurate than the HapMap ones.

written 7.0 years ago by Laura1.7k
2
7.0 years ago by
Madhan220
United States
Madhan220 wrote:

This is exactly what I was looking for:

"Biomart has all the data that you need (i.e. SNP information mapped to GRCh37), plus an archive of past mappings. you may have incorrectly landed on one of these, but if you go to biomart.org, select MartView, choose database "Ensembl Variation 59", and choose dataset "Homo Sapiens Variation (dbSNP131)" you will surely be working with up to date information. – Jorge Amigo"

http://biostar.stackexchange.com/questions/2337/getting-the-alleles-of-specific-snps-for-37grch37

Thanks Jorge!

written 7.0 years ago by Madhan220