Which One Should I Use: HapMap, 1000 Genomes, or dbSNP?
12.2 years ago
Madhan ▴ 250

Hi,

I have been trying to prepare a SNP file for my NGS analysis. The file should contain the following information: rsID, chromosome, position, allele, allele frequency, counts, population, minor allele and MAF. The information should be based on hg19 (build 37.3) and should include population information.

  1. HapMap releases (rel 28 & 27) are based on NCBI build 36 and dbSNP b126. Moreover, BioMart MartView allows retrieval of rel 27 only. If I want to use rel 28, I'll have to parse the data from the FTP site and create a file of my own.

  2. 1000 Genome project says "The 1000 genomes snp and short indel all get submitted to dbSNP and are available from version 132".

  3. The latest release of dbSNP (135) has 1000 Genomes data annotated to it with population information.

So if I want the latest SNP information with the fields mentioned above, which database should I go for?
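For reference, the allele frequency and MAF columns described above can be derived from per-population allele counts. The following is a minimal sketch only; the record layout, the function names and the example counts are made up for illustration and do not come from any of the databases mentioned.

```python
def allele_frequencies(allele_counts):
    """Turn {allele: count} into {allele: frequency} for one population."""
    total = sum(allele_counts.values())
    if total == 0:
        raise ValueError("no observed alleles")
    return {allele: count / total for allele, count in allele_counts.items()}


def minor_allele(allele_counts):
    """Return (allele, frequency) of the less frequent allele.

    Assumes a biallelic SNP; multi-allelic sites need a different rule.
    """
    if len(allele_counts) != 2:
        raise ValueError("expected exactly two alleles")
    freqs = allele_frequencies(allele_counts)
    allele = min(freqs, key=freqs.get)
    return allele, freqs[allele]


# Hypothetical counts for one SNP in one population (not real data).
counts = {"A": 90, "G": 30}
freqs = allele_frequencies(counts)
maf = minor_allele(counts)
```

The same two functions would be applied once per population to fill the population-specific columns.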

hapmap dbsnp genome • 7.1k views

If you want the most complete data set, definitely 1000g phase I.


Complete Genomics is going to release a fair number of publicly available genomes soon.


Thanks. 1000g has already been integrated into dbSNP 135, but I wasn't sure whether they annotated population info along with the SNP information. Currently, I am looking into their FTP site to see if I can retrieve that. Sure, I'll look at the Complete Genomics data once it is released!


If you are using Illumina Omni2.5 chips, HapMap data from those chips is now available (you may have to contact Illumina directly). You can always lift over coordinates from build 36 to 37 if necessary as well.
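As a rough illustration of the lift-over step, one common route is the UCSC liftOver tool, which takes BED input (0-based start, 1-based end). The sketch below only prepares and reads back BED lines; the helper names and the example SNP are hypothetical, and the input is assumed to hold 1-based build 36 positions.

```python
def to_bed_line(chrom, pos_1based, name):
    """Format a 1-based SNP position as a BED line (0-based, half-open)."""
    if pos_1based < 1:
        raise ValueError("position must be 1-based and positive")
    if not chrom.startswith("chr"):
        chrom = "chr" + chrom  # UCSC chain files use 'chr'-prefixed names
    return f"{chrom}\t{pos_1based - 1}\t{pos_1based}\t{name}"


def from_bed_line(line):
    """Read a lifted BED line back into (chrom, 1-based position, name)."""
    chrom, _start, end, name = line.rstrip("\n").split("\t")[:4]
    return chrom, int(end), name


# Hypothetical SNP on build 36 (hg18); not a real record.
bed = to_bed_line("1", 1000, "rsEXAMPLE")

# The BED file would then be passed to liftOver outside Python, e.g.:
#   liftOver snps_hg18.bed hg18ToHg19.over.chain.gz snps_hg19.bed unmapped.bed
# (file names here are placeholders; the chain file comes from UCSC)
```

Positions that fail to map end up in the unmapped file, so it is worth checking that file before merging the lifted coordinates back with the allele frequencies.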


Thanks Dan. Sure, I'll also check with Illumina to see if they have the HapMap data.

12.1 years ago
Laura ★ 1.8k

The low-coverage SNPs from the most recent 1000 Genomes release are available in dbSNP, though the allele frequencies they quote are actually from a previous release, 20101123.

For the most complete variant data set, I would suggest using the 40 million variants from 1000 Genomes, which are genotyped in 1094 individuals based on both low-coverage and exome sequencing and also contain short indel and large deletion calls.

This can be found at ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20110521/
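The release files are in VCF format, where per-site annotations sit in the semicolon-separated INFO column. A minimal sketch of pulling allele frequencies out of one line follows; it assumes the overall frequency is stored under an `AF` key and population-level ones under keys ending in `_AF` (for example `EUR_AF`), which should be checked against the header of the actual file. The example line is made up.

```python
def parse_info(info):
    """Split a VCF INFO string into a dict; flag-only keys map to True."""
    fields = {}
    for item in info.split(";"):
        if not item:
            continue
        key, sep, value = item.partition("=")
        fields[key] = value if sep else True
    return fields


def population_frequencies(vcf_line):
    """Return (chrom, pos, rs_id, ref, alt, freqs) for one VCF data line.

    freqs maps each AF-style key to a list of floats, one per ALT allele.
    The key naming (AF, *_AF) is an assumption about the file.
    """
    cols = vcf_line.rstrip("\n").split("\t")
    chrom, pos, rs_id, ref, alt = cols[:5]
    info = parse_info(cols[7])
    freqs = {
        key: [float(v) for v in value.split(",")]
        for key, value in info.items()
        if isinstance(value, str) and (key == "AF" or key.endswith("_AF"))
    }
    return chrom, int(pos), rs_id, ref, alt, freqs


# Made-up line for illustration only (not taken from the release).
line = "1\t12345\trsEXAMPLE\tA\tG\t100\tPASS\tAF=0.25;EUR_AF=0.5;AFR_AF=0.125;VT=SNP"
record = population_frequencies(line)
```

Header lines starting with `#` would need to be skipped before calling `population_frequencies` on a real file.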


Thanks Laura,

We are focusing mostly on HapMap data alone to get the population-specific allele frequencies. I will look into your suggestions.


The 1000 Genomes data cover a much larger number of variants and a very similar number of individuals, so our allele frequencies should hopefully be more accurate than the HapMap ones.

12.2 years ago
Madhan ▴ 250

This is exactly what I was looking for:

"BioMart has all the data that you need (i.e. SNP information mapped to GRCh37), plus an archive of past mappings. You may have incorrectly landed on one of these, but if you go to biomart.org, select MartView, choose database "Ensembl Variation 59", and choose dataset "Homo Sapiens Variation (dbSNP131)" you will surely be working with up-to-date information. – Jorge Amigo"

Getting The Alleles Of Specific Snps For 37:Grch37

Thanks Jorge!
