Complete Mitochondrial Sequence and Y-Chromosome SNP Data from HapMap and HGDP Populations
14.3 years ago
Kcheng76 ▴ 30

I'm looking for datasets to analyze, including mitochondrial DNA sequence and Y-chromosome SNP data from the HapMap and HGDP panels. Does anyone know where to obtain them?

hapmap • 5.8k views

Request for clarification, since it is not at all clear to me: what do you want? a) the sequence of the chromosomes, b) the SNP annotations from dbSNP, c) GWA study data using tools including these, or d) GWA studies where these variations 'come up'?

14.3 years ago

You won't be able to retrieve mitochondrial DNA sequence from either the HapMap or the HGDP panels, since both projects only genotyped chrMT; they did not sequence it. All you can obtain are frequencies and genotypes for the particular loci typed in their samples. You will find the latest HapMap release as flat files by population and chromosome in the appropriate folder of the FTP site, and I guess the best place to find HGDP data is the official CEPH database site, where you can browse the database or bulk download flat files by chromosome.

We used to retrieve such data for our population genetics web tool SPSmart, so let me add a few lines describing what we found. The main problem for us, and I guess for other researchers too, is that in both projects the data for these chromosomes are reported as biallelic genotypes (two alleles per call), probably because the file format was normalized so that the same one could be used for every chromosome, and it is not well described how these calls should be treated. For that reason we decided not to include chrY or chrMT in our tool, nor chrX, since we also found biallelic calls for male samples, which break the frequencies and the other population statistics we calculate.
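
For illustration only, here is a minimal Python sketch of one possible way to treat diploid-coded calls on a haploid chromosome (chrY, chrMT, or chrX in males). It is not what SPSmart does. It assumes genotypes are two-letter strings such as `AA`, that a homozygous-looking call stands for a single allele, and that heterozygous-looking calls are dropped as unreliable. The function names and the missing-data codes are hypothetical and would need adjusting to the actual files.

```python
from collections import Counter

# Assumed missing-data codes; adjust to the file at hand.
MISSING = {"NN", "--", "00", ""}


def haploid_call(genotype):
    """Collapse a diploid-coded genotype (e.g. 'AA') to a single allele.

    Returns None for missing calls and for heterozygous-looking calls
    such as 'AG', which are not expected on a haploid chromosome.
    """
    g = genotype.strip().upper()
    if g in MISSING or len(g) != 2:
        return None
    return g[0] if g[0] == g[1] else None


def haploid_allele_freqs(genotypes):
    """Allele frequencies counting one allele per sample instead of two."""
    calls = [haploid_call(g) for g in genotypes]
    counts = Counter(c for c in calls if c is not None)
    total = sum(counts.values())
    freqs = {a: n / total for a, n in counts.items()} if total else {}
    n_dropped = sum(1 for c in calls if c is None)
    return freqs, n_dropped


# Toy input, invented for the example: three usable calls, two dropped.
freqs, dropped = haploid_allele_freqs(["AA", "AA", "GG", "AG", "NN"])
print(freqs, dropped)
```

Counting one allele per sample keeps the sample size honest for haploid data. Whether dropping heterozygous-looking calls is appropriate depends on the dataset; a high drop rate would itself be worth investigating.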

PS: if anyone has information about how to deal with the data for these three special chromosomes, I'd be glad to start a discussion on it from scratch, since population geneticists will definitely benefit from it.

14.3 years ago
Michael 56k

One way to get SNP annotations is to go via a BioMart installation, either Ensembl BioMart or HapMap's HapMart. Both provide chromosome Y SNPs, but only Ensembl BioMart provides both MT and chromosome Y SNPs.

Both marts allow filtering by genotyping platform. To find a GWAS that genotyped those SNPs, I would search for studies using a genotyping platform that contains such SNPs and then ask for access to the genotyping data. Access to GWAS data is generally governed by a strict privacy policy. The Wellcome Trust Case Control Consortium provides access to GWAS data via an application process.

I have put some BioMart queries here as hyperlinks to serve as examples, in case that is what this question is about. Maybe Jorge can comment better on the relevance of these annotations.


Also, please note that HapMap and Ensembl rely on different versions of dbSNP and different genome builds, so they are 'incompatible'. If possible, use the latest dbSNP version.
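
To illustrate why the two sources cannot simply be merged by position, here is a rough Python sketch that compares two SNP annotation sets by rsID and reports where the coordinates disagree. It assumes each set has already been loaded into a dict mapping rsID to (chromosome, position). The function name, the rsIDs, and the positions are all invented for the example. Note that rsIDs can also be merged or retired between dbSNP versions, which this sketch does not handle.

```python
def compare_by_rsid(set_a, set_b):
    """Compare two annotation sets given as {rsID: (chrom, position)}.

    Returns the rsIDs whose coordinates agree, those whose coordinates
    differ (e.g. because of a different genome build), and those found
    in only one of the two sets.
    """
    shared = set_a.keys() & set_b.keys()
    return {
        "same": sorted(r for r in shared if set_a[r] == set_b[r]),
        "moved": sorted(r for r in shared if set_a[r] != set_b[r]),
        "only_a": sorted(set_a.keys() - set_b.keys()),
        "only_b": sorted(set_b.keys() - set_a.keys()),
    }


# Invented toy data: placeholder rsIDs and positions, not real records.
source_a = {"rs_ex1": ("Y", 1000), "rs_ex2": ("Y", 2000), "rs_ex3": ("Y", 3000)}
source_b = {"rs_ex1": ("Y", 1000), "rs_ex2": ("Y", 2050), "rs_ex4": ("MT", 150)}

report = compare_by_rsid(source_a, source_b)
for key, ids in report.items():
    print(key, ids)
```

A non-empty "moved" list is a hint that the two sources sit on different builds and that positions need converting before any joint analysis.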


I guess kcheng should be the one to tell us whether these datasets suit his needs. To be honest, I've never been completely confident in the chrX, chrY and chrMT data, so all the work we've done with them had to be thought through carefully, using the validated pipelines that we knew worked for the other chromosomes only when we were sure they were suitable (almost always they had to be modified to deal with the nature of the data for these chromosomes, which even depended on the chosen mart).


Very interesting. I only got 11 SNPs on chrM from allSNP150 at UCSC.

chrom   chromStart  chromEnd    name    refNCBI observed
chrM    515 518 rs879104404 CA  -/CA
chrM    517 520 rs878880226 CA  -/CA
chrM    524 527 rs78907894  AC  -/AC
chrM    5132    5135    rs199476116 AA  -/AA
chrM    8042    8045    rs199474828 AT  -/AT
chrM    8271    8281    rs371604158 ACCCCCTCT   -/ACCCCCTCT
chrM    8281    8291    rs369704279 CCCCCTCTA   -/CCCCCTCTA
chrM    9205    9208    rs199476137 TA  -/TA
chrM    9487    9503    rs267606612 TCGCAGGATTTTTCT -/TCGCAGGATTTTTCT
chrM    14787   14792   rs207460005 TTAA    -/TTAA
chrM    16180   16183   rs371240719 AA  -/AA
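
A quick way to check what kind of records these are is sketched below in Python. It assumes whitespace-separated rows with the six columns shown above, and it classifies a record only by its observed column; the function names and the three-way classification are hypothetical simplifications. Only three of the rows are repeated here to keep it short.

```python
from collections import Counter


def parse_rows(text):
    """Parse whitespace-separated rows with the six columns shown above."""
    records = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 6 or fields[0] == "chrom":
            continue  # skip the header line and malformed lines
        chrom, start, end, name, ref, observed = fields
        records.append({
            "chrom": chrom, "start": int(start), "end": int(end),
            "name": name, "ref": ref, "observed": observed,
        })
    return records


def variant_kind(observed):
    """Rough classification based only on the observed-alleles column."""
    alleles = observed.split("/")
    if "-" in alleles:
        return "indel"
    if all(len(a) == 1 for a in alleles):
        return "single-base"
    return "other"


ROWS = """chrom chromStart chromEnd name refNCBI observed
chrM 515 518 rs879104404 CA -/CA
chrM 8271 8281 rs371604158 ACCCCCTCT -/ACCCCCTCT
chrM 14787 14792 rs207460005 TTAA -/TTAA
"""

records = parse_rows(ROWS)
kinds = Counter(variant_kind(r["observed"]) for r in records)
print(kinds)
```

Every row listed above has a `-` allele in the observed column, so by this rough rule they are all insertion/deletion records rather than single-base substitutions.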