Hi all, I am new to this forum and don't have a programming background (my training is in biology, but I am trying to learn some bioinformatics). This may sound dumb, but I am trying to do something relatively simple and small in scope compared to what most people on this forum are doing: I just want to find haplotype blocks and tagging SNPs in the gene XRCC2, so that I can do an association study focused on this gene while minimizing the number of SNPs I have to genotype (to keep costs down).
I already downloaded the HapMap genotype data for the CEU population (HapMap rel27 B36) from the HapMap Genome Browser for XRCC2 (coordinates chr7: 151969353..152009352), but I am having trouble loading this file into Haploview. A few webpages and documents I have read say to click the "HapMap Format" button when Haploview opens, browse to the file with the dumped region (genotype data) from HapMap, and hit OK. When I do that, I get an error message: "HapMap data format error: totalcount". Total Count is the last column in the file, which is supposed to be the total number of genotypes observed. As I have not changed anything in the HapMap dump file, I am not sure why it has a formatting error. Haploview's own documentation says this:
"HapMap Project Data Dumps: Data from the HapMap Project can be dumped by region using the GBrowse interface. The saved data file is in a marker-per-line format which can be loaded in Haploview. GBrowse dumps only one file, which has one marker per line and which includes familial relationships among the HapMap samples as well as marker position information. The file format has several header lines (beginning with "#") which Haploview parses. Open the file by selecting "Browse HapMap Data" option and selecting the downloaded file."
I thought that GBrowse referred to the International HapMap Project Genome Browser webpage, but the only genotype data file I can find for rel27 B36 is under the Reports&Analysis dropdown, where it says "Download SNP Genotype Data". When I open the genotype data file in Excel, I can see a number of SNPs in the rows, with columns for chr, pos, strand, build, etc., followed by genotype, genotype frequency, and genotype count. I don't see any header that looks like it describes familial relationships, so do I have the wrong file completely? Or am I supposed to modify this file so that Haploview can parse it?
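In case it helps anyone look at the file outside Excel, here is a minimal Python sketch of how its layout could be checked. It is not taken from Haploview or HapMap: the function name `summarize_dump` and the file name `xrcc2_ceu.txt` are made up for illustration, and it assumes the dump is whitespace-delimited with one marker per line. It only reports the comment lines beginning with "#", the first non-comment line (assumed to hold the column names), and how many fields each data row has, so that rows with a missing or extra column stand out.

```python
from collections import Counter


def summarize_dump(lines):
    """Summarize the layout of a whitespace-delimited, marker-per-line dump.

    `lines` is any iterable of text lines (an open file works).
    Assumption: the first line that does not start with "#" holds column names.
    """
    comment_lines = []        # header/comment lines beginning with "#"
    header = None             # column names from the first non-comment line
    field_counts = Counter()  # number of fields -> number of data rows

    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        if line.startswith("#"):
            comment_lines.append(line)
        elif header is None:
            header = line.split()
        else:
            field_counts[len(line.split())] += 1

    return {
        "comment_lines": comment_lines,
        "header": header,
        "field_counts": dict(field_counts),
    }


# Hypothetical usage (the file name is an assumption, not the real dump name):
#
#     with open("xrcc2_ceu.txt") as fh:
#         summary = summarize_dump(fh)
#     print(summary["comment_lines"])  # any "#" header lines present?
#     print(summary["header"])         # column names as they appear in the file
#     print(summary["field_counts"])   # more than one key means uneven rows
```

If `comment_lines` comes back empty, the file has none of the "#" header lines the Haploview documentation describes. If `field_counts` has more than one entry, some rows have a different number of columns than others.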
Again, I am sure this is a bit of a dumb question, but I am new to this type of thing and would appreciate any help anyone can give me. Also, sorry if this post is a bit long, but I wanted to make it clear what I have already tried.