9.8 years ago by
Nantes (France)
I am not sure how a peak can be defined easily just with SNP p-value result. If you are looking for a method that could give a correspndance pb-position/probability of location of the functional sNP, I'd advise running some hapltype clustering test that will translate the fuzzy peak shape into a linkage like plot (HapCluster from Balding - extensively coded by Mailund, or ancesHC by Coin). Gives you a credibility interval.
But you need first to ensure that your significant SNPs scattered throuhg the region represent one single association signal - maybe using nested likelihood tests like Thesias.
Finally, some people use imputation. However, this is not very different in essence from hapl clustering and make the assumption that all the SNPs are present in the panel.
Finally, you can list all you "significant" SNPs - need to choose a p-value threshold- , list the blocks they are lying in (block limits are to be found on IMPUTE software page) and take the union of these blocks as the area of the peak (because the functional SNP is post likely in high LD with your observed significant SNPs and can only be found within the same block). Beware that all this is under the assumption that your association is driven by a common variant.
If your goal is not fine-mapping but providing an additional evidence for true signal based on the "length" of a peak (something similar to what was proposed for linkage in a 90s Terwilliger paper), then haplotype clustering is not bad.
However, I am not sure here the length of a peak is really reflecting the true positiveness of an association as it is highly dependent on the recombination pattern of the region - a peak in a short recombination block can show very significant p-values (and be a true positive) but association (hence peak length) would fade very quickly once the recombinaiton hot-spots are met.
Christian Dina
•
link
modified 9.0 years ago
•
written
9.8 years ago by
Genotepes • 950
With most GWAS result sets, you can count the genome-wide significant SNPs on one hand. Presumably your results are more complex. Can you provide details about the rough shape of the plot: many peaks or few, focal or broad, close enough that you can't guess whether one peak is really two, etc? What are the factors that make this hard?
I have added a bit of clarification. As of now, I have a script that seems to do this ok. It takes a seed parameter and extends out from peaks allowing to skip a certain number of lower values. I'm still interested to hear more traditional approaches.