Question: Tag SNP Selection of SNPs With Best P-Values
7.5 years ago
Ryan D wrote:

We have several thousand GWAS SNPs that are significantly associated with disease in two populations (CHB and a Korean population). We want to select tag SNPs to genotype in a follow-up sample. If several SNPs tag the same set of significant SNPs, we want to choose the one with the best P-value. NIEHS has a tool that can do this for candidate genes, and another that handles functional SNPs using P-value priorities. I'm certain such a tool is out there: we want something that selects tag SNPs using HapMap LD, as these tools do, but for the whole genome. Forgive me if I'm overlooking the obvious.

We have a little script that does this, but it constantly requires reformatting the data and supplying new LD data, so it would be nice to have a permanent solution that accepts a simple format like the GWAS-P format on the sites mentioned above.
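To make the request concrete, here is a minimal sketch of P-value-prioritized greedy tagging. It is not the NIEHS or SNAP algorithm, just one simple way to do it. The function name `select_tag_snps`, the input layout (a dict of P-values plus a list of pairwise r² triples), and the default r² threshold of 0.8 are all assumptions made for illustration.

```python
from collections import defaultdict


def select_tag_snps(pvalues, ld_pairs, r2_threshold=0.8):
    """Greedy tagging that always prefers the SNP with the smallest P-value.

    pvalues:  dict mapping SNP id -> association P-value (hypothetical layout)
    ld_pairs: iterable of (snp_a, snp_b, r2) tuples, e.g. parsed from an LD file
    Returns a dict mapping each chosen tag SNP -> sorted list of SNPs it covers.
    """
    # Build an adjacency map of SNP pairs in LD at or above the threshold.
    neighbors = defaultdict(set)
    for snp_a, snp_b, r2 in ld_pairs:
        if r2 >= r2_threshold and snp_a in pvalues and snp_b in pvalues:
            neighbors[snp_a].add(snp_b)
            neighbors[snp_b].add(snp_a)

    tags = {}
    covered = set()
    # Visit SNPs from best to worst P-value (ties broken by SNP id).
    for snp in sorted(pvalues, key=lambda s: (pvalues[s], s)):
        if snp in covered:
            continue  # already tagged by a SNP with a better P-value
        tagged = {snp} | (neighbors[snp] - covered)
        tags[snp] = sorted(tagged)
        covered |= tagged
    return tags


if __name__ == "__main__":
    example_p = {"rs1": 1e-8, "rs2": 1e-6, "rs3": 1e-7, "rs4": 1e-5}
    example_ld = [("rs1", "rs2", 0.9), ("rs2", "rs3", 0.85), ("rs3", "rs4", 0.5)]
    for tag, group in select_tag_snps(example_p, example_ld).items():
        print(tag, group)
```

Because SNPs are visited in order of increasing P-value, every group is represented by its most significant member. The trade-off is that this can select more tags than a method that minimizes the number of tags alone.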



7.5 years ago
Larry_Parnell wrote:

I'd take a look to see if SNAP can do this. This feature (your request) is one I would expect the Broad to have added to complement the GWAS in which they participate. For the Korean population, use JPT data.

Those are beautiful LD plots from SNAP!

Thanks, Larry. It's kind of astonishing to me that this hasn't been built yet. Maybe there is room for our type of tool. The NIEHS servers above do a great job of tapping into the HapMap data (but not yet 1KG) and let you specify the LD of your discovery population as well as the population(s) you plan to genotype, so that markers are in maximum LD in both. But without some kind of P-value priority, there is still a hole to be filled.

If you have not yet checked - SNAP does allow one to pick tagSNPs from 1000G data.

6.8 years ago
Fayue1015 wrote:

Hi, I would like to add my two cents. If I understand correctly, your goal is to do follow-up studies by selecting only some representative (tag) SNPs from these several thousand SNPs. In that case you can use your current sample/panel from CHB and JPT to do the tag SNP selection. I think our work "fasttagger" (link below) can be an option.

You do not need to use HapMap LD information, because you want to select tag SNPs from among your several thousand SNPs.

If you really want to use HapMap LD information, the process is like this:

Using HapMap data, run the LD analysis to select tag SNPs, check the overlap between the tag SNPs selected from HapMap and your several thousand SNPs, then use the overlapping SNPs for the follow-up analysis.
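The overlap step in the HapMap-based process above could be sketched roughly as follows. This assumes the HapMap tag SNPs have already been selected by some other tool. The function name `overlap_with_reference_tags` and the input layout are hypothetical, and this is not how fasttagger itself works.

```python
def overlap_with_reference_tags(significant_pvalues, reference_tags):
    """Keep only the reference-panel tag SNPs that are also significant.

    significant_pvalues: dict mapping SNP id -> P-value for the GWAS hits
    reference_tags:      iterable of tag SNP ids chosen from HapMap LD
    Returns the shared SNP ids, ordered from best to worst P-value.
    """
    shared = set(significant_pvalues) & set(reference_tags)
    return sorted(shared, key=lambda s: (significant_pvalues[s], s))


if __name__ == "__main__":
    hits = {"rs1": 1e-8, "rs2": 1e-6, "rs3": 1e-7}
    hapmap_tags = ["rs2", "rs3", "rs9"]
    print(overlap_with_reference_tags(hits, hapmap_tags))
```

Sorting the overlap by P-value keeps the most significant SNPs first, which helps if the follow-up panel has room for only a limited number of markers.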

