Generating random sets of LD-independent SNPs from GWAS and calculating an empirical p-value
I am trying to calculate an empirical p-value by drawing 10,000 random sets of 5,000 SNPs each from GWAS summary statistics, where the SNPs in each set must be LD-independent (LD < 0.8, 1000G Phase 3 EUR reference), and then counting the number of SNPs in each set that pass a given association threshold and therefore count as 'hits'.
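
To make the procedure concrete, here is a rough Python sketch of how I picture the sampling step. It rests on assumptions that are not settled above: each SNP is taken to already carry a hypothetical `clump_id` (an LD group precomputed against the reference panel, so that SNPs in different groups are treated as LD-independent), and the function names, the tuple layout, and the seed are purely illustrative. It is only meant to show the logic, not a validated method.

```python
import random
from collections import defaultdict

def group_by_clump(snps):
    """Group SNPs by LD clump. snps: iterable of (snp_id, clump_id, p_value).
    ASSUMPTION: clump_id was precomputed elsewhere from the LD reference panel."""
    by_clump = defaultdict(list)
    for snp in snps:
        by_clump[snp[1]].append(snp)
    return dict(by_clump)

def draw_ld_independent_set(by_clump, set_size, rng):
    """Pick set_size distinct clumps at random, then one random SNP from each,
    so no two SNPs in the returned set share a clump."""
    clump_ids = sorted(by_clump)
    if set_size > len(clump_ids):
        raise ValueError("fewer LD clumps than the requested set size")
    chosen = rng.sample(clump_ids, set_size)
    return [rng.choice(by_clump[c]) for c in chosen]

def count_hits(snp_set, threshold):
    """Number of SNPs in the set whose GWAS p-value is below the threshold."""
    return sum(1 for _, _, p in snp_set if p < threshold)

def simulate_hit_counts(snps, n_sets, set_size, threshold, seed=0):
    """Hit count for each of n_sets random LD-independent sets."""
    rng = random.Random(seed)  # fixed seed so the draws are reproducible
    by_clump = group_by_clump(snps)
    return [
        count_hits(draw_ld_independent_set(by_clump, set_size, rng), threshold)
        for _ in range(n_sets)
    ]
```

With the real data this would be called as something like `simulate_hit_counts(snps, n_sets=10000, set_size=5000, threshold=...)`. Note that choosing clumps first and SNPs second does not sample SNPs uniformly: SNPs in small clumps are more likely to be picked, which may or may not be what is wanted.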

Does anyone have an idea of how this should be done, and how to plot the results?

I assume the empirical p-value can then be calculated as: (number of random sets in which SNPs pass the association p-value threshold) / (total number of random sets, 10,000).
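
As a sketch of that last step, the snippet below computes an empirical p-value from a list of per-set hit counts. It assumes there is an observed hit count to compare against (for example, hits in a SNP set of interest), which I have not defined above, so `observed_hits` is a hypothetical input. It also uses the common "+1" correction so the p-value is never exactly zero; that is a convention swapped in here, not the plain ratio written above, and removing both `+ 1` terms gives the uncorrected proportion.

```python
from collections import Counter

def empirical_p_value(null_hit_counts, observed_hits):
    """Share of random sets with at least as many hits as observed.
    ASSUMPTION: observed_hits comes from a separate SNP set of interest.
    The +1 in numerator and denominator is a common correction, not a requirement."""
    n_extreme = sum(1 for c in null_hit_counts if c >= observed_hits)
    return (n_extreme + 1) / (len(null_hit_counts) + 1)

def hit_count_histogram(null_hit_counts):
    """Sorted (hit_count, number_of_sets) pairs, ready to hand to a plotting tool."""
    return sorted(Counter(null_hit_counts).items())
```

For the plot, the pairs from `hit_count_histogram` could be drawn as a bar chart of the null distribution in any plotting library, with a vertical line marking the observed hit count.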

Thank you all.

