The PLINK documentation describes two options for running a set-based test.
HINT Two extremes are to perform a test based on a) the best single SNP result per set:
--set-max 1 --set-p 1
or to use all SNPs in a set:
--set-max 99999 --set-p 1 --set-r2 1
Could someone tell me which hypothesis is being tested in each case?
I have several sets of SNPs (each set is a gene of interest containing many SNPs), but I don't know which hypothesis is being tested with --set-max 1 versus --set-max 99999.
I imagine that if I want to know whether a gene is significantly associated with my disease, I should use all of its SNPs. In that case PLINK averages the STAT values across all of them and calculates EMP1 from that average. But then I might not get a significant result for a gene that would be significant if I tested only its best SNP. So what is the difference? Which hypothesis is being tested when using all SNPs versus only the best one in each gene?
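To make the question concrete, here is a rough Python sketch of how I understand the two extremes. It is not PLINK's code: the function names (snp_stat, set_stat, set_emp1) and the n * r^2 single-SNP statistic are my own assumptions for illustration, and it ignores the --set-p and --set-r2 filtering entirely. It only shows the idea of averaging the top set_max single-SNP statistics and comparing that average against phenotype-label permutations.

```python
import random


def snp_stat(genotypes, phenotype):
    """Hypothetical single-SNP statistic: n * r^2 between genotype and phenotype."""
    n = len(phenotype)
    mg = sum(genotypes) / n
    mp = sum(phenotype) / n
    sgg = sum((g - mg) ** 2 for g in genotypes)
    spp = sum((p - mp) ** 2 for p in phenotype)
    if sgg == 0 or spp == 0:
        return 0.0  # monomorphic SNP or constant phenotype: no signal
    sgp = sum((g - mg) * (p - mp) for g, p in zip(genotypes, phenotype))
    return n * sgp * sgp / (sgg * spp)


def set_stat(snps, phenotype, set_max):
    """Mean of the set_max largest single-SNP statistics in the set."""
    stats = sorted((snp_stat(g, phenotype) for g in snps), reverse=True)
    top = stats[:set_max]
    if not top:
        return 0.0
    return sum(top) / len(top)


def set_emp1(snps, phenotype, set_max, n_perm=1000, seed=0):
    """Empirical set-level p-value obtained by permuting phenotype labels."""
    rng = random.Random(seed)
    observed = set_stat(snps, phenotype, set_max)
    shuffled = list(phenotype)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        # The top SNPs are re-selected in every permuted dataset.
        if set_stat(snps, shuffled, set_max) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Toy data (made up): 3 SNPs coded 0/1/2, 8 individuals, 0/1 phenotype.
    snps = [
        [0, 1, 2, 0, 1, 2, 0, 1],
        [2, 2, 1, 0, 0, 1, 2, 0],
        [0, 0, 0, 1, 1, 1, 2, 2],
    ]
    pheno = [0, 0, 0, 0, 1, 1, 1, 1]
    print("best SNP only:", set_emp1(snps, pheno, set_max=1))
    print("all SNPs:     ", set_emp1(snps, pheno, set_max=99999))
```

In this sketch, set_max=1 makes the set statistic the single best SNP statistic, while a very large set_max makes it the mean over every SNP in the gene. In both cases the p-value comes from the same permutation of phenotype labels.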
Thanks a lot!