Question: Clarification On P-Value Threshold For GWAS
6.6 years ago by
mlscmahe80 wrote:

I have an Illumina GWAS data set with ~900 samples and 10 quantitative traits related to obesity (BMI, weight, waist circumference), diabetes (PGL), hypertension (SBP, DBP) and lipid profiles (TGL, total cholesterol, HDL, LDL). I use PLINK for QC and statistical association analysis. QC was performed as per standard protocols. After QC, I did LD pruning in PLINK so that I could run PCA with EIGENSTRAT.

The original data set had ~700,000 SNPs (close to 1 million); after QC, ~600,000 SNPs remained. LD pruning then reduced this to ~300,000 SNPs. I ran the statistical association analysis separately on two data sets: one with ~600,000 SNPs and the other with ~300,000 SNPs (the LD-pruned set). In the association analysis (--linear) I adjusted for age, sex and the first 10 principal components.

However, I am confused about how to calculate the p-value threshold. I have seen a couple of links that say 0.05 / number of SNPs gives the p-value threshold. But do I have to take the 12 covariates used in --linear into account when calculating the threshold? I would be grateful if you could clarify this. Thanking you in anticipation.

p-value gwas genetics • 5.3k views
6.6 years ago by
Charles Warden7.8k (Duarte, CA) wrote:

I would be interested to see what others say, but I would say "no".

For example, consider a 2-way ANOVA (perhaps between tumor and normal expression, considering individual patient pairing). Here you have two factors (tumor status and patient ID). You could imagine something like a 12-way ANOVA (although I can't imagine a large enough dataset to justify correcting for 12 expression variables).

The multiple hypothesis correction is based on the number of tests. In that respect, a 1-way, 2-way (or hypothetical 12-way) ANOVA all call for the same correction, because the total number of tests is the same (typically the number of genes for a gene expression study, the number of SNPs in a GWAS, etc.). For gene expression, an FDR correction is more typical than the Bonferroni correction that you described, but I agree that this more stringent criterion is more appropriate in this case.

I don't believe you actually conducted more tests (you just conducted a test per SNP that considers multiple variables at the same time). If this is true, the answer is "no".
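
To make the arithmetic concrete, here is a minimal Python sketch of the Bonferroni calculation discussed above. It is only an illustration: the function name `bonferroni_threshold` is made up for this example, and the SNP counts are the approximate figures quoted in the question. The covariate count is defined but deliberately never enters the calculation.

```python
def bonferroni_threshold(n_tests: int, alpha: float = 0.05) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be a positive integer")
    return alpha / n_tests


# Approximate SNP counts quoted in the question.
n_snps_after_qc = 600_000
n_snps_ld_pruned = 300_000

# Age, sex and 10 principal components. Adjusting for them changes the
# model fitted at each SNP, not the number of SNPs tested, so this value
# is intentionally not used below.
n_covariates = 12

thresholds = {
    "after QC": bonferroni_threshold(n_snps_after_qc),
    "LD-pruned": bonferroni_threshold(n_snps_ld_pruned),
}

for label, threshold in thresholds.items():
    print(f"{label}: p < {threshold:.2e}")
# after QC: p < 8.33e-08
# LD-pruned: p < 1.67e-07
```

Under these assumptions the threshold depends only on how many SNPs were tested, so it is the same whether the model has 0 or 12 covariates.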


Agree with cwarden45. You could also use P < 5 × 10^-8, a "normal" threshold in GWAS.

written 6.6 years ago by Maxime Lamontagne2.2k
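
For reference, the conventional 5 × 10^-8 cutoff mentioned in the comment is numerically the same as a Bonferroni correction at alpha = 0.05 for one million tests. The short Python sketch below shows that relationship and how a p-value might be checked against the cutoff. The names `GENOME_WIDE_THRESHOLD` and `is_genome_wide_significant` and the example p-values are hypothetical, chosen only for illustration.

```python
ALPHA = 0.05
GENOME_WIDE_THRESHOLD = 5e-8  # conventional GWAS cutoff from the comment

# Number of tests for which a Bonferroni correction at ALPHA would give
# the conventional cutoff (rounded to absorb floating-point error).
implied_n_tests = round(ALPHA / GENOME_WIDE_THRESHOLD)
print(implied_n_tests)  # 1000000


def is_genome_wide_significant(p_value: float,
                               threshold: float = GENOME_WIDE_THRESHOLD) -> bool:
    """Return True if p_value falls strictly below the chosen threshold."""
    return p_value < threshold


# Hypothetical p-values, not taken from any real analysis.
for p in (1e-9, 6e-8, 1e-5):
    print(p, is_genome_wide_significant(p))
```

Note that 5 × 10^-8 is stricter than 0.05 / 600,000 (about 8.3 × 10^-8), so a SNP passing the conventional cutoff would also pass the Bonferroni threshold for the ~600,000-SNP set.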