Is it ok to use median instead of mean for phenotype data in GWAS studies?
23 months ago
rimgubaev ▴ 270

Hello everyone! Could somebody suggest an article about, or a solution to, the following problem? I have three years of phenotype data for my plant of interest and want to run a GWAS. For some samples the phenotype data fail the Shapiro-Wilk test for normality, so I want to use the median phenotype value instead of the mean. However, I have not found any articles or tutorials where a similar approach has been used. If you have faced this problem or know of such articles, please suggest them!

GWAS association phenotype data • 294 views

I wonder how many of the phenotypes failed the Shapiro-Wilk test (is it around 5%)? Wouldn't it be "cheaper" to find a transformation of the values (e.g. a Box-Cox transform)? In some cases even the median will not help (imagine zero-inflated data); you would then need generalized linear models with an appropriate link function to work with such data.
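
To make the Box-Cox suggestion concrete, here is a minimal sketch using only the Python standard library. The phenotype values are made up for illustration, and the grid search over lambda is a simplification chosen for readability; in practice a library routine such as `scipy.stats.boxcox` would normally be used instead.

```python
import math
import statistics


def boxcox(values, lam):
    """Box-Cox transform of strictly positive values at a fixed lambda."""
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive values")
    if abs(lam) < 1e-12:
        # lambda = 0 is defined as the natural log
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]


def boxcox_loglik(values, lam):
    """Profile log-likelihood of lambda, up to an additive constant."""
    n = len(values)
    variance = statistics.pvariance(boxcox(values, lam))
    log_sum = sum(math.log(v) for v in values)
    return -0.5 * n * math.log(variance) + (lam - 1.0) * log_sum


def best_lambda(values, lo=-2.0, hi=2.0, steps=401):
    """Pick the lambda on an evenly spaced grid that maximises the likelihood."""
    grid = [lo + i * (hi - lo) / (steps - 1) for i in range(steps)]
    return max(grid, key=lambda lam: boxcox_loglik(values, lam))


# Hypothetical right-skewed phenotype values (not from the original post)
phenotype = [1.2, 1.5, 1.7, 2.0, 2.1, 2.4, 3.0, 3.8, 5.5, 9.7]

lam = best_lambda(phenotype)
transformed = boxcox(phenotype, lam)
print(f"chosen lambda: {lam:.2f}")
```

The transformed values, rather than the raw ones, would then be summarised per sample and passed to the association test. Note that Box-Cox needs strictly positive values, so it does not address the zero-inflated case mentioned above.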


Not that many, actually: 7%.


But you know that, at alpha = 0.05, about 5% of tests will reject normality even when the data are truly normal, simply because that is how statistical tests work? So you have about 2% "unexpectedly non-normal" data, and there may be real biological effects in those 2%. So I'd think twice about whether I need to invent another method to deal with the data.

However, even with that said, pre-testing for normality is not always recommended practice: 1) it may pass as normal data that are clearly not normal (e.g. multi-modal data), and 2) it adds a further burden of tests, and we know what happens when statistical tests multiply: https://en.wikipedia.org/wiki/Multiple_comparisons_problem
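
The point about 5% of tests rejecting by chance can be checked with a short binomial calculation. This is only a sketch: the thread never says how many normality tests were run, so `n_tests = 100` is a hypothetical figure, and the calculation assumes the tests are independent and every data set is truly normal.

```python
from math import comb


def prob_at_least(k, n, alpha=0.05):
    """P(X >= k) for X ~ Binomial(n, alpha): chance of k or more false rejections."""
    return sum(
        comb(n, i) * alpha ** i * (1 - alpha) ** (n - i)
        for i in range(k, n + 1)
    )


n_tests = 100     # hypothetical number of normality tests (not given in the thread)
n_rejected = 7    # 7% of the tests rejected normality
alpha = 0.05

expected_by_chance = alpha * n_tests
p_chance = prob_at_least(n_rejected, n_tests, alpha)

print(f"expected rejections by chance: {expected_by_chance:.1f}")
print(f"P(at least {n_rejected} rejections by chance): {p_chance:.3f}")
```

If that probability is not small, a 7% rejection rate is compatible with all the data being normal, which supports thinking twice before switching methods. With a much larger number of tests the same 7% would be harder to explain by chance alone.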
