Question: Is it ok to use median instead of mean for phenotype data in GWAS studies?
4 months ago, rimgubaev wrote:

Hello everyone! I wonder if somebody could suggest an article on, or a solution to, the following problem. I have phenotype data for my plant of interest, collected over three years, and I want to use it to run a GWAS. For some samples the phenotype data failed the Shapiro-Wilk test for normality, so I want to use the median phenotype values instead of the mean values. However, I could not find any articles or tutorials where a similar approach has been used. If you have faced this problem or know of such articles, please suggest them!


I wonder how many of the phenotypes failed the Shapiro-Wilk test (is it around 5%)? Wouldn't it be "cheaper" to find a transformation of the values (e.g. a Box-Cox transform)? In some cases even the median will not help (imagine zero-inflated data), and you will need to apply linear models with special link functions to work with such data.
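To illustrate the Box-Cox suggestion, here is a minimal sketch in Python with SciPy. The phenotype values are simulated (a hypothetical right-skewed, strictly positive trait), not data from this thread, and the variable names are made up for the example. Note that `scipy.stats.boxcox` requires strictly positive input, so it would not apply directly to zero-inflated data.

```python
import numpy as np
from scipy import stats

# Simulated, hypothetical phenotype: right-skewed and strictly positive.
rng = np.random.default_rng(0)
phenotype = rng.lognormal(mean=2.0, sigma=0.6, size=200)

# Shapiro-Wilk on the raw values (W close to 1 means "looks normal").
w_raw, p_raw = stats.shapiro(phenotype)

# Box-Cox transform; lambda is estimated by maximum likelihood.
transformed, lam = stats.boxcox(phenotype)

# Shapiro-Wilk again on the transformed values.
w_bc, p_bc = stats.shapiro(transformed)

print(f"raw:     W = {w_raw:.3f}, p = {p_raw:.3g}")
print(f"Box-Cox: W = {w_bc:.3f}, p = {p_bc:.3g}, lambda = {lam:.3f}")
```

The transformed values, rather than a per-sample median, could then be used as the phenotype. Whether that is appropriate depends on the trait and on the downstream GWAS model.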

written 4 months ago by German.M.Demidov

Not that many, actually: it is 7%.

written 4 months ago by rimgubaev

But you do know that, at alpha = 0.05, about 5% of tests will reject normality even for perfectly normal data, simply because that is how statistical tests work? So you have roughly 2% "unexpectedly non-normal" data, and there may be real biological effects in those 2%. So I'd think twice about whether I need to invent another method to deal with the data.

That said, pre-testing for normality is not always a recommended practice: 1) it may accept as normal data that is clearly not normal (e.g. multi-modal data), and 2) it adds an extra burden of tests, and we know what happens when statistical tests multiply.
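The point about alpha = 0.05 can be checked with a small simulation. The sketch below draws many samples that are normal by construction and counts how often Shapiro-Wilk rejects them anyway. The sample size and number of repetitions are arbitrary choices for illustration, not values from this thread.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
alpha = 0.05
n_tests = 2000    # number of simulated "phenotypes" (arbitrary)
n_samples = 100   # observations per phenotype (arbitrary)

# Every sample is truly normal, so each rejection is a false positive.
p_values = np.array([
    stats.shapiro(rng.normal(size=n_samples))[1]
    for _ in range(n_tests)
])

rejection_rate = float(np.mean(p_values < alpha))
print(f"Rejected {rejection_rate:.1%} of truly normal samples at alpha = {alpha}")
```

The rejection rate should land close to alpha, which is why a 7% failure rate is only slightly above what chance alone would produce.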

written 4 months ago by German.M.Demidov