Question

In GWAS what is the point of regressing a quantitative phenotype on covariates first and taking the residuals?

1

15 months ago

curious ▴ 750

I understand that a rank-based inverse normal transformation of a non-normal quantitative phenotype makes the trait closer to normal for linear regression, and that this is common practice.

But sometimes I read about people first regressing the quantitative phenotype on the covariates, taking the residuals, applying a rank-based inverse normal transformation to those residuals, and then running the GWAS on the result.
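To make the steps concrete, here is a minimal sketch of that pipeline in plain Python. Everything specific in it is an assumption for illustration only: the simulated `age` and `sex` covariates, the effect sizes, the helper names `residualize` and `rank_int`, and the Blom offset c = 3/8 used in the transformation. It is not taken from any particular paper or tool.

```python
import math
import random
from statistics import NormalDist, mean


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def center(v):
    m = mean(v)
    return [a - m for a in v]


def residualize(y, covariates):
    """Residuals of y after OLS on an intercept plus the given covariates."""
    basis = []
    for cov in covariates:
        q = center(cov)  # centering absorbs the intercept
        for b in basis:  # Gram-Schmidt: make q orthogonal to earlier columns
            coef = dot(q, b) / dot(b, b)
            q = [qi - coef * bi for qi, bi in zip(q, b)]
        if dot(q, q) > 1e-12:  # skip covariates that are (near) collinear
            basis.append(q)
    resid = center(y)
    for b in basis:  # project y off each orthogonal covariate direction
        coef = dot(resid, b) / dot(b, b)
        resid = [ri - coef * bi for ri, bi in zip(resid, b)]
    return resid


def rank_int(values, c=0.375):
    """Rank-based inverse normal transform; ties get their average rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average of the 1-based ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    std_normal = NormalDist()
    return [std_normal.inv_cdf((r - c) / (n - 2 * c + 1)) for r in ranks]


# Simulated (hypothetical) data: a right-skewed phenotype driven by age and sex.
random.seed(0)
n = 500
age = [random.uniform(40, 70) for _ in range(n)]
sex = [random.choice([0, 1]) for _ in range(n)]
pheno = [math.exp(0.03 * a + 0.3 * s + random.gauss(0, 0.5))
         for a, s in zip(age, sex)]

resid = residualize(pheno, [age, sex])  # step 1: regress out covariates
z_scores = rank_int(resid)              # step 2: INT of the residuals
# step 3 (not shown): use z_scores as the phenotype in the GWAS regression
```

Note that the transformation in step 2 is nonlinear, so `z_scores` is not guaranteed to stay uncorrelated with the covariates the way `resid` is.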

Why is this done?

gwas • 842 views

updated 15 months ago by LChart 3.9k • written 15 months ago by curious ▴ 750

1

IMO this is poor practice. The one case where it is plausibly justifiable is when the covariates affect the phenotype on the observed (i.e., non-normal) scale but not so much on the transformed (i.e., normal) scale, so the transformation has to occur after the regression. However, the r^2 of the covariates is typically fairly low in the first place, so it's really hard to justify regressing out covariates before the transformation.

At the same time, this approach is conservative in the sense that, if covariates and variants are correlated, the maximum proportion of the variance will be apportioned to the covariate as opposed to the variant. Degrees of freedom in GWAS are typically enormous, so the fact that 15 or 25 or 50 of them have been used (out of 750,000) won't matter much for p-values (but it definitely will in differential expression, where DoF are much lower).
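A rough numerical sketch of the "conservative" point, under stated assumptions: a single covariate, no transformation step between the two regressions, and simulated data with made-up effect sizes. In that simplified setting, regressing the residualized phenotype on the raw genotype gives the joint-model estimate shrunk by a factor of (1 - r^2), where r is the sample correlation between genotype and covariate. All variable names below are hypothetical.

```python
import math
import random
from statistics import mean


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def center(v):
    m = mean(v)
    return [a - m for a in v]


def project_out(v, z):
    """Remove from centered v its least-squares fit on centered z."""
    coef = dot(v, z) / dot(z, z)
    return [vi - coef * zi for vi, zi in zip(v, z)]


random.seed(1)
n = 2000
# Hypothetical covariate (e.g. an ancestry PC) that also shifts allele frequency.
covar = [random.gauss(0, 1) for _ in range(n)]
geno = [sum(random.random() < 0.3 + 0.2 * math.tanh(c) for _ in range(2))
        for c in covar]
pheno = [0.2 * g + 0.5 * c + random.gauss(0, 1) for g, c in zip(geno, covar)]

x, z, y = center(geno), center(covar), center(pheno)
y_res = project_out(y, z)  # phenotype with the covariate regressed out
x_res = project_out(x, z)  # genotype with the covariate regressed out

# Two-step: residualized phenotype regressed on the raw genotype.
beta_two_step = dot(x, y_res) / dot(x, x)
# Joint model: genotype effect with the covariate in the same regression
# (computed via the Frisch-Waugh-Lovell theorem).
beta_joint = dot(x_res, y_res) / dot(x_res, x_res)

r2 = dot(x, z) ** 2 / (dot(x, x) * dot(z, z))
print(beta_two_step, beta_joint * (1 - r2))  # agree up to rounding error
```

The shared variance between genotype and covariate is credited entirely to the covariate in the first step, which is why the two-step estimate can only shrink toward zero relative to the joint fit. With the rank-based transformation inserted between the steps, the exact (1 - r^2) relationship no longer holds.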

15 months ago by LChart 3.9k