Regress out covariate in boxplot
4 days ago
Nona ▴ 90

Good afternoon,

I have an outcome y and two covariates, e.g. sex and age. I want to draw ggplot2 boxplots of y by sex, but first I would like to "regress out" the effect of age. I read online that this can be done by plotting either the residuals or the coefficients. Which method do you prefer? Would the code below be correct?

fit <- lm(y ~ age, data = df)
df$Residual <- residuals(fit)

ggplot(data = df, aes(x = sex, y = scale(Residual))) + geom_boxplot()

Thank you!

ggplot2 • 651 views
2 days ago
LChart 5.0k

Yes, kind of. The model y ~ age also includes an intercept, so the residuals already have mean 0; scale(Residual) then forces the variance to 1 as well, so you will always end up plotting standardized data (mean 0, variance 1). Depending on your outcome this may not make sense, since the original units of y are lost. You will also run into significant trouble if age and sex are correlated, since the age-only model will "absorb" some of the sex effect into the age coefficient.
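
To make the last point concrete, here is a minimal sketch on simulated data. Everything in it is an assumption for illustration (the sample size, the effect sizes, and the fact that one sex is older on average); none of it comes from the original data. It compares the between-sex difference left in the residuals of an age-only model with the sex coefficient from a model that includes both covariates.

```r
set.seed(1)
n <- 200
sex <- factor(rep(c("F", "M"), each = n / 2))

# Assumption for illustration: group M is older on average,
# so age and sex are correlated
age <- rnorm(n, mean = ifelse(sex == "M", 60, 40), sd = 10)

# Simulated truth: sex effect of 5, age slope of 0.5
y <- 5 * (sex == "M") + 0.5 * age + rnorm(n)
df <- data.frame(y, age, sex)

# Age-only model: the age slope picks up part of the sex effect,
# so the M - F difference left in the residuals is too small
res_naive <- residuals(lm(y ~ age, data = df))
diff_naive <- unname(diff(tapply(res_naive, df$sex, mean)))

# Joint model: the sex effect is estimated alongside age
diff_joint <- unname(coef(lm(y ~ age + sex, data = df))["sexM"])

print(c(naive = diff_naive, joint = diff_joint))
```

In this setup the difference recovered from the age-only residuals should come out clearly smaller than the joint-model estimate, which is the "absorption" described above.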

Instead you could do something like:

fit <- lm(y ~ age + sex, data = df)

df.age_removed <- df
df.age_removed$age <- mean(df$age)   # fix the age across all samples
df.age_removed$y <- predict(fit, newdata=df.age_removed) + residuals(fit)

This "removes" the age effect by shifting each data point along the fitted y vs. age line (with sex held fixed) to the mean age, so the scale of the adjusted y values will not differ substantially from the original scale.
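
For completeness, a hedged end-to-end sketch of the adjustment followed by the boxplot. The data frame here is simulated as a stand-in for yours (the column names y, age and sex follow the question; the numbers are invented), and y_check is just a hypothetical helper name used to show that the predict-plus-residuals step is the same as subtracting the fitted age slope times each sample's deviation from the mean age.

```r
library(ggplot2)

set.seed(42)
n <- 100

# Simulated stand-in for the real data frame (values are made up)
df <- data.frame(
  sex = factor(sample(c("F", "M"), n, replace = TRUE)),
  age = runif(n, 20, 80)
)
df$y <- 10 + 0.3 * df$age + 2 * (df$sex == "M") + rnorm(n)

fit <- lm(y ~ age + sex, data = df)

# Fix age at its mean, then add the residuals back onto the prediction
df.age_removed <- df
df.age_removed$age <- mean(df$age)
df.age_removed$y <- predict(fit, newdata = df.age_removed) + residuals(fit)

# Equivalent closed form: y - b_age * (age - mean(age))
y_check <- df$y - unname(coef(fit)["age"]) * (df$age - mean(df$age))

# Boxplot of the age-adjusted outcome by sex, still in the units of y
p <- ggplot(df.age_removed, aes(x = sex, y = y)) +
  geom_boxplot() +
  labs(y = "y (adjusted to mean age)")
print(p)
```

Because the adjusted values stay in the original units of y, there is no need to call scale() before plotting.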
