How to adjust data for "continuous" covariates?
10.2 years ago
housebie ▴ 50

Packages such as SVA and ComBat can be used to adjust data for batch effects or biological artifacts. However, in most cases all adjusting variables are treated as factors. Could anyone suggest a package, or an easy-to-use method, that lets me adjust for "continuous" variables/covariates? I have several of them to adjust for.

I have also read about "cbcbSEQ" with a modified ComBat function (Post: How To Adjust Data For Confounding Covariates Before Pca?). Has anyone tried it for continuous variables?


In R you can just include the values of the continuous variable in your design matrix.
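
As a minimal sketch of what that could look like in base R: the sample annotation below (`group`, `age`, `rin`) is made up for illustration and is not from the original question. Numeric columns are passed to `model.matrix()` unchanged, so no special handling is needed for continuous covariates.

```r
## Hypothetical sample annotation: one factor and two continuous covariates
set.seed(1)
pheno <- data.frame(
    group = factor(rep(c("ctrl", "case"), each = 5), levels = c("ctrl", "case")),
    age   = round(runif(10, min = 20, max = 70)),
    rin   = runif(10, min = 6, max = 10)
)

## Factors are expanded into indicator columns;
## numeric covariates enter as a single column each, values as-is.
design <- model.matrix(~ group + age + rin, data = pheno)
colnames(design)
```

The resulting matrix can then be handed to whatever model-fitting function accepts a design matrix.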


@russ_hyde, thanks for your reply, but I do not quite understand how to do that. I do not want to include these covariates in my linear model. I would like to adjust for them separately in my data and then work with the adjusted residual matrix.

10.2 years ago

So maybe fit a linear model through your response variable as a function of the covariates you want to correct for. Then extract the residuals from the model and use those as your new response variable. E.g.:

## Example data:
set.seed(1)  # make the simulated example reproducible
n <- 100
dat <- data.frame(
    y   = c(rnorm(n = n, mean = 1), rnorm(n = n, mean = 2)),
    sex = rep(c('A', 'B'), each = n),
    age = c(rnorm(n = n, mean = 1), rnorm(n = n, mean = 5))
)

## Response y varies with age:
cor.test(dat$y, dat$age)

## Fit model:
lm1<- lm(y ~ sex + age, data= dat)

## Now residuals are not correlated with age:
cor.test(lm1$residuals, dat$age)

## Residuals vs original
plot(lm1$residuals, dat$y)
## ... Now use lm1$residuals instead of dat$y

Does it make sense?
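
The same idea can be extended to a whole feature-by-sample matrix, which is closer to the "adjusted residual matrix" asked for above. The sketch below uses simulated data; the names `expr`, `covs`, `age` and `rin` are assumptions for illustration only. It relies on `lm()` accepting a matrix response, which fits one regression per column.

```r
## Hypothetical data: 50 features (rows) x 20 samples (columns)
set.seed(1)
n_samples <- 20
covs <- data.frame(age = rnorm(n_samples, mean = 40, sd = 10),
                   rin = runif(n_samples, min = 6, max = 10))
expr <- matrix(rnorm(50 * n_samples), nrow = 50,
               dimnames = list(paste0("gene", 1:50), paste0("s", 1:n_samples)))
## Make every feature depend on age so there is something to remove
expr <- expr + outer(rep(0.05, 50), covs$age)

## lm() with a matrix response needs samples in rows, hence t(expr)
fit <- lm(t(expr) ~ age + rin, data = covs)

## Residuals come back as samples x features; transpose to the original layout
expr_adj <- t(residuals(fit))
dimnames(expr_adj) <- dimnames(expr)

## Largest absolute correlation between any adjusted feature and age
max(abs(cor(t(expr_adj), covs$age)))
```

Note that this removes only linear effects of the covariates, and that downstream tests on `expr_adj` do not account for the degrees of freedom used by the adjustment.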


Thanks a ton @dariober.


that sounds like a problem you should solve


@dariober I have another question. In this case, sex is the phenotype and not a covariate we want to correct for. Does this mean that the residuals we get here are independent of the phenotype (in this case, sex) and depend only on the covariates?

In other words, would using another phenotype here instead of sex have left the residuals unchanged?
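
One way to explore this question numerically, as a hedged sketch rather than the answerer's own method: residuals from `lm(y ~ sex + age)` are orthogonal to every term in the model, so the sex effect is removed along with age. An alternative is to fit the full model but subtract only the fitted age term; the variable names `y_adj` and `lm2` below are introduced purely for illustration.

```r
## Same simulated data as in the answer above
set.seed(1)
n <- 100
dat <- data.frame(
    y   = c(rnorm(n = n, mean = 1), rnorm(n = n, mean = 2)),
    sex = rep(c('A', 'B'), each = n),
    age = c(rnorm(n = n, mean = 1), rnorm(n = n, mean = 5))
)
lm1 <- lm(y ~ sex + age, data = dat)

## Residual means per sex are ~0: the phenotype effect has been removed too
tapply(residuals(lm1), dat$sex, mean)

## Subtract only the fitted (centred) age effect, leaving sex in the data
y_adj <- dat$y - unname(coef(lm1)["age"]) * (dat$age - mean(dat$age))

## Refit on the adjusted values: the age coefficient is ~0,
## while the sex coefficient is the same as in lm1
lm2 <- lm(y_adj ~ sex + age, data = dat)
round(coef(lm2), 3)
```

Changing which phenotype sits in the model changes the estimated age coefficient, so the adjusted values are generally not the same under a different phenotype.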

 ## ... Now use lm1$residuals instead of dat$y

Could you please explain why these residuals are used as the new phenotype, and how?


@Bioinformatics_NewComer, did you figure this one out? Using the residuals as adjusted values does not make sense to me. Thanks.


@dariober, I am new to this concept and have one question for you. Do we keep negative residuals as they are, or take the magnitude of the residuals as the adjusted data? Thanks.
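
A small sketch that may help with this question, using simulated data and illustrative names (`x`, `y`, `res`, `y_adj`): least-squares residuals from a model with an intercept are centred on zero, so the sign carries the information about whether an observation lies above or below the fitted line. Taking the magnitude would discard that.

```r
set.seed(1)
x <- rnorm(50)
y <- 2 + 3 * x + rnorm(50)
fit <- lm(y ~ x)
res <- residuals(fit)

## Residuals average to zero (up to floating-point error),
## so negative values are expected and meaningful
mean(res)
sum(res < 0)

## If values on the original scale are preferred, one option is to
## add the overall mean of y back to the signed residuals
y_adj <- res + mean(y)
```

Adding the mean back is a cosmetic choice; it shifts every value by the same constant and does not change correlations or differences between samples.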

5.6 years ago
fansili2013 ▴ 30

I am not sure if your answer is correct.

Please see

https://stats.stackexchange.com/questions/286850/linear-regression-confounder


In case that URL goes out of date, could you please expand on your criticisms of @dariober's answer, @fansili2013?
