Question: Limma package, how to correct by age and sex?
ellen2270 wrote (10 months ago):

Hello everyone,

I am analyzing microarray data with the limma package and have a couple of questions, as this is the first time I have used it. I have 3 conditions that I want to compare. My data looks like this:

Condition   AGE SEX
0   64  Male
0   65  Female
1   67  Male
1   60  Male
2   58  Male
2   65  Female

I want to adjust the results for sex and age, as there are slight differences in sex and age between groups. I have come up with two different versions of the code to adjust for age and sex, and they give different results. Is the code I am using correct for adjusting for age and sex? Which strategy is better?

  1. Introducing sex and age in the model:

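    library(limma)
    # Condition must be a factor so that each group gets its own design column
    targets$Condition <- factor(targets$Condition)
    design <- model.matrix(~0 + Condition + as.numeric(AGE) + SEX, targets)
    # makeContrasts needs syntactically valid column names
    colnames(design) <- make.names(colnames(design))
    fit <- lmFit(y, design)
    cont.matrix <- makeContrasts(P1 = Condition1 - Condition0,
                                 P2 = Condition1 - Condition2,
                                 P3 = Condition2 - Condition0, levels = design)

    fit2 <- contrasts.fit(fit, cont.matrix)
    fit3 <- eBayes(fit2)
    topTable(fit3, coef = 1, number = Inf, adjust.method = "BH")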
    
  2. Using the function removeBatchEffect:

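    library(limma)
    # Regress out sex (as a batch factor) and age (as a numeric covariate)
    y_correct <- removeBatchEffect(y, batch = targets$SEX, covariates = targets$AGE)
    # Condition must be a factor so that each group gets its own design column
    targets$Condition <- factor(targets$Condition)
    design <- model.matrix(~0 + Condition, targets)
    fit <- lmFit(y_correct, design)
    cont.matrix <- makeContrasts(P1 = Condition1 - Condition0,
                                 P2 = Condition1 - Condition2,
                                 P3 = Condition2 - Condition0, levels = design)

    fit2 <- contrasts.fit(fit, cont.matrix)
    fit3 <- eBayes(fit2)
    topTable(fit3, coef = 1, number = Inf, adjust.method = "BH")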
    

Can anyone confirm which method is more correct and whether it makes sense? Thank you.


I am not a statistician, so I am adding this as a comment rather than an answer. From a biological perspective, I think these slight differences in AGE probably have a minor effect. All study participants are approaching or have already reached elderly status. You are not comparing kids to adults, so the confounding effect is probably limited (or absent). I also doubt that using AGE as a continuous variable makes sense, as it assumes a roughly linear influence of AGE on gene expression. Do you think this is justified? SEX might indeed influence gene expression, but you do not have replicates for each SEX within each group (e.g. 2 men and 2 women per group), so you cannot really tell whether SEX creates additional variability beyond the non-SEX variation. I would start by checking how the samples cluster in a PCA or MDS plot and then decide whether the effort really makes sense.
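A minimal sketch of the clustering check suggested above, using base R `prcomp` on a simulated matrix. The matrix `y` and its random values are placeholders invented for illustration, not the poster's data; only the `targets` table mirrors the sample sheet shown in the question. With limma loaded, `plotMDS(y)` would be an alternative to the PCA step.

```r
# Simulated stand-in for the expression matrix: 200 probes x 6 samples.
# All expression values here are made up for illustration only.
set.seed(1)
targets <- data.frame(
  Condition = factor(c(0, 0, 1, 1, 2, 2)),
  AGE = c(64, 65, 67, 60, 58, 65),
  SEX = factor(c("Male", "Female", "Male", "Male", "Male", "Female"))
)
y <- matrix(rnorm(200 * 6), nrow = 200, ncol = 6)

# PCA on samples: prcomp expects samples in rows, so transpose
pca <- prcomp(t(y), center = TRUE, scale. = FALSE)
var_explained <- pca$sdev^2 / sum(pca$sdev^2)

# Plot PC1 vs PC2: colour encodes condition, plotting symbol encodes sex
plot(pca$x[, 1], pca$x[, 2],
     col = as.integer(targets$Condition),
     pch = ifelse(targets$SEX == "Male", 16, 17),
     xlab = sprintf("PC1 (%.0f%%)", 100 * var_explained[1]),
     ylab = sprintf("PC2 (%.0f%%)", 100 * var_explained[2]))
legend("topright", legend = c("Male", "Female"), pch = c(16, 17))
```

If samples separate by symbol (sex) along one of the leading components, that would suggest sex is worth adjusting for; if they separate mainly by colour (condition), the adjustment may matter less.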

Comment written 10 months ago by ATpoint

Thank you for your input. I only showed part of the data in the previous post, but I think you are right and age may not have much influence. Regarding sex, I have between 7 and 8 individuals per condition with different sex distributions (e.g. condition 1: 5 males and 2 females, condition 2: 5 males and 3 females, and condition 3: 5 males and 3 females). With this data, and knowing that sex influences gene expression, it may make sense to keep adjusting the analysis for sex.
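As a rough sketch of what adjusting for sex only could look like at the design stage, the base R code below rebuilds a sample sheet from the group sizes quoted above and checks that the sex column is estimable alongside the condition columns. The group labels `C1` to `C3` and the ordering of samples are assumptions made for illustration.

```r
# Sample sheet reconstructed from the quoted group sizes (labels are assumed)
targets <- data.frame(
  Condition = factor(rep(c("C1", "C2", "C3"), times = c(7, 8, 8))),
  SEX = factor(c(rep(c("Male", "Female"), times = c(5, 2)),
                 rep(c("Male", "Female"), times = c(5, 3)),
                 rep(c("Male", "Female"), times = c(5, 3))))
)

# Cross-tabulate to see how balanced sex is across conditions
print(table(targets$Condition, targets$SEX))

# One column per condition plus one adjustment column for sex
design <- model.matrix(~0 + Condition + SEX, data = targets)

# Full column rank means the sex effect can be separated from condition
full_rank <- qr(design)$rank == ncol(design)
print(full_rank)
```

A design like this would then be passed to `lmFit` in place of the one that also includes age, with contrasts defined on the condition columns only.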

Comment written 10 months ago by ellen2270

OK, I see. As I said, I would first check by PCA or MDS that there is indeed a confounding effect of SEX.

Comment written 10 months ago by ATpoint