Question

Limma design and makeContrasts

4

Entering edit mode

6.1 years ago

Matina ▴ 250

Hi,

I have data from 3 different conditions (normal, breast cancer, endometrial cancer) and different cell types (cd163_positive or negative) for each condition. I am interested in differentially expressed genes between conditions (e.g Breast_cancer vs Endometrial_cancer) but also between cell populations within the same condition but also across conditions (e.g Breast_cancercd163_positive-Breast_cancercd163_negative). Below is the pheno table and the code for making the contrasts.

I am not sure how to make the (simple) contrasts : Breast_cancer vs Endometrial_cancer , Breast_cancer vs Normal, Endometrial_cancer vs Normal using the model below.
Is the model below correct for the population comparisons ?

pData

Condition Population

Normal cd163_positive

Normal cd163_negative

Normal cd163_positive

Normal cd163_negative

Normal cd163_positive

Breast_cancer cd163_positive

Breast_cancer cd163_negative

Breast_cancer cd163_positive

Breast_cancer cd163_negative

Breast_cancer cd163_positive

Endometrial_cancer cd163_positive

Endometrial_cancer cd163_negative

Endometrial_cancer cd163_positive

Endometrial_cancer cd163_negative

Endometrial_cancer cd163_positive

Endometrial_cancer cd163_negative

f <- paste(pData$Condition,pData$:Population,sep="")
f <- factor(f)
design <- model.matrix(~0+f)
colnames(design) <- levels(f)
> colnames(design)
[1] "Breast_cancercd163_negative"      "Breast_cancercd163_positive"      "Endometrial_cancercd163_negative" "Endometrial_cancercd163_positive"
[5] "Normalcd163_negative"             "Normalcd163_positive"   

cont.matrix = makeContrasts(
  BRC_posVsBRC_neg = Breast_cancercd163_positive-Breast_cancercd163_negative, 
  Norm_posVsNorm_neg = Normalcd163_positive-Normalcd163_negative,            
  END_posVsEND_neg = Endometrial_cancercd163_positive-Endometrial_cancercd163_negative, 
  BRC_posVsNorm_pos = Breast_cancercd163_positive-Normalcd163_positive,       
  BRC_posVsEND_pos = Breast_cancercd163_positive-Endometrial_cancercd163_positive, 
  END_posVsNorm_pos  = Endometrial_cancercd163_positive-Normalcd163_positive,     
  BRC_negVsNorm_neg  = Breast_cancercd163_negative-Normalcd163_negative, 
  BRC_negVsEND_neg = Breast_cancercd163_negative-Endometrial_cancercd163_negative, 
  END_negVsNorm_neg = Endometrial_cancercd163_negative-Normalcd163_negative, 
  levels=design)

v=voomWithQualityWeights(d, design = design,plot = TRUE)
vfit <- lmFit(v, design)
vfit <- contrasts.fit(vfit, contrasts=cont.matrix)
efit <- eBayes(vfit)

I know this is fairly simple but I am a bit confused. Thank you very much!

limma design makeContrasts • 6.5k views

ADD COMMENT • link updated 6.1 years ago by theobroma22 ★ 1.2k • written 6.1 years ago by Matina ▴ 250

score 3 · Answer 1 · 2018-03-27

3

Entering edit mode

6.1 years ago

Kevin Blighe 87k

Are you just looking for confirmation of the method that you're using? There does not appear to be any particular problem.

Note that each of your comparisons in your contrast matrix will produce different statistics. You can access these via the topTable function and passing a coefficient value, as per Page 36 of the MANUAL.

ADD COMMENT • link 6.1 years ago by Kevin Blighe 87k

1

Entering edit mode

Thanks a lot Kevin!!

ADD REPLY • link 6.1 years ago by Matina ▴ 250

0

Entering edit mode

Is that okay then?

ADD REPLY • link 6.1 years ago by Kevin Blighe 87k

0

Entering edit mode

Yes I think the analysis seems fine, but my problem is still how to compare between conditions, regardless of the population column. Would it be ok to create another design and contrast matrix that I look for DEGs between i.e. Breast_cancer vs Normal?

f1 <- as.factor(pData$Condition)
design <- model.matrix(~0+f1)
colnames(design) <- levels(f1)
design

cont.matrix = makeContrasts(
  BCvsNorm = Breast_cancer-Normal,
  ENDvsNorm = Endometrial_cancer-Normal,
  BCvsEND = Breast_cancer-Endometrial_cancer,
  levels=design)

Thanks a lot! Matina

ADD REPLY • link 6.1 years ago by Matina ▴ 250

0

Entering edit mode

Yes, that would also be fine, but then this says nothing about CD163, of course. Also, after you derive the statistics for the Breast_cancer-Endometrial_cancer comparison, you will have to filter the list a 2nd and 3rd time to only include genes that are also significant in Breast_cancer-Normal and Endometrial_cancer-Normal

There are many ways of doing these differential expression analyses, as you can see. The important thing is in interpreting the results based on how you conducted the analysis.

You can also 'adjust' for covariates (for example CD163) sing Limma. For more information on that, see Gordon's answer here: Question: Removing continuous covariate effects in limma analysis

ADD REPLY • link 6.1 years ago by Kevin Blighe 87k

0

Entering edit mode

Thank you Kevin,

The idea is that i dont see that many differences between the CD163 subpopulations and that is why i want to look only at the comparisons between conditions and rather at populations. I have never thought that it is necessary to filter the gene list for the Breast_cancer-Endometrial_cancercomparison. Do you mean to double check that for example the significantly up-regulated genes between Breast_cancer-Endometrial_cancer are also significantly different between the Breast_cancer-Normal comparison?

I really appreciate taking the time to reply to me. Thanks a lot!

ADD REPLY • link 6.0 years ago by Matina ▴ 250

1

Entering edit mode

Yes, my idea regarding Breast_cancer-Endometrial_cancer and Breast_cancer-Normal relates to a comment from a reviewer for a recent manuscript of mine (accepted for publication), which was based on the endometrial cancer TCGA data. The comment/question, applied to your situation, would be something like this:

If you find a gene that is statistically significantly different between breast and endometrial cancer, how can you be sure that it has any relation to cancer without also checking it's difference between breast cancer and normal?

I guess that it's just something about which you should be aware! - it may no exactly be relevant to your end goal.

ADD REPLY • link 6.1 years ago by Kevin Blighe 87k

score 0 · Answer 2 · 2018-03-29

0

Entering edit mode

6.1 years ago

theobroma22 ★ 1.2k

An alternative would be to get the normalized expression values of each group. This way you can do a lot more in terms of analysis with the datasets, in addition to determining the DEGs.

At the end of your program, see how the below code may help you :

vfit <- lmFit(v, design)
efit <- eBayes(vfit, trend=TRUE)
write.table(topTable(efit, n=Inf,    
adj.pvalue=1, ...), ...)

So, exclude the contrasts fit code, vfit <- contrasts.fit(vfit, contrasts=cont.matrix)

ADD COMMENT • link 6.1 years ago by theobroma22 ★ 1.2k

0

Entering edit mode

Hi theobroma22,

Thanks for your reply but I am not sure I get what you mean. How can I get the explicit comparison between Breast vs Normal regardless of the population?

ADD REPLY • link 6.1 years ago by Matina ▴ 250