Question about more complex makeContrasts from limma
Entering edit mode
4.2 years ago
tara ▴ 30

Hello, I am doing a RNA-Seq analysis with R using the limma package. I have RNA-Seq data of different mutant lines of a model organism which have two different phenotypes in comparison with the siblings as control. The samples are like that:

  • line 1, phenotype 1, mutant and sibling
  • line 2, phenotype 1, mutant and sibling
  • line 3, phenotype 2, mutant and sibling
  • line 4, phenotype 2, mutant and sibling
  • line 5, phenotype 2, mutant and sibling

That means line 1 and 2 have the similar phenotype 1 and line 3, 4, and 5 have the similar phenotype 2 which differ from phenotype 1. Let's say phenotype 1 has more cells than the control and phenotype 2 has less cells than the control.

I did the RNA-Seq analysis for each line, but now I want to compare the lines of one phenotype. I want to find the genes which are different in the lines with the same phenotype in comparison to the siblings. My makeContrast command looks like that:

cont.matrix <- makeContrasts(pheno1 = (line1.mut+line2.mut)-(line1.sib+line2.sib),
pheno2 = (line3.mut+line4.mut+line5.mut)-(line3.sib+line4.sib+line5.sib),

Finally I want to find genes (eg. genes involved in the cell cycle) which are in phenotype 1 up and in phenotype 2 down regulated or the other way around.

cont.matrix <- makeContrasts(pheno1vspheno2 = ((line1.mut+line2.mut)-(line1.sib+line2.sib)) -

I am not sure if a can do a contrast matrix like this. Do you think this will give me the genes I am interested in?

RNA-Seq R • 1.9k views
Entering edit mode
4.2 years ago
russhh 5.7k

Let's suppose theres a gene that is 2-fold upregulated in phenotype1 relative to it's siblings and that is also 2-fold upregulated in phenotype2 relative to it's siblings (regardless of the lines).

So this is a gene where the phenotype-induced change in expression is identical.

log2(2) = 1, so we expect line1.mut - line1.sib ~ 1, line2.mut - line2.sib ~ 1, line3.mut - line3.sub ~ 1 and so on.

Now your contrasts:

pheno1 = (line1.mut+line2.mut)-(line1.sib+line2.sib)
    =  (line1.mut - line1.sib) + (line2.mut - line2.sib)
    ~ 1 + 1 = 2

pheno2 = (line3.mut+line4.mut+line5.mut)-(line3.sib+line4.sib+line5.sib)
    =  (line3.mut - line3.sib) + (line4.mut - line4.sib) + (line5.mut - line5.sib)
    ~ 1 + 1 + 1 = 3

and as a result:

pheno1vspheno2 = ((line1.mut+line2.mut)-(line1.sib+line2.sib)) -
    = pheno1 - pheno2 ~ 2 - 3 = -1

So, although this gene was induced to the exact-same-extent in mutant vs sib in each of your individual lines, there appears to be a two-fold-lower induction in phenotype1 relative to phenotype2. So something is wrong with the maths underlying your contrasts.

To prevent this, you should normalise your contrasts by the number of lines:

pheno1 = (line1.mut+line2.mut) / 2 - (line1.sib+line2.sib) / 2

pheno2 = (line3.mut+line4.mut+line5.mut) / 3 - (line3.sib+line4.sib+line5.sib) / 3

pheno1vspheno2 = ((line1.mut+line2.mut)-(line1.sib+line2.sib)) / 2 - 
    ((line3.mut+line4.mut+line5.mut)-(line3.sib+line4.sib+line5.sib)) / 3
Entering edit mode

Thank you very much for your explanation. This helped me a lot!

Entering edit mode
4.2 years ago

You may want to double-check this but it seems to me that with:

pheno1vspheno2 = ((line1.mut+line2.mut)-(line1.sib+line2.sib)) -

You are looking for an interaction between mutant/sibling state and phenotype. I.e. genes that respond differently between mutant and sibling depending on whether they are in phenotype 1 or 2.

EDIT russhh in his nice answer points out that the contrasts should be averaged. So the above should in fact be (please check ok):

pheno1vspheno2 = ((line1.mut+line2.mut)/2 - (line1.sib+line2.sib)/2) -
    ((line3.mut+line4.mut+line5.mut)/3 - (line3.sib+line4.sib+line5.sib)/3)

If you want the difference between phenotype 1 and 2 regardless of mutant/sibling state you could use:

pheno1vspheno2 = (line1.mut + line2.mut + line1.sib + line2.sib)/4 - 
                 (line3.mut + line4.mut + line5.mut + line3.sib + line4.sib + line5.sib)/6,

I would suggest you add to your count matrix one or more a dummy genes with behaviour that you want to pick up as differential or not significant to test whether the contrasts do what you expect.


Login before adding your answer.

Traffic: 3401 users visited in the last hour
Help About
Access RSS

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6