RNAseq analysis with edgeR-interaction terms and comparison within and between groups
1
0
Entering edit mode
4.1 years ago
erwagner2 • 0

I am struggling with creating the appropriate model matrix for the type of comparisons I want to make. The data I have are from four different yeast strains, two conditions, and three replicates (total of 24 samples) (ex: Y184-YPD-rep1, Y184-NaCl-rep1, etc.). I want to make comparisons between the two conditions within a strain and make comparisons between the strains in each condition. I need to account for a strain-condition interaction as well as their additive effect in my model, so my first design was:

design.1<-model.matrix(~0 + strain*condition + rep)

but then the column names for this design are:

colnames(design.1)
 [1] "strainbcy1"                  "strainira2"                  "strainira2bcy1"              "strainY184"                
 [5] "conditionYPD"                "rep2"                        "rep3"                        "strainira2:conditionYPD"    
 [9] "strainira2bcy1:conditionYPD" "strainY184:conditionYPD"

which to me makes it seem like the NaCl condition isn't there, and I can't make a comparison with it?

So I've also tried this model, which accounts for only the interaction between strain and condition:

design.2<-model.matrix(~0 + strain:condition + rep)

and the column names makes it seem like everything is being accounted for

colnames(design.2)
 [1] "rep1"                         "rep2"                         "rep3"                        
 [4] "strainbcy1:conditionNaCl"     "strainira2:conditionNaCl"     "strainira2bcy1:conditionNaCl"
 [7] "strainY184:conditionNaCl"     "strainbcy1:conditionYPD"      "strainira2:conditionYPD"    
[10] "strainira2bcy1:conditionYPD"  "strainY184:conditionYPD"

But then moving on to modeling the dispersion with design.2, I get an error that I can't really figure out how to fix:

my.yglm.2<-estimateDisp(my.dge, design.2)
Error in glmFit.default(sely, design, offset = seloffset, dispersion = 0.05,  :
  Design matrix not of full rank.  The following coefficients not estimable:
 strainY184:conditionYPD

I've been messing around with a bunch other models, but nothing gives me the coefficient names I want in order to set the contrasts I want.

In another design, I use a grouping factor that combines the strain and condition into one (strain_condition -> listed like Y184_YPD) so I'm not considering the strain and condition separately, but I'm worried that it's not really accounting for the interaction I expect:

design.3<-model.matrix(~0 + strain_condition+rep)
colnames(design.3)
 [1] "strain_conditionbcy1_NaCl"     "strain_conditionbcy1_YPD"      "strain_conditionira2_NaCl"     "strain_conditionira2_YPD"      "strain_conditionira2bcy1_NaCl" "strain_conditionira2bcy1_YPD"
 [7] "strain_conditionY184_NaCl"     "strain_conditionY184_YPD"      "rep2"                          "rep3"

Any advice on where to go from here would be great!

RNA-Seq edgeR • 2.2k views
ADD COMMENT
0
Entering edit mode

You cross-posted this on Bioconductor: https://support.bioconductor.org/p/129194/

In future, when you do this, could you please mention this clearly in your question so that neither community duplicates efforts. We are volunteers here, after all. Thank you!

ADD REPLY
0
Entering edit mode
4.0 years ago
Ram 43k

This question has been answered on BioC support. https://support.bioconductor.org/p/129194/#129217 - I am marking this as solved.

ADD COMMENT

Login before adding your answer.

Traffic: 4027 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6