Question: RNA Seq: Differential expression model formula
rmf wrote (3.2 years ago):

My question is about the model formula used in differential expression analyses in applications such as DESeq2, edgeR, Sleuth etc.

My dataset looks like this (there are more replicates; the table is reduced here).

sample   tissue   family   replicate   condition
a11c     a        1        1           c
a12c     a        1        2           c
a21c     a        2        1           c
a22c     a        2        2           c
b11c     b        1        1           c
b12c     b        1        2           c
b21c     b        2        1           c
b22c     b        2        2           c
a11t     a        1        1           t
a12t     a        1        2           t
a21t     a        2        1           t
a22t     a        2        2           t
b11t     b        1        1           t
b12t     b        1        2           t
b21t     b        2        1           t
b22t     b        2        2           t

I have 2 tissues (a and b) and 2 conditions (c = control, t = treated), and I also have families. I am not really interested in differentially expressed genes/transcripts (DEG/DET) between tissues; I am interested in DEG/DET between control and treated in both tissues. What is the correct way to specify this model?

~tissue+condition
~tissue*condition
~tissue:condition

Since I am not that interested in DEGs between tissues, would it make sense to split the data into 2 datasets by tissue and analyse each one separately?

subset(df,tissue=="a")
~condition
subset(df,tissue=="b")
~condition

Family is an additional variable that is not critical, but it would be interesting to inspect. Can I just add it to the original model? Also, does the order of the terms matter?

~tissue+condition+family

Any other considerations for such analyses? Thanks.

Devon Ryan (Freiburg, Germany) wrote (3.2 years ago):

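Use ~tissue*condition. You may not care about things like the tissue effect, but it will still be there in the data, so the model has to account for it. The interaction term additionally lets the treatment effect differ between the two tissues.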
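
To see how the three formulas from the question differ, it can help to look at the design matrices they produce. The following is only a sketch in base R, assuming R's default treatment contrasts; the `samples` table and the `mm_*` names are made up for illustration and simply mirror the layout in the question.

```r
# Toy sample table mirroring the layout in the question (hypothetical names).
samples <- expand.grid(
  replicate = factor(1:2),
  family    = factor(1:2),
  tissue    = factor(c("a", "b")),
  condition = factor(c("c", "t"))
)

# Main effects only: one shared treatment effect for both tissues.
mm_add <- model.matrix(~ tissue + condition, data = samples)

# Main effects plus interaction: the treatment effect may differ by tissue.
mm_int <- model.matrix(~ tissue * condition, data = samples)

# Interaction term only, together with the default intercept.
mm_only <- model.matrix(~ tissue:condition, data = samples)

# Inspect which coefficients each design would estimate.
colnames(mm_add)
colnames(mm_int)

# Compare the number of columns with the matrix rank; a rank below the
# column count means not all coefficients can be estimated.
c(columns = ncol(mm_only), rank = qr(mm_only)$rank)
```

If the rank comes out lower than the number of columns for the interaction-only design, that design is not full rank as written, which is one practical reason to prefer ~tissue*condition here.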

Regarding splitting: you can do that, but you'll have decreased power (there won't be as much variance shrinkage), so I would suggest that you keep everything in one model.
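
Keeping everything in one model does not stop you from getting a control vs treated comparison for each tissue. Here is a rough illustration on a single made-up, noise-free "gene", using plain `lm` as a stand-in for the GLMs that DESeq2 or edgeR fit; the coefficient values and object names are assumptions chosen only for the example.

```r
# Toy, noise-free single-gene example (all values are made up).
samples <- expand.grid(
  replicate = 1:2,
  tissue    = factor(c("a", "b")),
  condition = factor(c("c", "t"))
)
mm <- model.matrix(~ tissue * condition, data = samples)

# Hypothetical true coefficients on a log scale: baseline, tissue b shift,
# treatment effect in tissue a, extra treatment effect in tissue b.
beta <- c(5, 1, 2, 0.5)
samples$y <- as.vector(mm %*% beta)

fit <- lm(y ~ tissue * condition, data = samples)
b <- coef(fit)

# Treatment effect in tissue a: the condition coefficient on its own,
# because tissue a is the reference level.
effect_a <- unname(b["conditiont"])

# Treatment effect in tissue b: condition coefficient plus the interaction.
effect_b <- unname(b["conditiont"] + b["tissueb:conditiont"])
```

The same idea carries over to the count-based packages: the treated vs control effect in the reference tissue is the condition coefficient, and in the other tissue it is that coefficient plus the interaction term. The exact coefficient names and how contrasts are specified differ between packages, so check the relevant vignette.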

You can certainly add family to any of the designs as you showed. If you do that, please ensure that family is a factor. I don't think it'll cause a problem for your current experiment, but if you have more than two families and family is stored as a number rather than a factor, it will be treated as a continuous covariate and you'll get some messed up results.
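
A quick way to check what happens with family is to compare the design matrix before and after converting it to a factor. This is a sketch with a hypothetical sample sheet containing three families (the question only has two), using base R only.

```r
# Hypothetical sample sheet with three families coded as numbers.
samples <- data.frame(
  family    = rep(1:3, each = 4),
  condition = factor(rep(c("c", "t"), times = 6))
)

# family left numeric: it enters the model as a single slope column.
mm_numeric <- model.matrix(~ family + condition, data = samples)

# family converted to a factor: one indicator column per non-reference family.
samples$family <- factor(samples$family)
mm_factor <- model.matrix(~ family + condition, data = samples)

colnames(mm_numeric)
colnames(mm_factor)
```

With the numeric coding the model assumes family 3 differs from family 1 by exactly twice as much as family 2 does, which is almost never what is intended.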
