DEG analysis of RNA-seq data across multiple tissues and two conditions
6 months ago
BioinfGuru ★ 2.1k

Hi all,

Pretty straightforward question. I've not seen this approach before, so ... red flag.

My data set:

  • Over 100 samples
  • 5 tissues (~20 samples per tissue)
  • 2 batches (pigs from 2 farms, ~50-60 samples per batch)
  • 2 conditions (high and low feed efficiency, ~50-60 samples per condition)
  • All combinations of tissue/batch/condition contain 3-6 samples

The primary question is to find differentially expressed genes (DEGs) within each tissue, but I would also like to see if there is an overall set of DEGs across tissues.

Question: Can I treat "tissue" as a covariate? That is, can I get DESeq2/edgeR to treat it like a batch effect and adjust for the differences between tissues, so that I get an overall picture of the difference between the 2 conditions regardless of tissue type? It seems intuitively more powerful than getting DEGs for each tissue and visually comparing the lists.

Thanks, and apologies if this is a complete no-no. Kenneth.

RNA-seq EdgeR DEGs DESeq2 • 1.1k views
6 months ago
LChart 4.3k

I don't think this approach will give you what you want. Combining multiple tissues in the way you are suggesting will result in a positive set consisting of genes strongly differentially expressed in one single tissue, genes moderately differentially expressed in a few tissues, and genes mildly differentially expressed across all tissues.

It seems to me the null that you want is "0 change in one or more tissues" (so that the alternative is a change in all tissues) -- and the easiest way to get that is (indeed) to generate the Venn diagram (or UpSet plot) of the per-tissue DE gene lists.
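As a rough illustration of that list comparison, here is a minimal Python sketch that counts genes per exact tissue combination (the numbers an UpSet plot displays) and extracts the genes called DE in every tissue. The tissue names, gene IDs, and the helper names `upset_counts` and `shared_in_all` are all made up for the example; the real input would be the significant-gene lists from each per-tissue analysis.

```python
from collections import Counter

def upset_counts(de_by_tissue):
    """Count genes for each exact combination of tissues in which they are DE."""
    membership = {}
    for tissue, genes in de_by_tissue.items():
        for gene in genes:
            membership.setdefault(gene, set()).add(tissue)
    return Counter(frozenset(tissues) for tissues in membership.values())

def shared_in_all(de_by_tissue):
    """Genes called DE in every tissue (intersection of all lists)."""
    gene_sets = [set(genes) for genes in de_by_tissue.values()]
    return set.intersection(*gene_sets) if gene_sets else set()

# Hypothetical per-tissue DE calls: placeholder names, not real results.
de_by_tissue = {
    "liver": {"geneA", "geneB", "geneC"},
    "muscle": {"geneA", "geneC", "geneD"},
    "fat": {"geneA", "geneE"},
}

core = shared_in_all(de_by_tissue)
counts = upset_counts(de_by_tissue)

# Largest tissue combinations first, then alphabetical.
for tissues, n in sorted(counts.items(), key=lambda kv: (-len(kv[0]), sorted(kv[0]))):
    print(sorted(tissues), n)
print("DE in all tissues:", sorted(core))
```

Note that a plain intersection ignores the direction of change: a gene that is up in one tissue and down in another still lands in the shared set, so it may be worth splitting the lists by sign of the fold change before intersecting.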

And that's before getting into the issue that including tissue as a covariate only adjusts for changes in mean expression -- but the variance will also differ (substantially) across tissues, which violates the homoskedasticity assumption.
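To make the variance point concrete, here is a small Python sketch with made-up counts for a single gene that has the same mean in two tissues but very different spread. It uses a textbook method-of-moments estimate of the negative binomial dispersion (variance = mu + alpha * mu^2). This is only an illustration, not what DESeq2 or edgeR compute internally; the tissue names, the counts, and the helper `nb_dispersion_mom` are assumptions for the example.

```python
from statistics import mean, variance

def nb_dispersion_mom(counts):
    """Method-of-moments NB dispersion: var = mu + alpha * mu^2.

    Solves for alpha and clips at 0 when the data are not overdispersed.
    """
    mu = mean(counts)
    var = variance(counts)  # sample variance (n - 1 denominator)
    return max((var - mu) / mu**2, 0.0)

# Hypothetical normalized counts for ONE gene; both tissues have mean 100.
tissue_counts = {
    "liver": [100, 120, 80, 110, 90],
    "muscle": [100, 180, 40, 150, 30],
}

dispersions = {t: nb_dispersion_mom(c) for t, c in tissue_counts.items()}
for tissue, alpha in dispersions.items():
    print(f"{tissue}: dispersion ~ {alpha:.3f}")
```

A tissue covariate would absorb a difference in means, but here there is none to absorb; the difference is entirely in the dispersion, which a single pooled model fits as one value per gene.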


Thanks LChart, I accept that. I'll wait to see if anyone else wants to weigh in and then accept the answer.

6 months ago

I would not put totally different tissues in the same DESeq object. I don't think that's going to do good things for normalization or dispersion estimates.

Make each tissue its own object, use the design ~ batch + condition, and compare the two conditions within each tissue. Then compare the gene lists between tissues.
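The per-tissue split can be prepared before any DE tool is involved. Below is a minimal Python sketch, assuming a sample sheet with `tissue`, `batch`, and `condition` columns (the column names, sample IDs, and helper functions are hypothetical). It groups samples by tissue and checks that every batch x condition cell has replicates, which is stricter than the additive model needs but matches the 3-6 samples per cell described in the question.

```python
from collections import Counter, defaultdict

def split_by_tissue(samples):
    """Group sample records by tissue so each tissue is analysed on its own."""
    per_tissue = defaultdict(list)
    for s in samples:
        per_tissue[s["tissue"]].append(s)
    return dict(per_tissue)

def has_replicated_cells(tissue_samples, min_reps=2):
    """True if every batch x condition cell has at least min_reps samples."""
    counts = Counter((s["batch"], s["condition"]) for s in tissue_samples)
    batches = {s["batch"] for s in tissue_samples}
    conditions = {s["condition"] for s in tissue_samples}
    # Counter returns 0 for a missing cell, so empty cells fail the check.
    return all(counts[(b, c)] >= min_reps for b in batches for c in conditions)

# Hypothetical sample sheet: 2 tissues x 2 farms x 2 conditions x 3 replicates.
samples = [
    {"sample": f"{t}_{b}_{c}_{r}", "tissue": t, "batch": b, "condition": c}
    for t in ("liver", "muscle")
    for b in ("farm1", "farm2")
    for c in ("high", "low")
    for r in range(1, 4)
]

per_tissue = split_by_tissue(samples)
report = {t: has_replicated_cells(rows) for t, rows in per_tissue.items()}
print(report)
```

Each per-tissue sample table, together with the matching columns of the count matrix, would then go into its own DESeq2 object with the design `~ batch + condition`.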


great guidance ... thank you
