DESeq2 on metagenome KO counts
22 days ago

Hi all,

I have metagenome data, for which I aggregated raw counts per KO × sample. I want to test for differential abundance between two time groups using DESeq2. After that, I want to show abundance heatmaps and a volcano plot.

My first question is about the DESeq2 analysis. Is it valid to use DESeq2 on KO-level metagenome counts? I don’t have a “true control,” so I plan to set one group as the reference and interpret log2FC relative to that.

Second, is it acceptable to plot a heatmap using VST-transformed values? Alternatively, I could take the top 50 significant KOs from DESeq2, extract their CPM values, and plot a CPM heatmap, but I expect the visual patterns to differ a bit because VST and CPM are different scales.
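To make the scale difference concrete, here is a minimal sketch of how CPM values could be computed from a KO × sample count table. It assumes the library size is simply the sum of all KO counts in a sample; the function names, the pseudocount, and the toy table are illustrative only and are not taken from DESeq2 or any other package. VST is not reproduced here because it depends on the fitted dispersion trend.

```python
import math

def cpm(counts):
    """Counts per million.

    counts: dict mapping KO id -> list of raw counts, one entry per sample.
    Assumption: library size = sum of all KO counts in that sample.
    """
    n_samples = len(next(iter(counts.values())))
    lib_sizes = [sum(row[j] for row in counts.values()) for j in range(n_samples)]
    return {
        ko: [row[j] / lib_sizes[j] * 1e6 for j in range(n_samples)]
        for ko, row in counts.items()
    }

def log2_cpm(cpm_table, pseudocount=1.0):
    """Log-scale CPM; the pseudocount (an arbitrary choice) avoids log2(0)."""
    return {
        ko: [math.log2(v + pseudocount) for v in row]
        for ko, row in cpm_table.items()
    }

# Hypothetical toy table: two KOs, two samples.
toy_counts = {"K00001": [90, 10], "K00002": [10, 90]}
toy_cpm = cpm(toy_counts)
toy_log2_cpm = log2_cpm(toy_cpm)
```

Plain CPM is linear in the counts, so a few very abundant KOs can dominate the colour range of a heatmap, whereas a log-like scale compresses them.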

Thank you very much.

abundance KEGG KO deseq metagenome gene • 1.4k views

The DESeq2 developer has advised against using DESeq2 for metagenomics many times. Please search for related posts over at support.bioconductor.org, where he recommends alternatives.


Okay, I will take a look. But I have also come across a lot of recent publications where people are using DESeq2 for metagenomics. So I'm a little confused...


"So I'm a little confused..."

Because he never tested DESeq2 for metagenomics and is not an expert in this field: https://support.bioconductor.org/p/128871/

There is no consensus in this field. Here are a couple of papers that may help you determine which methods work best for your data: 1) https://pmc.ncbi.nlm.nih.gov/articles/PMC10461514/; 2) https://pubmed.ncbi.nlm.nih.gov/32746888/

21 days ago

I wouldn’t use DESeq2 on raw counts aggregated per KO, because doing so means accepting two fairly big assumptions:

  1. All genes assigned to the same KO have the same length, which usually isn’t true.
  2. KOs make up a small part of the total coding sequence. When DESeq2 normalises for sequencing depth, it assumes that the proportion of CDS with a KO assignment stays roughly the same across samples. If that’s not the case, the normalisation might not work well.
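
Both assumptions can be illustrated with a small sketch. The first function shows how the same number of KO copies yields different read counts when the underlying genes differ in length; the second is a simplified re-implementation of the median-of-ratios idea behind DESeq2's size factors, written from scratch for illustration and not DESeq2's actual code. All names, the coverage value, and the toy numbers are hypothetical.

```python
import math
from statistics import median

def expected_reads(copies, length_bp, reads_per_copy_per_kb=10):
    """Expected read count for a gene, proportional to copy number and length.
    reads_per_copy_per_kb is an arbitrary, made-up coverage constant."""
    return copies * length_bp / 1000 * reads_per_copy_per_kb

# Assumption 1: the same KO is carried by a 500 bp gene in sample A and by a
# 2000 bp gene in sample B, with identical copy numbers in both samples.
reads_a = expected_reads(copies=100, length_bp=500)
reads_b = expected_reads(copies=100, length_bp=2000)
length_ratio = reads_b / reads_a  # 4.0, although the KO copy number is unchanged

def size_factors(counts):
    """Median-of-ratios size factors (simplified sketch).

    counts: dict mapping feature id -> list of counts, one entry per sample.
    Only features with non-zero counts in every sample are used.
    """
    n_samples = len(next(iter(counts.values())))
    # Log geometric mean of each usable feature across samples.
    log_geo_means = {
        f: sum(math.log(c) for c in row) / n_samples
        for f, row in counts.items()
        if all(c > 0 for c in row)
    }
    factors = []
    for j in range(n_samples):
        log_ratios = [math.log(counts[f][j]) - lgm
                      for f, lgm in log_geo_means.items()]
        factors.append(math.exp(median(log_ratios)))
    return factors

# Assumption 2: size factors are estimated only from the features in the
# table. Here sample 2 has exactly twice the KO counts of sample 1.
toy_ko_counts = {"K1": [10, 20], "K2": [100, 200], "K3": [50, 100]}
sf = size_factors(toy_ko_counts)
```

In the toy table the second size factor is twice the first. That doubling gets the same correction whether it comes from deeper sequencing or from a larger share of reads falling in KO-annotated genes, and KO counts alone cannot tell those two cases apart.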

Hmm, that is indeed a good point! So, I’ll run DESeq2 at the gene level instead, and then map the significant genes back to their corresponding KOs for interpretation.
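
As a rough sketch of that last step, the mapping back to KOs could look something like the following. It assumes the significant genes are available as a gene → log2 fold change table and that a gene → KO annotation exists; the function name, both tables, and the gene ids are hypothetical.

```python
from collections import defaultdict

def summarise_by_ko(sig_genes, gene_to_ko):
    """Count significant up- and down-regulated genes per KO.

    sig_genes: dict mapping gene id -> log2 fold change (significant genes only).
    gene_to_ko: dict mapping gene id -> KO id; unannotated genes are absent.
    """
    summary = defaultdict(lambda: {"up": 0, "down": 0})
    for gene, lfc in sig_genes.items():
        ko = gene_to_ko.get(gene)
        if ko is None:
            continue  # gene has no KO assignment, so it is skipped
        summary[ko]["up" if lfc > 0 else "down"] += 1
    return dict(summary)

# Hypothetical example: gene_4 has no KO annotation.
toy_sig = {"gene_1": 2.1, "gene_2": -1.5, "gene_3": 0.9, "gene_4": 3.0}
toy_map = {"gene_1": "K00001", "gene_2": "K00001", "gene_3": "K00002"}
ko_summary = summarise_by_ko(toy_sig, toy_map)
```

Keeping separate up and down counts makes it visible when genes in the same KO change in opposite directions, which a single aggregated KO count would hide.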
