Hi all,
I have metagenome data, with raw counts aggregated per KO × sample. I want to test for differential abundance between two time groups using DESeq2. After that, I want to show abundance heatmaps and a volcano plot.
My first question is about the DESeq2 analysis. Is it valid to use DESeq2 on KO-level metagenome counts? I don’t have a “true control,” so I plan to set one time group as the reference level and interpret log2FC relative to that.
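To make the reference-level convention concrete, here is a minimal Python sketch of what "log2FC relative to a reference" means arithmetically. The abundances and the group names `early` and `late` are made up for illustration, and the plain log2 ratio of group means is only a stand-in: it is not DESeq2's model-based estimate.

```python
import numpy as np

# Hypothetical normalized abundances for a single KO (made-up numbers).
early = np.array([100.0, 120.0, 80.0])   # time group chosen as the reference
late = np.array([400.0, 380.0, 420.0])   # time group being compared to it


def log2_fold_change(group, reference):
    """Plain log2 ratio of group means (not DESeq2's GLM-based estimate)."""
    return float(np.log2(group.mean() / reference.mean()))


# Positive value: the KO is more abundant in `late` than in the reference.
lfc_late_vs_early = log2_fold_change(late, early)

# Swapping the reference only flips the sign; the magnitude is unchanged.
lfc_early_vs_late = log2_fold_change(early, late)

print(lfc_late_vs_early, lfc_early_vs_late)
```

Which group serves as the reference therefore only changes the direction in which the fold change is read; neither group has to be a control in the experimental sense.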
Second, is it acceptable to plot a heatmap using VST-transformed values? Alternatively, I could take the top 50 significant KOs from DESeq2, extract their CPM values, and plot a CPM heatmap, but I expect the visual patterns to differ a bit because VST and CPM are different scales.
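The scale difference can be illustrated with a small Python sketch on a made-up count matrix. Note the assumptions: `log2(CPM + 1)` is used here only as a simple stand-in for a log-like, variance-stabilized scale and is not DESeq2's VST, and the helper names `cpm` and `row_zscore` are hypothetical.

```python
import numpy as np

# Made-up KO x sample count matrix (3 KOs, 4 samples).
counts = np.array([
    [5.0, 10.0, 200.0, 400.0],          # KO_a: strong change across samples
    [500.0, 450.0, 520.0, 480.0],       # KO_b: roughly stable
    [9495.0, 9540.0, 9280.0, 9120.0],   # KO_c: dominant, slight decrease
])


def cpm(mat):
    """Counts per million: scale each sample (column) to a total of 1e6."""
    library_sizes = mat.sum(axis=0, keepdims=True)
    return mat / library_sizes * 1e6


def row_zscore(mat):
    """Center and scale each KO (row), as is commonly done for heatmaps."""
    centered = mat - mat.mean(axis=1, keepdims=True)
    return centered / mat.std(axis=1, keepdims=True)


cpm_values = cpm(counts)
log_values = np.log2(cpm_values + 1.0)  # stand-in for a log-like scale, not VST

# Row-scaled values that a heatmap would color, on each scale.
z_cpm = row_zscore(cpm_values)
z_log = row_zscore(log_values)

print(np.round(z_cpm, 2))
print(np.round(z_log, 2))
```

Because the log transform is nonlinear, the row-scaled values differ between the two matrices, so the heatmap colors shift even though the underlying counts are identical.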
Thank you very much.
The DESeq2 developer has advised against using DESeq2 for metagenomics many times. Please search for the related posts at support.bioconductor.org, where he recommended alternatives.
Okay, I will take a look. But I have also come across a lot of recent publications where people are using DESeq2 for metagenomics. So I'm a little confused...
He advises against it because he has never tested DESeq2 on metagenomic data and is not an expert in this field: https://support.bioconductor.org/p/128871/
There is no consensus in this field. Here are a couple of papers that may help you determine which methods work best for your data: 1) https://pmc.ncbi.nlm.nih.gov/articles/PMC10461514/; 2) https://pubmed.ncbi.nlm.nih.gov/32746888/