I would very much like feedback on this metatranscriptomic/DESeq2 issue from a biostatistics perspective.
I have 19 individual RNA-seq samples corresponding to different locations in the environment. They were barcoded and sequenced separately on an Illumina HiSeq, and then the reads from all samples were used together to assemble a metatranscriptome.
Reads from each of the 19 samples were mapped to this metatranscriptome. Based on our annotations, we have many different microbial groups present.
What I would like to do is compare transcript abundance across locations for each major taxonomic group. I have subset the contigs and their raw counts corresponding to groups A, B, and C. I have normalized the counts for each group separately in DESeq2 and exported the normalized counts (pseudo-counts).
I would like to generate transcriptome profiles (heatmaps) for each of these groups and look for changes in normalized transcript abundance across sites and among groups. The question is: is it incorrect to display these pseudo-counts side by side in the same heatmap? (I think yes.) Is there a better way to do this if you wish to directly compare normalized transcript abundance across independently normalized taxonomic groups?
I don't think the answer is to normalize groups A, B, and C all together and then subset them afterwards, because there may be important differences in expression tendencies between groups.
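To make the concern concrete, here is a minimal Python sketch of median-of-ratios size factors, which is the general idea behind DESeq2's default normalization. It is only an illustration under stated assumptions: the counts are invented toy values (3 samples rather than 19), the function names `size_factors` and `normalize` are hypothetical, and this is not DESeq2's own code. My actual normalization was done in DESeq2 in R.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors for a contigs x samples count matrix.

    Contigs with a zero count in any sample are skipped, because their
    geometric mean across samples is zero. Assumes at least one contig
    has non-zero counts in every sample.
    """
    n_samples = len(counts[0])
    log_ratios = [[] for _ in range(n_samples)]
    for row in counts:
        if any(c <= 0 for c in row):
            continue
        # Log of the geometric mean of this contig across samples.
        log_geo_mean = sum(math.log(c) for c in row) / n_samples
        for j, c in enumerate(row):
            log_ratios[j].append(math.log(c) - log_geo_mean)
    # One size factor per sample: median ratio to the geometric mean.
    return [math.exp(median(r)) for r in log_ratios]


def normalize(counts):
    """Divide each sample's counts by that sample's size factor."""
    sf = size_factors(counts)
    return [[c / s for c, s in zip(row, sf)] for row in counts]


# Toy counts, invented for illustration (rows = contigs, columns = samples).
group_a = [[10, 20, 40], [5, 10, 20], [8, 16, 32]]        # doubles per site
group_b = [[100, 100, 100], [50, 50, 50], [30, 30, 30]]   # flat across sites

sf_a = size_factors(group_a)               # group A normalized on its own
sf_b = size_factors(group_b)               # group B normalized on its own
sf_all = size_factors(group_a + group_b)   # both groups normalized together

norm_a_separate = normalize(group_a)
```

In this toy case, group A on its own gets size factors of about 0.5, 1, and 2, so its site-to-site trend is normalized away entirely, whereas the joint run gives about 0.71, 1, and 1.41. The same raw counts therefore end up on different scales depending on which contigs are included, which is why I am unsure that independently normalized values can share one heatmap.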
I am not actually interested in a pairwise analysis here, mainly just the changes in transcript abundance with location and group.
Apologies if there is an obvious answer here. Thank you so much for your time and thoughts.