I want to analyze RNA-seq datasets generated from different experiments and compare the expression of selected genes across those samples. One experiment has 2 samples (WT1, Drug1), and the other has 4 samples (WT2, Drug2, Drug3, Drug4). The WT1 and WT2 samples are similar but not identical. What I want to do is compare the expression of selected gene sets between the Drug1 and Drug2 samples.
I used kallisto to quantify all 6 samples together, and used either DESeq2 or sleuth for cross-sample normalization. I can extract the cross-sample normalized TPM values from sleuth or DESeq2. Now I want to look at the normalized TPMs in the Drug1 and Drug2 samples to check whether one of them is higher or lower. But I have a few questions:
- Does this approach sound reasonable, or are there better ways to do it?
- Because the datasets are from different labs/experiments, how do I know that the cross-sample normalization worked?
- I tried plotting boxplots of the TPMs extracted from DESeq2, and some of the medians are slightly different. Does that mean I need to use a different normalization approach?
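To make the boxplot question concrete, here is a minimal sketch of the kind of check I mean. It is not DESeq2's or sleuth's own code: `median_of_ratios_size_factors` is a hypothetical NumPy helper written in the spirit of DESeq2's median-of-ratios size factors, and the abundance matrix is simulated toy data (including the made-up `depth` values), not my real samples.

```python
import numpy as np

def median_of_ratios_size_factors(abundance):
    """Hypothetical helper: per-sample size factors from a genes x samples matrix.

    Each sample's factor is the median, over genes, of the ratio between the
    sample's value and the gene's geometric mean across all samples.
    """
    abundance = np.asarray(abundance, dtype=float)
    # Only genes that are non-zero in every sample have a usable geometric mean.
    keep = np.all(abundance > 0, axis=1)
    log_values = np.log(abundance[keep])
    log_geo_mean = log_values.mean(axis=1, keepdims=True)
    return np.exp(np.median(log_values - log_geo_mean, axis=0))

# Simulated toy data (assumption): 500 genes, 6 samples with different depths.
rng = np.random.default_rng(0)
samples = ["WT1", "Drug1", "WT2", "Drug2", "Drug3", "Drug4"]
depth = np.array([1.0, 2.0, 0.5, 1.5, 1.0, 3.0])
base = rng.lognormal(mean=5.0, sigma=1.0, size=(500, 1))
abundance = base * depth * rng.lognormal(mean=0.0, sigma=0.1, size=(500, 6))

size_factors = median_of_ratios_size_factors(abundance)
normalized = abundance / size_factors

# Per-sample medians on the log2 scale, i.e. the centre line of each boxplot.
medians_before = np.median(np.log2(abundance + 1), axis=0)
medians_after = np.median(np.log2(normalized + 1), axis=0)

for name, before, after in zip(samples, medians_before, medians_after):
    print(f"{name}: median log2 before={before:.2f}, after={after:.2f}")
```

In this toy case the per-sample medians line up closely after scaling, but small residual differences remain because the factors are estimated from noisy data. What I cannot judge is how large a residual difference is acceptable when the samples come from different labs.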
Any suggestions and help are greatly appreciated.
Thank you! Urja