3 months ago
1215045934 ▴ 80
I used $TRINITY_HOME/util/align_and_estimate_abundance.pl from Trinity to do transcript quantification for my RNA-seq data. Then I got the following outputs:
I would like to plot boxplots for several genes. Which one should I use, TMM or TPM?
I understand that TPM is not normalized across samples, but I have seen lots of people using it in their papers for boxplots. The author of Trinity also said "and the normalized expression values (FPKM or TPM) are used almost everywhere else, such as plotting in heatmaps." (from the link above). I am a bit confused.
Thanks in advance!
What is the purpose of your box plots? Will the x-axis be different genes, or different conditions?
To show how the particular gene I am interested in is expressed in different conditions. The x-axis will be the different conditions. I was wondering if I should use TPM or TMM for the y-axis.
I would probably use rlog- or vst-transformed counts (from the DESeq2 package; the functions that calculate these values also perform a normalisation). TMM works best when you have reason to believe that the average difference between conditions is zero, which is not the case here. You would probably get away with using TPM, but it wouldn't be strictly correct.
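To make the TPM caveat concrete, here is a minimal sketch of how TPM is computed within a single sample. The function name, the transcript lengths, and the toy counts are all made up for illustration; this is not Trinity's or RSEM's actual code. Because each sample is rescaled to sum to one million, a large change in one gene shifts the TPM of every other gene in that sample, even when their raw counts are identical.

```python
def tpm(counts, lengths_kb):
    """Convert one sample's raw counts to TPM (hypothetical helper).

    counts     -- raw read counts per transcript for a single sample
    lengths_kb -- transcript lengths in kilobases, same order as counts
    """
    # Length-normalise first: reads per kilobase for each transcript.
    rates = [c / l for c, l in zip(counts, lengths_kb)]
    # Then rescale so the sample sums to one million.
    total = sum(rates)
    return [r / total * 1e6 for r in rates]


# Toy data (assumed): three transcripts, two samples.
lengths_kb = [1.0, 2.0, 1.0]
sample_a = [100, 200, 100]
sample_b = [100, 200, 1000]  # only the third transcript changes

tpm_a = tpm(sample_a, lengths_kb)
tpm_b = tpm(sample_b, lengths_kb)

# The first transcript has the same raw count in both samples,
# yet its TPM is lower in sample_b because of the per-sample rescaling.
print(tpm_a[0], tpm_b[0])
```

This is the sense in which TPM is a within-sample measure: it is fine for a rough look, but the values are not directly comparable across samples whose overall composition differs.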
Thanks a lot! TMM is also normalized across samples. What is the difference between TMM and the results from rlog/vst-transformed counts?
In my understanding, TMM uses the assumption that the mean log fold change between conditions is zero. That seems an odd choice when you have multiple conditions and aren't interested in doing differential expression. Rlog and vst aren't normalisations, but rather transformations aimed at stabilising the variance, so that more highly expressed genes/samples do not have a higher variance, which is useful when doing comparisons qualitatively or informally. But as part of the calculation process, a cross-sample normalisation is applied in the form of DESeq2's default normalisation.
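As a rough illustration of the cross-sample normalisation step mentioned above, here is a sketch of the median-of-ratios idea behind DESeq2's default size factors, followed by a plain log2 transform. Note the assumptions: DESeq2 is an R package (the real functions are estimateSizeFactors, rlog and vst), the Python below is only a toy re-implementation of the idea, the function names and counts are invented, and log2 with a pseudocount is a simple stand-in that does not stabilise variance the way rlog or vst do.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors (toy sketch, not DESeq2's code).

    counts -- list of genes, each a list of raw counts per sample
    """
    n_samples = len(counts[0])
    # Only genes with non-zero counts in every sample can be used,
    # because the geometric mean is computed on the log scale.
    usable = [row for row in counts if all(c > 0 for c in row)]
    # Log of the per-gene geometric mean across samples.
    log_geo = [sum(math.log(c) for c in row) / n_samples for row in usable]
    factors = []
    for j in range(n_samples):
        # Ratio of each gene in sample j to that gene's geometric mean.
        log_ratios = [math.log(row[j]) - g for row, g in zip(usable, log_geo)]
        # The median ratio is the sample's size factor.
        factors.append(math.exp(median(log_ratios)))
    return factors


def normalized_log2(counts, pseudocount=1.0):
    """Divide by size factors, then take log2(x + pseudocount)."""
    sf = size_factors(counts)
    return [[math.log2(c / s + pseudocount) for c, s in zip(row, sf)]
            for row in counts]


# Toy data (assumed): sample 2 is sequenced twice as deeply as sample 1,
# and only the last gene is genuinely different between the samples.
counts = [
    [10, 20],
    [100, 200],
    [50, 100],
    [5, 500],
]
sf = size_factors(counts)
norm = normalized_log2(counts)
print(sf)
print(norm)
```

After dividing by the size factors, the first three genes come out equal across the two samples while the last one still differs, which is the kind of value you would then put on the y-axis of a per-gene boxplot.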