I have two questions about using TPM (transcripts per million). I've read some papers on the calculation, plus some blog and forum posts, so I have some understanding of what it is. The actual analysis for this experiment was done with raw counts and VST expression values, and I'm basically just having a look at TPM out of interest.
1. Is it valid to calculate TPM from DESeq2's normalised counts, i.e. counts(dds, normalized = TRUE), or do I have to use the raw, un-normalised counts? I tried both and there didn't seem to be a great deal of difference (in fact, in either case the TPM values aren't that different from the normalised counts themselves for the genes I've looked at), but I haven't tested it thoroughly.
2. I understand why one shouldn't compare TPM between samples, since the total expression rate, rRNA component etc. vary from sample to sample. Would this be less of a problem when data from three biological replicates are available?
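For what it's worth, here is a minimal sketch of the calculation behind question 1, written in plain Python rather than R so it is self-contained. The counts, gene lengths and size factor are invented for illustration, and it assumes the normalised counts are simply the raw counts divided by one size factor per sample (as with DESeq2's default size factors, not gene-specific normalisation factors). Under that assumption the size factor cancels out of the TPM formula, which would explain why both inputs gave me near-identical values.

```python
def tpm(counts, lengths_kb):
    """Transcripts per million for a single sample.

    counts     : read counts per gene
    lengths_kb : gene (or effective transcript) lengths in kilobases
    """
    # Length-normalise first: reads per kilobase for each gene
    rates = [c / l for c, l in zip(counts, lengths_kb)]
    # Then scale so the values in this sample sum to one million
    total = sum(rates)
    return [r / total * 1e6 for r in rates]


# Toy values, invented for illustration only (not real data)
raw_counts = [100, 500, 50, 1200]
lengths_kb = [1.0, 2.5, 0.5, 4.0]

# Hypothetical per-sample size factor of the kind DESeq2 estimates
size_factor = 1.3
normalized_counts = [c / size_factor for c in raw_counts]

tpm_raw = tpm(raw_counts, lengths_kb)
tpm_norm = tpm(normalized_counts, lengths_kb)

# Largest per-gene difference between the two TPM vectors
max_diff = max(abs(a - b) for a, b in zip(tpm_raw, tpm_norm))

print("TPM from raw counts:       ", tpm_raw)
print("TPM from normalised counts:", tpm_norm)
print("Largest difference:        ", max_diff)
```

Any remaining difference here should be floating-point noise only. This says nothing about question 2: each sample is still rescaled to sum to one million on its own, whichever counts go in.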
Thanks for reading and have a nice Friday,