2.8 years ago
gradstudentNew ▴ 50
I normally use DESeq2 to normalize data for analyses such as regressions, but I only have access to a TPM dataset (no raw counts available).
Does it make sense to use TPM for correlation between two genes? Or should I quantile normalize first?
Yes, I'm aware. To clarify, I meant that I would normally normalize my dataset with DESeq2, but because my dataset is TPM I cannot do that, so I was wondering whether TPM is appropriate for correlation between two genes.
If you are looking for linear correlation such as Pearson, then it should not matter too much which normalization you use, since all of these linear methods (per-million, RLE, TMM...) scale each sample linearly by a single factor. Quantile normalization is a different story: QN forces the distributions of all samples to be identical, and this obviously changes the correlations between samples. It would of course be better to have a method such as tximport in your pipeline, which corrects for the relative gene length depending on the isoforms that are being expressed, but if you only have TPM then you do not have that choice.
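To make the difference between per-sample scaling and quantile normalization concrete, here is a minimal sketch in Python with NumPy. Everything in it is an assumption for illustration: the matrix `tpm` is simulated rather than real data, the `size_factors` are made up, and `quantile_normalize` is a simplified hand-written helper (ties broken by order of appearance), not the implementation from any particular package. It compares sample-to-sample Pearson correlations before and after each transformation, and also prints the correlation between two genes across samples, which per-sample scaling preserves only approximately rather than exactly.

```python
import numpy as np


def quantile_normalize(mat):
    """Quantile-normalize the columns (samples) of a genes x samples matrix.

    Simplified illustration: ties are broken by order of appearance.
    """
    order = np.argsort(mat, axis=0)                   # sort order within each sample
    ranks = np.argsort(order, axis=0)                 # rank of each gene within its sample
    mean_sorted = np.sort(mat, axis=0).mean(axis=1)   # shared reference distribution
    return mean_sorted[ranks]                         # map every rank to the reference value


def sample_corr(mat):
    """Pearson correlation between samples (columns)."""
    return np.corrcoef(mat, rowvar=False)


rng = np.random.default_rng(0)

# Simulated stand-in for a TPM matrix: 1000 genes x 6 samples (not real data).
tpm = rng.lognormal(mean=2.0, sigma=1.5, size=(1000, 6))

# Hypothetical per-sample scaling factors, one per sample, mimicking what a
# linear method (per-million, RLE, TMM) would apply.
size_factors = np.array([0.5, 0.8, 1.0, 1.3, 2.0, 3.0])
scaled = tpm / size_factors

qn = quantile_normalize(tpm)

r_raw = sample_corr(tpm)
r_scaled = sample_corr(scaled)
r_qn = sample_corr(qn)

# Scaling a sample by one positive factor leaves sample-sample Pearson unchanged.
print("scaling preserves sample correlations:", np.allclose(r_raw, r_scaled))

# Quantile normalization reshapes every sample's distribution, so these differ.
print("max change after QN:", np.abs(r_raw - r_qn).max())

# Gene-gene correlation across samples (the original question), before and
# after per-sample scaling; here the factors differ between samples, so the
# two values are not guaranteed to be identical.
r_genes_raw = np.corrcoef(tpm[0], tpm[1])[0, 1]
r_genes_scaled = np.corrcoef(scaled[0], scaled[1])[0, 1]
print("gene 0 vs gene 1, raw:", r_genes_raw)
print("gene 0 vs gene 1, scaled:", r_genes_scaled)
```

On real data the scaling factors are usually much closer to each other than the exaggerated ones used here, so the gene-gene values tend to move less; a log transform such as `np.log2(tpm + 1)` or a rank-based correlation is a common additional choice, but that is a separate decision from the normalization itself.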