I have a clinical bulk RNA-seq dataset with 3 different conditions/groups. If I follow a standard workflow of centering and scaling the raw counts before dimensionality reduction (PCA and t-SNE), I get a messy plot in which the patients do not separate by group. But when I log-transform the data first, the groups become distinct and tight.
Is this an artifact of my data? Is it generally better to log-transform raw count data prior to scaling for dimensionality reduction?
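
To make the comparison concrete, here is a minimal sketch of the two pipelines on simulated data. Everything in it is an assumption for illustration only: the counts are simulated from a negative binomial rather than taken from the real dataset, the helper names `log_cpm` and `pca_scores` are hypothetical, the log step uses counts per million with a pseudocount of 1, and PCA is done with a plain NumPy SVD instead of a dedicated package.

```python
import numpy as np


def log_cpm(counts, pseudocount=1.0):
    """Normalize each sample to counts per million, then log2-transform.

    Hypothetical helper; the pseudocount of 1 is an assumed choice.
    """
    counts = np.asarray(counts, dtype=float)
    lib_size = counts.sum(axis=1, keepdims=True)  # total counts per sample
    cpm = counts / lib_size * 1e6
    return np.log2(cpm + pseudocount)


def pca_scores(X, n_components=2):
    """Center and scale each gene (column), then return sample scores on the top PCs."""
    X = np.asarray(X, dtype=float)
    X = X - X.mean(axis=0)
    sd = X.std(axis=0, ddof=1)
    keep = sd > 0  # constant genes cannot be scaled, so drop them
    Z = X[:, keep] / sd[keep]
    U, S, _ = np.linalg.svd(Z, full_matrices=False)
    return U[:, :n_components] * S[:n_components]


# Simulated stand-in for the real data: 3 groups of 10 samples, 500 genes.
rng = np.random.default_rng(0)
n_per_group, n_genes = 10, 500
groups = np.repeat([0, 1, 2], n_per_group)

base = rng.lognormal(mean=3.0, sigma=1.5, size=n_genes)  # baseline expression
fold = np.ones((3, n_genes))
fold[1, :50] = 4.0      # group 1: first 50 genes up 4-fold
fold[2, 50:100] = 4.0   # group 2: next 50 genes up 4-fold
mu = base * fold[groups]

# Negative binomial counts with mean mu and dispersion parameter n=5.
counts = rng.negative_binomial(n=5, p=5 / (5 + mu))

# Pipeline A: center/scale the raw counts, then PCA.
raw_scores = pca_scores(counts)

# Pipeline B: log-transform first, then center/scale, then PCA.
log_scores = pca_scores(log_cpm(counts))
```

Plotting the two columns of `raw_scores` and of `log_scores`, colored by `groups`, gives the side-by-side comparison described above. How cleanly the groups separate in either version depends on the simulation settings, so this only shows the order of operations, not an expected result.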