PCA transformation on Z-scored data
3.8 years ago
camillab. ▴ 160

Hi,

I have converted several bulk RNA-seq datasets into Z-scores and I want to compare them. Usually, for PCA, I log-transform the counts with a pseudocount of 1, i.e. log2(n + 1), and then set center=TRUE and scale=FALSE. If I want to compute a PCA on the genes shared across the different datasets (which have already been converted into Z-scores), do I have to log-transform and set scale=FALSE? Since Z-scores are already standardized, I would not log-transform and would keep scale=FALSE. Is that correct?

Thank you,

Camilla

bulkRNAseq PCA scale logTRANSFORM • 3.1k views

It depends on your downstream analysis. You should also think carefully about whether to convert to Z-scores within each dataset separately or on the combined datasets.
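To make the difference concrete, here is a minimal R sketch under stated assumptions: the two matrices `ds_a` and `ds_b` are simulated, the helper `zscore_rows()` is hypothetical, and Z-scores are assumed to be computed per gene across samples. It only illustrates that the two choices give different values when the datasets sit at different expression levels.

```r
set.seed(3)

# Two simulated datasets sharing the same 30 genes (rows); dataset B has a
# higher overall expression level, mimicking a batch difference.
genes <- paste0("gene", 1:30)
ds_a <- matrix(rnorm(30 * 4, mean = 5), nrow = 30,
               dimnames = list(genes, paste0("A", 1:4)))
ds_b <- matrix(rnorm(30 * 5, mean = 8), nrow = 30,
               dimnames = list(genes, paste0("B", 1:5)))

# Hypothetical helper: Z-score each gene (row) across samples.
# scale() works column-wise, hence the two transposes.
zscore_rows <- function(m) t(scale(t(m)))

# Option 1: Z-score within each dataset, then combine.
z_separate <- cbind(zscore_rows(ds_a), zscore_rows(ds_b))

# Option 2: combine first, then Z-score across all samples.
z_combined <- zscore_rows(cbind(ds_a, ds_b))

# Per-gene means within dataset A: zero only for the separate version.
mean_sep  <- max(abs(rowMeans(z_separate[, colnames(ds_a)])))
mean_comb <- max(abs(rowMeans(z_combined[, colnames(ds_a)])))
print(c(separate = mean_sep, combined = mean_comb))
```

Z-scoring separately removes the per-dataset offset for every gene, whereas Z-scoring the combined matrix keeps it, so a PCA on the combined version will tend to separate samples by dataset.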


Indeed.

3.8 years ago

Hey again,

If your data is already on Z-scale, I would set center = FALSE and scale = FALSE, and I would not apply any additional log transformation.
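As a rough sketch of what this looks like in R, assuming the PCA is run with `prcomp()` (the thread does not name the function), that Z-scores were computed per gene across samples, and using a simulated matrix `expr` in place of real data:

```r
set.seed(1)

# Simulated log-scale expression matrix: 50 genes (rows) x 8 samples (columns).
expr <- matrix(rnorm(50 * 8, mean = 5, sd = 2), nrow = 50,
               dimnames = list(paste0("gene", 1:50), paste0("sample", 1:8)))

# Z-score each gene across samples; scale() works column-wise, hence t().
z <- t(scale(t(expr)))

# prcomp() expects samples in rows and variables (genes) in columns.
# Note that the argument is spelled `scale.` in prcomp().
pca_raw      <- prcomp(t(z), center = FALSE, scale. = FALSE)
pca_centered <- prcomp(t(z), center = TRUE,  scale. = FALSE)

# The genes already have mean 0, so centering again changes nothing.
max_diff <- max(abs(pca_raw$sdev - pca_centered$sdev))
print(max_diff)
```

If the Z-scores were instead computed in some other way (for example per sample, or on a larger gene set before subsetting), the columns passed to `prcomp()` may no longer have mean zero, and `center = TRUE` would then make a difference.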

As you are looking at data across datasets, even on Z-scale there will likely be some batch effect(s).

Note that in metabolomics studies, for example, we typically apply a natural log (log[e]) transformation and then Z-scale the normalised data.

Kevin


Thank you! So for the scale, I guess my idea was correct, but why would you set center = FALSE? Doesn't centering just shift the variable means to zero, and how would this affect or interfere with the PCA if performed on Z-scores?


I had assumed that the data is already centered from the initial transformation. You could check via mean(), hist(), and summary() (?)
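One possible way to run that check, sketched in R on a simulated Z-scored matrix `z` (genes in rows, samples in columns; all object names here are illustrative):

```r
set.seed(2)

# Simulated stand-in for a Z-scored matrix: 20 genes (rows) x 6 samples.
expr <- matrix(rnorm(20 * 6, mean = 5, sd = 2), nrow = 20)
z <- t(scale(t(expr)))  # per-gene Z-scores across samples

# Per-gene mean and standard deviation; expect roughly 0 and 1.
gene_means <- rowMeans(z)
gene_sds   <- apply(z, 1, sd)
print(summary(gene_means))
print(summary(gene_sds))

# Overall distribution of the Z-scores.
hist(z, breaks = 30, main = "Distribution of Z-scores", xlab = "Z-score")

# TRUE if every gene is centered at zero within numerical tolerance.
already_centered <- all(abs(gene_means) < 1e-8)
print(already_centered)
```

If `already_centered` is `FALSE` on the real data (for instance after subsetting samples), setting `center = TRUE` in the PCA is the safer choice.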
