Question

Can I normalize my data twice?

7.4 years ago

iside ▴ 20

Dear all,

I am working on a dataset of 700 samples and 50,000 genes that has been rank normalized. For my purposes, I calculated the Pearson residuals of each gene, including covariates and technical confounders in the model. The resulting data are not normally distributed, though. Bear in mind that I checked normality for each gene with the Shapiro test, which is very sensitive on large datasets (do you think 700 observations counts as a large dataset in this case?) and can detect deviations from normality that do not actually influence the results. The data might therefore be fine after all. I was wondering whether it is advisable to normalize the data again, or whether that is unnecessary. I searched the internet for examples or an explanation of using a second normalization step, but I could not find anything useful.
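Because a significance test on 700 observations can flag deviations that are too small to matter, it may help to look at the size of the deviation as well as the p-value. The snippet below is a minimal sketch, not the method used in this thread: it computes sample skewness and excess kurtosis for one gene's residuals using only the Python standard library. The function name and the simulated residuals are illustrative assumptions.

```python
import random
from statistics import fmean


def skewness_and_excess_kurtosis(values):
    """Return (skewness, excess kurtosis) from population central moments.

    Both are close to 0 for normally distributed data. The input must not
    be constant, otherwise the variance is 0 and the ratios are undefined.
    """
    mean = fmean(values)
    m2 = fmean([(v - mean) ** 2 for v in values])  # variance
    m3 = fmean([(v - mean) ** 3 for v in values])  # third central moment
    m4 = fmean([(v - mean) ** 4 for v in values])  # fourth central moment
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0


# Hypothetical residuals for a single gene across 700 samples.
random.seed(0)
residuals = [random.gauss(0.0, 1.0) for _ in range(700)]

skew, kurt = skewness_and_excess_kurtosis(residuals)
print(f"skewness={skew:.3f}, excess kurtosis={kurt:.3f}")
```

Values near zero suggest the shape is close to normal even when a formal test rejects. How large a deviation is acceptable depends on the downstream analysis.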

I would really appreciate any answer and comment on this.

Best Wishes

Normalization Data analysis Normal distribution • 5.2k views

Rank normalization itself is very stringent, so it should take care of everything. (By rank normalization, I am assuming every gene in a sample is forced to a value between 0 and 1.)

7.4 years ago by Ron ★ 1.2k

Hi Ron,

Thanks for your comment! Actually, the values are not between 0 and 1 but between -3 and +3. I am not sure exactly how the normalization was done, as I received the file as it is.

7.4 years ago by iside ▴ 20
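A range of roughly -3 to +3 with 700 samples would be consistent with a rank-based inverse normal transform, although the thread does not confirm how the file was produced. The sketch below shows that transform under stated assumptions: the function name is hypothetical, and the (rank - 0.5) / n convention is only one of several in use. It relies only on the Python standard library.

```python
from statistics import NormalDist


def rank_inverse_normal(values, offset=0.5):
    """Map values to standard normal quantiles via their ranks.

    Tied values share the average of their ranks. Uses the
    (rank - offset) / n convention, which is an assumption here.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Extend j to cover the whole group of values tied with position i.
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    std_normal = NormalDist()
    return [std_normal.inv_cdf((r - offset) / n) for r in ranks]


# Hypothetical expression values for one gene across 700 samples.
expression = [float(x * x) for x in range(700)]
transformed = rank_inverse_normal(expression)
print(min(transformed), max(transformed))
```

With 700 distinct values, the extremes land slightly beyond 3 in absolute value, which is in line with the range reported above. If the file was produced this way, each gene is already normally distributed by construction before any residuals are computed.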