Example:
Vector A: -2, -1, 0, 3, 5, with mean(A) = 1 and sd(A) = 2.915476.
I want the z-scores of A, which should be scale(A) = -1.0289915 -0.6859943 -0.3429972 0.6859943 1.3719887. However, I know that 0 in the original A should also be 0 in the scaled vector, because it represents no change.
C=scale(A, center=F)
C will be -0.6405126 -0.3202563 0.0000000 0.9607689 1.6012815
But mean(C) and sd(C) are 0.3202563 and 0.9336996, respectively, so this is not a z-score.
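For reference, the numbers above can be reproduced with the short sketch below. It also shows why C is not a z-score: with center = FALSE, scale() divides by the root mean square, sqrt(sum(A^2) / (n - 1)), rather than by sd(A). The variable names z and rms are only illustrative.

```r
A <- c(-2, -1, 0, 3, 5)

z <- as.vector(scale(A))                  # centered, then divided by sd(A)
C <- as.vector(scale(A, center = FALSE))  # not centered; divided by the root mean square

# The divisor scale() uses when center = FALSE
rms <- sqrt(sum(A^2) / (length(A) - 1))
all.equal(C, A / rms)                     # TRUE

c(mean(z), sd(z))                         # 0 and 1, up to floating-point error
c(mean(C), sd(C))                         # about 0.3203 and 0.9337
```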
In other words, I need 0 to remain 0 in the scaled vector D, with mean(D) = 0 and sd(D) = 1. Thank you.
Update:
The ultimate goal is to make the data follow a normal distribution. The original data is the amplitude matrix (http://www.connectivitymap.org/cmap/help_topics_frames.jsp, use Ctrl+F to search for "amplitude"). Thank you.
BTW, LINCS uses another method (a z-scoring procedure) (http://support.lincscloud.org/hc/en-us/articles/202099616-Signature-Generation-and-Analysis-L1000-). I'm not sure whether it is related.
When we speak of a z-score, the mean is moved to zero. Because you want one data point to stay unmoved, you cannot use a z-score. Please explain what you want to use the scaled values for so that we can suggest alternative solutions.
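A small sketch of that point, using the vector A from the question (the names z0 and k are only illustrative): a z-score sends 0 to -mean(A) / sd(A), and any uniform rescaling that keeps 0 fixed has mean k * mean(A), so both conditions can hold only if mean(A) is already 0.

```r
A <- c(-2, -1, 0, 3, 5)

# Where a z-score sends the value 0: (0 - mean(A)) / sd(A)
z0 <- (0 - mean(A)) / sd(A)
z0                          # about -0.343, the third element of scale(A)

# A uniform rescaling k * A keeps 0 at 0, but its mean is k * mean(A),
# which is nonzero whenever mean(A) is nonzero.
k <- 1 / sd(A)
mean(k * A)                 # about 0.343
```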
@karl.stamm See update, please. Thank you.
What karl.stamm said, and I'll add that you would need to scale asymmetrically for this to work, which is highly questionable.
@DevonRyan The value represents the extent of change: positive values mean up-regulated and negative values mean down-regulated. So I think asymmetric scaling will be OK, though the number of DE genes passing a cutoff will change and only the most significant ones will be kept as DEGs. But how can I do this kind of asymmetric scaling?
I'm not entirely sure that there is a nice programmatic way. Ideally you'd just want to apply a nice function, but I've never seen such a thing for this purpose.
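For what it's worth, one possible sketch of an asymmetric scaling is below. It is not an established method and nobody in this thread has endorsed it. The helper name scale_asym is hypothetical. It multiplies negative values by one factor (a) and positive values by another (b), choosing the two so that 0 stays 0, the mean is 0, and the sample sd is 1. Note that this changes the relative size of up- and down-regulation, and it does not make the data normally distributed.

```r
# Hypothetical helper (not from any package): piecewise-linear scaling that
# keeps 0 fixed. Negative values are multiplied by a, positive values by b.
scale_asym <- function(x) {
  neg <- x < 0
  pos <- x > 0
  stopifnot(any(neg), any(pos))          # needs values on both sides of 0

  # Mean-zero condition: a * sum(neg values) + b * sum(pos values) = 0,
  # which fixes the ratio a / b.
  ratio <- sum(x[pos]) / -sum(x[neg])

  # With mean 0, sd = 1 is equivalent to sum(D^2) = n - 1.
  b <- sqrt((length(x) - 1) / (ratio^2 * sum(x[neg]^2) + sum(x[pos]^2)))
  a <- ratio * b

  ifelse(neg, a * x, b * x)              # zeros stay exactly 0
}

A <- c(-2, -1, 0, 3, 5)
D <- scale_asym(A)
D         # third element is exactly 0
mean(D)   # 0 up to floating-point error
sd(D)     # 1 up to floating-point error
```

Whether such a transformation is meaningful for the amplitude data is a separate question from whether it can be computed.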