Question: NaN in Z.score data from TCGA
10 months ago, olive1212 wrote:

I have downloaded RNA-Seq expression z-scores from TCGA datasets on cBP. Some genes have a z-score value of NaN. Does that mean that the expression level of that gene was 0, or does it mean something else? I couldn't find the answer online. Thanks for your help!

10 months ago, German.M.Demidov wrote:

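By default, a z-score is computed by centering (subtracting the mean) and then dividing by the standard deviation. My guess would be that the standard deviation was 0, so the division is 0/0. That happens when the expression level is 0 in every sample. It can also happen in other situations (any gene whose expression is exactly constant across samples), but anything other than all zeros looks unrealistic.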
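A minimal sketch of that calculation done by hand in R, to show where the NaN comes from. The three vectors and the `zscore` helper are made up for illustration and are not taken from the TCGA data:

```r
# Hypothetical expression values for three genes across four samples
varying  <- c(2, 4, 6, 8)
all_zero <- c(0, 0, 0, 0)
constant <- c(4, 4, 4, 4)

# z-score by hand: centre on the mean, then divide by the standard deviation
zscore <- function(v) (v - mean(v)) / sd(v)

sd(all_zero)      # standard deviation of an all-zero gene is 0
zscore(varying)   # finite values
zscore(all_zero)  # all NaN, because 0 / 0 is NaN in R
zscore(constant)  # all NaN as well: the centred values are 0 and so is sd()
```

Both the all-zero and the constant gene end up as 0/0, so the NaN by itself cannot tell the two cases apart.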


Indeed, a value of 0 can be transformed to anything on the Z-scale, as 0 is still useful information. For example, if we calculate Z-scores using the global mean and standard deviation of the whole matrix, the 0 in row 1 becomes a finite value:

x
     col1 col2
[1,]    0  435
[2,]    5  346
[3,]    4   65
[4,]    4    3

(x - mean(x)) / sd(x)
           col1       col2
[1,] -0.6073070  1.8444661
[2,] -0.5791257  1.3428389
[3,] -0.5847620 -0.2409501
[4,] -0.5847620 -0.5903982

As kuckunniwid implies, there are other reasons why NaN could have been produced, most likely constant expression values, i.e., zero variance. Here, we Z-transform by row in a case where row 1 is all zeros, while row 4 has a constant expression of 4:

x
     col1 col2
[1,]    0    0
[2,]    5  346
[3,]    4   65
[4,]    4    4

t(scale(t(x)))
           col1      col2
[1,]        NaN       NaN
[2,] -0.7071068 0.7071068
[3,] -0.7071068 0.7071068
[4,]        NaN       NaN
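
To tell which of the two cases a NaN row falls into, one option is to check the per-row standard deviation before scaling. This is only a sketch on the same toy matrix; the variable names `row_sd`, `zero_var` and `all_zero` are made up here, and the exact comparison with 0 assumes the values in a constant row are exactly identical:

```r
# Same toy matrix as above: rows are genes, columns are samples
x <- matrix(c(0, 5, 4, 4,
              0, 346, 65, 4),
            ncol = 2,
            dimnames = list(NULL, c("col1", "col2")))

# Per-row standard deviation; 0 means scale() will return NaN for that row
row_sd <- apply(x, 1, sd)

zero_var <- row_sd == 0            # rows with no variation at all
all_zero <- rowSums(x != 0) == 0   # rows where every value is 0

# Constant but non-zero rows also give NaN, so NaN alone does not imply 0
data.frame(row = seq_len(nrow(x)), row_sd, zero_var, all_zero)

# Z-scores for the rows that can actually be scaled
z <- t(scale(t(x[!zero_var, , drop = FALSE])))
```

Rows 1 and 4 are both flagged as zero variance, but only row 1 is flagged as all zeros, so going back to the un-scaled expression values is the way to answer the original question for a given gene.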
Reply written 10 months ago by Kevin Blighe