Count the number of times a value appears in a correlation matrix
7 weeks ago
ali ▴ 20

Hi everyone, I may not be very good with R, but I really don't understand this: I have a 2590 x 2590 gene correlation matrix with values between -1.0 and 1.0 and a diagonal of all 1.0s, which is easily checked with the following call (it returns 2590):

sum(diag(cor.matrix))


Now when I do further analysis, such as a Fisher transformation, I get 1640 infinite values (because of the division by zero). This means it only sees 1640 1.0s instead of 2590. In fact, if I do the following:

sum(cor.matrix==1)


it gets it wrong and tells me there are only 1640, and that's a huge problem. How is it possible that I get 2590 1.0s, but another function sees only 1640? And when I run:

table(unlist(cor.matrix))


I actually get that 2572 values are 1.0 and 12 are 0.99999, but I still don't understand where 1640 comes from.
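To make the link between the infinite values and the exact 1.0s concrete, here is a minimal sketch. It assumes the Fisher transformation is computed as atanh(r), i.e. 0.5 * log((1 + r) / (1 - r)); the code actually used is not shown in this post, and the toy vector r is made up for illustration.

```r
# Toy correlations (hypothetical values, not taken from the real matrix)
r <- c(1, 1 - .Machine$double.eps, 0.5, -0.3)

# Fisher z-transformation, assumed here to be atanh(r)
z <- atanh(r)

# Only an entry that is exactly 1 becomes infinite;
# a value one rounding step below 1 stays finite
n_inf   <- sum(is.infinite(z))
n_exact <- sum(r == 1)
print(c(infinite = n_inf, exactly_one = n_exact))
```

If the two counts agree like this on the real matrix, then the 1640 infinite values and the 1640 returned by sum(cor.matrix==1) are the same observation seen twice.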


Please show the output of str(cor.matrix). Also, how did you produce cor.matrix?


cor.matrix was generated with cor(t(data)), where data has genes as rows and samples as columns. Below is the output of str(cor.matrix):

>  str(cor.matrix)
num [1:2590, 1:2590] 0 0.6554 0.0337 0.0265 -0.3639 ...
- attr(*, "dimnames")=List of 2
..$ : chr [1:2590] "ENSG00000104763" "ENSG00000023697" "ENSG00000108848" "ENSG00000159140" ...
..$ : chr [1:2590] "ENSG00000104763" "ENSG00000023697" "ENSG00000108848" "ENSG00000159140" ...


I think you should explain your "Fisher transformation": which code did you use?


The Fisher transformation is applied afterwards, so it has nothing to do with this (my bad for mentioning it). My concern is why and how it is possible that table(unlist(cor.matrix)) tells me I have 2572 "1" and 12 "0.99" values (those on the diagonal), while sum(cor.matrix==1) gives me 1640.


Give us an example subset, e.g. dput(cor.matrix[1:10, 1:10]), and make sure it includes rows/columns with both 1s and 0.99s. By the way, unlist does nothing on a matrix, because a matrix is not a list.
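The remark about unlist can be checked on a small example. This is only a sketch on a made-up 2 x 2 matrix; the names m and same are hypothetical.

```r
# Small numeric matrix standing in for cor.matrix (values are made up)
m <- matrix(c(1, 0.5, 0.5, 1), nrow = 2)

# unlist() returns non-list input unchanged, so the result is the same matrix
same <- identical(unlist(m), m)
print(same)
print(is.list(m))

# table() can therefore be called on the matrix directly
print(table(m))
```

So table(cor.matrix) should give the same counts as table(unlist(cor.matrix)).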


Thank you for your reply. Actually there are no 0.99 values on the diagonal, as I checked with diag(), so here is a 10x10 subset:

> cor.matrix
ENSG00000112077 ENSG00000156381 ENSG00000104903 ENSG00000115457 ENSG00000206549 ENSG00000137267 ENSG00000185340 ENSG00000170323
ENSG00000112077     1.000000000     -0.19190013      -0.1966784     -0.13290891     0.002734867     -0.08534596     -0.17967079       0.4426268
ENSG00000156381    -0.191900129      1.00000000       0.9095771      0.51786695    -0.022457054     -0.16575985      0.93824489      -0.2583800
ENSG00000104903    -0.196678419      0.90957714       1.0000000      0.53893050    -0.023318702     -0.22419045      0.90572245      -0.2714970
ENSG00000115457    -0.132908908      0.51786695       0.5389305      1.00000000    -0.124795340     -0.04742099      0.46764793      -0.1230762
ENSG00000206549     0.002734867     -0.02245705      -0.0233187     -0.12479534     1.000000000     -0.17214213      0.06069934      -0.1768969
ENSG00000137267    -0.085345957     -0.16575985      -0.2241905     -0.04742099    -0.172142132      1.00000000     -0.14076607       0.1460344
ENSG00000185340    -0.179670795      0.93824489       0.9057225      0.46764793     0.060699339     -0.14076607      1.00000000      -0.2571831
ENSG00000170323     0.442626760     -0.25838000      -0.2714970     -0.12307625    -0.176896861      0.14603436     -0.25718315       1.0000000
ENSG00000101425     0.902741312     -0.16307465      -0.1615747     -0.02066323     0.043884391     -0.09102911     -0.16472355       0.4715138
ENSG00000103316     0.288357227     -0.37126758      -0.3712558     -0.39090657     0.177580855     -0.22241386     -0.38319491       0.1151637
ENSG00000101425 ENSG00000103316
ENSG00000112077      0.90274131       0.2883572
ENSG00000156381     -0.16307465      -0.3712676
ENSG00000104903     -0.16157472      -0.3712558
ENSG00000115457     -0.02066323      -0.3909066
ENSG00000206549      0.04388439       0.1775809
ENSG00000137267     -0.09102911      -0.2224139
ENSG00000185340     -0.16472355      -0.3831949
ENSG00000170323      0.47151380       0.1151637
ENSG00000101425      1.00000000       0.1689805
ENSG00000103316      0.16898054       1.0000000


Diagonal for easier visualization :

diag(cor.matrix) #diagonal
ENSG00000112077 ENSG00000156381 ENSG00000104903 ENSG00000115457 ENSG00000206549 ENSG00000137267 ENSG00000185340 ENSG00000170323 ENSG00000101425
1               1               1               1               1               1               1               1               1
ENSG00000103316
1


> ncol(cor.matrix) #length col
[1] 2590
> nrow(cor.matrix) #length row
[1] 2590
> summary(colnames(cor.matrix)==rownames(cor.matrix))
Mode    TRUE
logical    2590
> sum(diag(cor.matrix))
[1] 2590


sum(diag(cor.matrix)) returns 2590, which means that every diagonal element shows the number 1.000000, so I don't understand how it is possible that I retrieve 1640 from sum(cor.matrix==1).
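One possible explanation, which nothing in this thread confirms, is floating-point rounding: some diagonal entries returned by cor() might differ from 1 by roughly 1e-16. Such values still print as 1 and still sum to 2590, but they fail an exact == 1 test. The sketch below uses made-up values, and the tolerance 1e-8 is an arbitrary choice for illustration.

```r
# Hypothetical diagonal: one exact 1 and two values off by one rounding step
d <- c(1, 1 - 2^-53, 1 + 2^-52)

print(d)        # all three display as 1 at the default print precision
print(sum(d))   # displays as 3

n_exact <- sum(d == 1)               # exact comparison
n_close <- sum(abs(d - 1) < 1e-8)    # comparison with an (arbitrary) tolerance
print(c(exact = n_exact, close = n_close))
```

On the real data, comparing sum(diag(cor.matrix) == 1) with sum(abs(diag(cor.matrix) - 1) < 1e-8) would show whether this is what is happening.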


Please provide the output of dput(df) instead of printing the data frame; otherwise it is difficult for us to recreate your object.

> dput(cor.matrix)
structure(c(1, -0.191900129370209, -0.19667841893541, -0.132908907990152,
0.00273486675878516, -0.0853459574505116, -0.179670794954244,
0.442626760071414, 0.90274131194274, 0.288357226535195, -0.191900129370209,
1, 0.9095771369306, 0.517866950532572, -0.0224570543574054, -0.165759851161471,
0.938244885408117, -0.258380001065272, -0.163074651868828, -0.371267577550608,
-0.19667841893541, 0.9095771369306, 1, 0.538930500368628, -0.023318702025332,
-0.22419045064296, 0.905722451573056, -0.271497014948629, -0.161574724589497,
-0.371255805716392, -0.132908907990152, 0.517866950532572, 0.538930500368628,
1, -0.124795340475141, -0.0474209907792123, 0.467647928811358,
-0.123076247648689, -0.0206632281207981, -0.390906566893226,
0.00273486675878516, -0.0224570543574054, -0.023318702025332,
-0.124795340475141, 1, -0.172142131502174, 0.0606993394920817,
-0.176896861113167, 0.0438843914074129, 0.177580854744091, -0.0853459574505116,
-0.165759851161471, -0.22419045064296, -0.0474209907792123, -0.172142131502174,
1, -0.140766069991547, 0.14603436439314, -0.0910291108101076,
-0.222413862352161, -0.179670794954244, 0.938244885408117, 0.905722451573056,
0.467647928811358, 0.0606993394920817, -0.140766069991547, 1,
-0.257183149502409, -0.164723552108329, -0.383194913178383, 0.442626760071414,
-0.258380001065272, -0.271497014948629, -0.123076247648689, -0.176896861113167,
0.14603436439314, -0.257183149502409, 1, 0.471513802606508, 0.115163716714095,
0.90274131194274, -0.163074651868828, -0.161574724589497, -0.0206632281207981,
0.0438843914074129, -0.0910291108101076, -0.164723552108329,
0.471513802606508, 1, 0.168980542895039, 0.288357226535195, -0.371267577550608,
-0.371255805716392, -0.390906566893226, 0.177580854744091, -0.222413862352161,
-0.383194913178383, 0.115163716714095, 0.168980542895039, 1),
.Dim = c(10L, 10L),
.Dimnames = list(c("ENSG00000112077", "ENSG00000156381",
"ENSG00000104903", "ENSG00000115457", "ENSG00000206549", "ENSG00000137267",
"ENSG00000185340", "ENSG00000170323", "ENSG00000101425", "ENSG00000103316"
), c("ENSG00000112077", "ENSG00000156381", "ENSG00000104903",
"ENSG00000115457", "ENSG00000206549", "ENSG00000137267", "ENSG00000185340",
"ENSG00000170323", "ENSG00000101425", "ENSG00000103316")))


This example data gives 10 for both. Please provide a bigger example where this is not true, or upload the full file somewhere and share the link:

sum(diag(cor.matrix))
# [1] 10
sum(cor.matrix==1)
# [1] 10
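To build the bigger example requested here, it may help to first find which diagonal entries fail the exact test and print them at full precision. This is a sketch on a toy matrix: the names m and idx are hypothetical, the perturbed value is made up, and on the real data m would be cor.matrix.

```r
# Toy matrix standing in for cor.matrix (values are made up)
m <- diag(4)
m[2, 2] <- 1 - 2^-53          # prints as 1, but is not exactly 1
dimnames(m) <- list(paste0("g", 1:4), paste0("g", 1:4))

# Positions on the diagonal that are not exactly equal to 1
idx <- which(diag(m) != 1)
print(idx)

# Show the offending values with 17 significant digits
print(sprintf("%.17g", diag(m)[idx]))
```

Differences this small may be rounded away when the object is pasted as text, so sharing the subset m[idx, idx] as a file written with saveRDS() is probably safer than pasting it.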