TOM matrix generated by WGCNA package in R
2.3 years ago
Dude • 0

Hi everyone! I have a problem running the WGCNA code in R. The TOM matrix returned by the function "TOMsimilarityFromExpr" is filled with NA values. Why does this happen? I would appreciate it if anyone could help me with this. Thank you!! 🙏 The code and results are as follows:

>  dissTOM = 1-TOMsimilarityFromExpr(datExpr, power = 8);
> dissTOM[1:6,1:6]
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]    0   NA   NA   NA   NA   NA
[2,]   NA    0   NA   NA   NA   NA
[3,]   NA   NA    0   NA   NA   NA
[4,]   NA   NA   NA    0   NA   NA
[5,]   NA   NA   NA   NA    0   NA
[6,]   NA   NA   NA   NA   NA    0

> TOM = TOMsimilarityFromExpr(datExpr, power = 8)
> TOM[1:6,1:6]
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]    1   NA   NA   NA   NA   NA
[2,]   NA    1   NA   NA   NA   NA
[3,]   NA   NA    1   NA   NA   NA
[4,]   NA   NA   NA    1   NA   NA
[5,]   NA   NA   NA   NA    1   NA
[6,]   NA   NA   NA   NA   NA    1

WGCNA • 2.3k views

what is the output of dim(datExpr)?

It's a gene expression set.

> class(datExpr)
[1] "matrix" "array"
> dim(datExpr)
[1]    95 19206

> datExpr[1:4,1:4]
             ENSG00000000003 ENSG00000000005 ENSG00000000419 ENSG00000000457
TCGA-2Y-A9H4        39.95787     0.000000000        27.44848        1.323982
TCGA-5C-AAPD        26.10105     0.000000000        24.81812        1.587723
TCGA-BC-A10W        13.57470     0.009521119        27.08928        7.692339
TCGA-BW-A5NP        28.06425     0.022192741        18.36840        4.594120


Does this also happen with the function adjacency?

Without datExpr it is difficult to understand what is going on. Would you mind sharing the matrix? You can change the sample names.


Hello! I just tried the function "adjacency". It seems to take much longer than it did without adjacency (hours). I'm still not sure whether it's going to work.

>   adja <- adjacency(datExpr,power = 8)
>   dissTOM = 1-TOMsimilarityFromExpr(adja, power = 8);


The datExpr file is included in the link, thank you for your time and attention.

https://github.com/Datapioneer/QUESTION/tree/main


Thanks for the file. TOMsimilarityFromExpr doesn't take an adjacency matrix as input; it expects the expression data. For an adjacency matrix, use TOMsimilarity:

TOM = TOMsimilarity(adja)
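
For reference, the adjacency route can be sketched end to end on a small simulated matrix. This is only an illustration: `simExpr` and its dimensions are made up, and power 8 is simply the value used in this thread. It assumes the WGCNA package is installed.

```r
library(WGCNA)

# Simulated data for illustration only: 30 samples (rows) x 20 genes (columns).
set.seed(42)
simExpr <- matrix(rnorm(30 * 20), nrow = 30, ncol = 20)
colnames(simExpr) <- paste0("gene", 1:20)

# Step 1: adjacency from the expression data, soft-thresholding power 8.
adj <- adjacency(simExpr, power = 8)

# Step 2: TOM from the adjacency matrix (not from the expression data).
TOM <- TOMsimilarity(adj)
dissTOM <- 1 - TOM

dim(dissTOM)    # genes x genes
anyNA(dissTOM)  # check whether any NA/NaN entries were produced
```

Note that TOMsimilarity and TOMsimilarityFromExpr may use different default options, so the two routes are not guaranteed to give identical values unless the options are set to match.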

2.3 years ago

Apparently the NaN values in the TOM are introduced because you have 373 genes with too many zeros:

library(readr)   # read_csv
library(WGCNA)   # goodSamplesGenes, printFlush

datExpr0 <- read_csv("D:/Download/datExpr.csv")
datExpr0 <- data.frame(datExpr0, row.names = 1)

gsg = goodSamplesGenes(datExpr0, verbose = 3);
# Flagging genes and samples with too many missing values...
#  ..step 1
#  ..Excluding 373 genes from the calculation due to too many missing samples or zero variance.
#  ..step 2
gsg$allOK # [1] FALSE

# Remove offending genes
if (!gsg$allOK)
{
  # Optionally, print the gene and sample names that were removed:
  if (sum(!gsg$goodGenes)>0) printFlush(paste("Removing genes:", paste(names(datExpr0)[!gsg$goodGenes], collapse = ", ")));
  if (sum(!gsg$goodSamples)>0) printFlush(paste("Removing samples:", paste(rownames(datExpr0)[!gsg$goodSamples], collapse = ", ")));
  # Remove the offending genes and samples from the data:
  datExpr = datExpr0[gsg$goodSamples, gsg$goodGenes]
}


Calculate TOM

TOM = TOMsimilarityFromExpr(datExpr, power = 8)
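
As a rough illustration of why all-zero (zero-variance) genes are worth removing before any correlation-based step, here is a base R sketch on toy data. The matrix `toyExpr` and its gene names are made up, and this only shows the behaviour of `cor()`, not the internals of WGCNA.

```r
# Toy expression matrix: 6 samples (rows) x 4 genes (columns); values are made up.
set.seed(1)
toyExpr <- cbind(
  geneA    = rnorm(6, mean = 10),
  geneB    = rnorm(6, mean = 5),
  geneZero = rep(0, 6),            # never expressed: zero variance
  geneC    = rnorm(6, mean = 8)
)

# Pearson correlation with a constant column is undefined, so cor() returns NA
# for those pairs (the "standard deviation is zero" warning is suppressed here).
toyCor <- suppressWarnings(cor(toyExpr))
print(round(toyCor, 2))

# Flag zero-variance genes and drop them before building a network.
geneVar <- apply(toyExpr, 2, var)
toyFiltered <- toyExpr[, geneVar > 0, drop = FALSE]

# After filtering, the correlation matrix has no missing entries.
print(anyNA(cor(toyFiltered)))
```

goodSamplesGenes does a more thorough job (it also checks for missing values and bad samples), so this is only meant to show the idea.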


This is a follow-up on the NaN values in the TOM.

If you are working with a WGCNA version prior to 1.62 (see the change log), NaN values are introduced during the TOM calculation because of completely unconnected nodes. Removing genes with too many zeros across your samples makes things slightly better. In conclusion, the NaN values in the TOM seem to be a feature of your expression matrix.