WGCNA: scale-free topology fit indexes are negative
12 months ago
Erica

Hi all,

I am trying to run WGCNA on my RNA-seq data (29 samples). First, I obtained TPM values and applied a log2 transformation. After filtering out lowly expressed genes and genes that show only small changes in expression, I called the network topology analysis function. However, the scale-free topology fit indices are negative. I am really confused: why can R^2 be negative? What should I do to get the scale-free topology fit index above 0.8 or 0.9?

I also checked a PCA, and there appear to be no outlier samples.

Below are my input code and the output graph. Any ideas would be helpful. Thanks in advance.

## 1. Check distribution

library(CancerSubtypes)  # provides FSbyMAD() and data.checkDistribution()
tpm.log2.top5000 <- FSbyMAD(TPM.Log2, cut.type="topk", value=5000)
dim(tpm.log2.top5000)
## 5000    29
data.checkDistribution(tpm.log2.top5000)


## 2. WGCNA: soft-thresholding power

powers = c(c(1:10), seq(from = 12, to=20, by=2))
sft = pickSoftThreshold(tpm.log2.top5000, corFnc = "bicor", corOptions=list(maxPOutliers=0.05),
powerVector = powers, verbose = 5, networkType = "unsigned")

par(mfrow = c(1,2));
cex1 = 0.9;

# Scale-free topology fit index
plot(sft$fitIndices[,1], -sign(sft$fitIndices[,3])*sft$fitIndices[,2],
xlab="Soft Threshold (power)", ylab="Scale Free Topology Model Fit,signed R^2",
type="n", main = paste("Scale independence"))
text(sft$fitIndices[,1], -sign(sft$fitIndices[,3])*sft$fitIndices[,2],
labels=powers, cex=cex1, col="red")
abline(h=0.90,col="green")

# Mean connectivity
plot(sft$fitIndices[,1], sft$fitIndices[,5],
xlab="Soft Threshold (power)",ylab="Mean Connectivity", type="n",
main = paste("Mean connectivity"))
text(sft$fitIndices[,1], sft$fitIndices[,5], labels=powers, cex=cex1,col="red")


## 3. PCA

By the way, I also ran the same procedure on the 29 samples using the top 10,000 and top 15,000 most variable genes. The scale-free topology fit indices were negative there as well.

Very dumb question, but did you transpose tpm.log2.top5000 (i.e., so that the rows represent the samples and the columns represent the genes)?

I did not transpose tpm.log2.top5000. Yes, it is a very dumb question, but I am new to WGCNA and not familiar with the code. Thank you : )

we've all been there ;)
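
For anyone landing here with the same problem, the transposition discussed above can be sketched as follows. This is a minimal illustration, not the original poster's exact data: the matrix `expr_genes_by_samples` is simulated noise standing in for `tpm.log2.top5000`, and the gene count is reduced to 500 so it runs quickly. The `pickSoftThreshold` call mirrors the arguments from the question and only runs if WGCNA is installed. Note that the quantity plotted in the question is `-sign(slope) * R^2`, so it goes negative whenever the fitted slope is positive.

```r
# Simulated stand-in for tpm.log2.top5000 (an assumption for illustration):
# genes in rows, samples in columns, as returned by the filtering step.
set.seed(1)
n_genes <- 500
n_samples <- 29
expr_genes_by_samples <- matrix(
  rnorm(n_genes * n_samples),
  nrow = n_genes, ncol = n_samples,
  dimnames = list(paste0("gene", seq_len(n_genes)),
                  paste0("sample", seq_len(n_samples)))
)

# WGCNA expects samples in rows and genes in columns, so transpose first.
datExpr <- t(expr_genes_by_samples)

# Quick orientation check before building the network:
# there should be (many) more columns than rows here.
stopifnot(nrow(datExpr) == n_samples, ncol(datExpr) == n_genes)

# Re-run the soft-threshold scan on the transposed matrix (needs WGCNA).
if (requireNamespace("WGCNA", quietly = TRUE)) {
  library(WGCNA)
  powers <- c(1:10, seq(from = 12, to = 20, by = 2))
  sft <- pickSoftThreshold(datExpr, corFnc = "bicor",
                           corOptions = list(maxPOutliers = 0.05),
                           powerVector = powers, verbose = 0,
                           networkType = "unsigned")
  # Column 2 is R^2 and column 3 is the slope, as used in the plot above.
  signed_r2 <- -sign(sft$fitIndices[, 3]) * sft$fitIndices[, 2]
  print(data.frame(power = sft$fitIndices[, 1], signed_r2 = signed_r2))
}
```

With real data, replace the simulated matrix with `t(tpm.log2.top5000)`. Random noise like the simulated matrix here is not expected to give a good scale-free fit, so the printed values only show the mechanics.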