Question

WGCNA pickSoftThreshold problem

0

16 days ago

ton_of_questions • 0

Hello everyone!

I'm new to WGCNA and currently experiencing some problems...

I want to get a scale-free topology for my RNA-seq data, but the R2 of the scale-free fit ranges from -1 to 0.6. I don't understand how to interpret negative R2 values or how to pick a soft threshold in this case.

I normalized my data with vst, then picked genes with low CV and high variance, and ended up with ~2.5K genes. I transposed the data so that columns are genes and rows are samples.

I did a quality check and removed one outlier sample identified on the dendrogram.

Nevertheless, my R2 has the following profiles for the signed and unsigned networks [plots not preserved]. The first plot is for signed and the second is for unsigned.

Could anybody give me some tips on how to make this work? Thanks a lot in advance!

picksoftthreshold question problem wgcna • 492 views

ADD COMMENT • link updated 13 days ago by LChart 5.0k • written 16 days ago by ton_of_questions • 0

score 1 · Answer 1 · 2025-06-03

1

16 days ago

LChart 5.0k

2.5K genes sounds to me like over-filtering; and what was your rationale for removing variable (high CV) genes? My suggestion would be to only filter on detectability, typically a hard threshold on expression value or rank (and possibly remove outlier samples/genes).
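A minimal sketch of what a detectability-only filter could look like in base R, assuming a samples-by-genes matrix on a vst-like scale. The toy matrix `expr` and the cutoffs `min_expr`, `min_frac` and `top_n` are hypothetical values chosen for illustration, not recommendations from this thread.

```r
set.seed(1)

# Hypothetical toy data: 6 samples (rows) x 1000 genes (columns), vst-like scale
expr <- matrix(rnorm(6 * 1000, mean = 6, sd = 2), nrow = 6,
               dimnames = list(paste0("sample", 1:6), paste0("gene", 1:1000)))

# Option A: hard threshold on expression value.
# Keep a gene if it exceeds min_expr in at least min_frac of the samples.
min_expr <- 5    # assumed cutoff on the normalized scale
min_frac <- 0.5  # assumed minimum fraction of samples
detected <- colMeans(expr > min_expr) >= min_frac
expr_filtered <- expr[, detected, drop = FALSE]

# Option B: threshold on rank. Keep the top_n genes by mean expression.
top_n <- 500     # assumed number of genes to keep
keep_rank <- rank(-colMeans(expr)) <= top_n
expr_top <- expr[, keep_rank, drop = FALSE]

print(dim(expr_filtered))
print(dim(expr_top))
```

Neither option looks at variance or CV, so highly variable genes stay in the matrix as long as they are detectably expressed.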

ADD COMMENT • link 16 days ago by LChart 5.0k

0

Thank you for your reply! However, I don't really understand how to use a hard threshold. I plotted the raw correlations and kept the most correlated features, then ran a standard hard-threshold selection:

ht <- pickHardThreshold(filtered, RsquaredCut = 0.85, cutVector = seq(0.1, 0.9, by = 0.05), moreNetworkConcepts = FALSE, removeFirst = FALSE, nBreaks = 10, corFnc = "cor", corOptions = "use = 'p'")

I get this table [table not preserved], then I try sprintf("Optimal hard-power = %d", ht$cutEstimate)

Result --> "Optimal hard-power = NA"

How do I continue the analysis?

ADD REPLY • link 14 days ago by ton_of_questions • 0

0

Again, you seem to be over-filtering. Your mean.k and median.k are really low, so I think the number of input genes you used for this is very low. You're seeing negative truncated.R.2 values, which suggests something is very wrong: you don't have anything like a scale-free topology. You should take a look at the principal components of your data, because in terms of these metrics it doesn't look like a "normal" expression dataset. I suspect you're filtering in a non-standard way.
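Both checks can be roughed out in base R. The sketch below is a simplified illustration and not WGCNA's own implementation: the toy matrix `expr` and the helper `signed_sft_fit` are hypothetical, the adjacency is the unsigned form |cor|^power, and empty connectivity bins are simply dropped. It relies on two assumptions worth verifying against the WGCNA documentation: the usual tutorial plot shows -sign(slope) * R2, so a negative value there means the log-log fit has a positive slope (the opposite of a scale-free network), and the truncated fit is reported as an adjusted R2, which can fall below zero when the fit is poor.

```r
set.seed(7)

# Hypothetical toy data: rows = samples, columns = genes
expr <- matrix(rnorm(20 * 300), nrow = 20,
               dimnames = list(paste0("sample", 1:20), paste0("gene", 1:300)))

# 1) Principal components: how much variance do the leading PCs carry?
pca <- prcomp(expr, center = TRUE, scale. = FALSE)
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
print(round(head(var_explained, 5), 3))
if (interactive()) plot(pca$x[, 1:2], main = "Samples on PC1 vs PC2")

# 2) Simplified scale-free fit for one soft-threshold power
signed_sft_fit <- function(expr, power, n_breaks = 10) {
  adj <- abs(cor(expr))^power        # unsigned adjacency (assumption)
  diag(adj) <- 0
  k <- rowSums(adj)                  # connectivity of each gene
  bins <- cut(k, breaks = n_breaks)  # discretize connectivity
  p_k <- as.numeric(table(bins)) / length(k)
  mean_k <- as.numeric(tapply(k, bins, mean))
  keep <- p_k > 0                    # drop empty bins (simplification)
  x <- log10(mean_k[keep])
  y <- log10(p_k[keep])
  fit <- summary(lm(y ~ x))          # log10 p(k) against log10 k
  slope <- fit$coefficients["x", "Estimate"]
  c(signed_r2 = -sign(slope) * fit$r.squared,  # sign convention of the usual plot
    slope = slope,
    adj_r2 = fit$adj.r.squared,                # can be negative
    mean_k = mean(k))
}

fits <- t(sapply(c(1, 4, 8, 12), function(p) signed_sft_fit(expr, p)))
print(round(fits, 3))
```

On pure noise like this toy matrix there is no reason to expect a good fit at any power, and mean_k shrinks quickly as the power increases. Running the same two checks on the real matrix before and after filtering may show whether the filtering step is what distorts the connectivity distribution.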

ADD REPLY • link 13 days ago by LChart 5.0k