Question

WGCNA number of modules issue

13 months ago

1215045934 ▴ 80

Hi all!

I am working on RNA-seq data with three treatments: condition 1, condition 2, and condition 3.

There are 70k genes in my data. I would like to use WGCNA to find genes whose expression correlates with the conditions: in particular, genes that are strongly positively correlated with condition 1, have no or low correlation with condition 2, and are strongly negatively correlated with condition 3. Then I would like to see what functions those genes have, using KEGG pathway or GO enrichment.

I applied WGCNA to the genes with a total count greater than 10 (60k genes). I got about 80 modules and the dendrogram looks messy. I am new to WGCNA and I was wondering whether this is correct. I did find some modules with the pattern described above and really low p-values.
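
One way to express that pattern numerically, as a minimal sketch in base R: code each sample as +1, 0, or -1 by condition and correlate a module eigengene with that vector. The 9/9/8 split of the samples and the simulated eigengene are assumptions for illustration only; in a real analysis the eigengenes would come from WGCNA.

```r
# Hypothetical layout: 26 samples split 9/9/8 across the three conditions
condition <- factor(rep(c("condition1", "condition2", "condition3"),
                        times = c(9, 9, 8)))

# Numeric trait: +1 for condition 1, 0 for condition 2, -1 for condition 3
coding <- c(condition1 = 1, condition2 = 0, condition3 = -1)
pattern <- unname(coding[as.character(condition)])

# Simulated module eigengene that follows the pattern plus noise
# (a stand-in for a real eigengene, not real data)
set.seed(1)
eigengene <- pattern + rnorm(length(pattern), sd = 0.3)

# Pearson correlation between eigengene and trait, with its p-value
res <- cor.test(eigengene, pattern)
res$estimate
res$p.value
```

A module whose eigengene correlates strongly and positively with this vector is high in condition 1, intermediate in condition 2, and low in condition 3.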

Should I worry about the number of modules? If so, should I change any parameters to improve it?
I saw people using the top 10,000 most variable genes. I tried that and got 8 modules, and still found about 2k genes fitting the pattern.
Are 2000 genes in a module too many for gene enrichment?
What is the difference between a module that fits the pattern and the DEGs upregulated in condition 1 in the condition 1 vs. condition 2 comparison?

Thanks a lot!

networks gene WGCNA correlation • 1.5k views

ADD COMMENT • link 13 months ago by 1215045934 ▴ 80

How many samples do you have? You should also share the code you ran for each critical step.

ADD REPLY • link 13 months ago by rpolicastro 13k

26 samples. softPower = 6 was selected as the lowest power above the 0.9 line.

library(WGCNA)

# soft-thresholding power chosen from the scale-free fit plot
softPower = 6

# make blockwiseModules use WGCNA's cor(); keep the original so it can be restored
temp_cor <- cor
cor <- WGCNA::cor

#-----------------Block-wise network construction and module detection
netwk = blockwiseModules(datExpr, maxBlockSize = 8000,
                         power = softPower, TOMType = "signed", minModuleSize = 30,
                         reassignThreshold = 0, mergeCutHeight = 0.30,
                         numericLabels = TRUE,
                         saveTOMs = TRUE,
                         saveTOMFileBase = "FS-M-allgenes",
                         verbose = 3)

# restore the original cor()
cor <- temp_cor

Thanks!

ADD REPLY • link 13 months ago by 1215045934 ▴ 80

There is nothing wrong with that chunk of code. Therefore, the high number of modules is likely due to the strategy used to filter out low-count genes/transcripts. My suggestion is to work with the top N most variable genes.
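
A minimal sketch of that filtering step in base R, assuming a genes-by-samples matrix of normalized (e.g. log-transformed or variance-stabilized) expression values. The matrix `expr`, its size, and `top_n` are placeholders, not the poster's data.

```r
# Toy expression matrix: genes in rows, samples in columns
# (random placeholder standing in for real normalized data)
set.seed(42)
expr <- matrix(rnorm(500 * 26), nrow = 500,
               dimnames = list(paste0("gene", 1:500),
                               paste0("sample", 1:26)))

top_n <- 100  # placeholder; the thread mentions 10000 for real data

# Per-gene variance across samples, then the names of the top_n genes
gene_var <- apply(expr, 1, var)
n_keep <- min(top_n, nrow(expr))
top_genes <- names(sort(gene_var, decreasing = TRUE))[seq_len(n_keep)]

# WGCNA expects samples in rows and genes in columns, hence the transpose
datExpr <- t(expr[top_genes, , drop = FALSE])
dim(datExpr)
```

Ranking by variance on raw counts would mostly pick highly expressed genes, which is why the sketch assumes already-normalized values.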

One more thing: a soft power of 6 is actually pretty low for a signed network. When you run pickSoftThreshold you must specify networkType = "signed".
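
A hedged sketch of that call, assuming the WGCNA package is installed. The random `datExpr` and the candidate `powers` are placeholders; the 0.9 cutoff follows the threshold mentioned earlier in the thread. Since `networkType` defaults to "unsigned" in `pickSoftThreshold`, it has to be set explicitly.

```r
# Sketch only: runs when WGCNA is available; datExpr is random placeholder data
if (requireNamespace("WGCNA", quietly = TRUE)) {
  set.seed(1)
  datExpr <- matrix(rnorm(26 * 200), nrow = 26)  # samples x genes
  colnames(datExpr) <- paste0("gene", seq_len(ncol(datExpr)))

  # Candidate powers; signed networks usually need a wider range
  powers <- c(1:10, seq(12, 30, by = 2))

  sft <- WGCNA::pickSoftThreshold(datExpr,
                                  powerVector = powers,
                                  networkType = "signed",
                                  RsquaredCut = 0.9,
                                  verbose = 0)

  # Scale-free fit and mean connectivity for each candidate power
  print(sft$fitIndices[, c("Power", "SFT.R.sq", "mean.k.")])

  # Lowest power reaching the cutoff; NA when none does
  print(sft$powerEstimate)
}
```

The chosen power would then go into `blockwiseModules` together with a matching `networkType`, so that the threshold and the network are built under the same definition.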

ADD REPLY • link 13 months ago by andres.firrincieli 3.6k

Thanks! Yeah, I didn't specify networkType = "signed".

ADD REPLY • link 13 months ago by 1215045934 ▴ 80