Hello,
I have been working on WGCNA. I followed the tutorial and normalized the data using VST, with no filtering except what is recommended by the old FAQ, and ended up with 18K genes.
All of my modules are essentially one gene in the middle with 1000+ genes connected to it. Could it be that I'm doing something wrong?
What I'm trying to do is see whether certain genes regulate a group of other genes. I expected a lot of interconnected nodes, not a big network connected only to one gene.
Here is the code that I used to create the modules:
powers = c(1:20, seq(from = 22, to = 30, by = 2))
sft = pickSoftThreshold(normalized_reads_transposed, powerVector = powers, verbose = 5, networkType = 'signed')
power = sft$powerEstimate # 22
holder_corr <- cor # Keep the original cor; it is temporarily replaced with the WGCNA cor function
cor <- WGCNA::cor
flower_network <- blockwiseModules(normalized_reads_transposed,
    power = 22, # Based on the soft-threshold function above
    TOMType = 'signed', # Signed TOM, matching the signed adjacency
    mergeCutHeight = 0.25, # Merge threshold (numeric, not a string)
    numericLabels = FALSE, # Label modules with colors
    maxBlockSize = 20000, # Clustering block size; the data has 18K genes, so I want them in one block
    verbose = 3, # Info output
    corType = 'pearson', # Pearson correlation
    networkType = 'signed', # Signed network for biological regulation
    saveTOMs = TRUE,
    saveTOMFileBase = 'TOM',
    randomSeed = 1234)
cor <- holder_corr # Restore the original cor function
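For anyone wanting to check the power choice, here is a minimal sketch of how the soft-threshold diagnostics could be plotted from sft$fitIndices. The helper names (sft_diagnostics, plot_sft_diagnostics), the 0.85 cutoff line, and the toy numbers are assumptions for illustration only; the column names (Power, SFT.R.sq, slope, mean.k.) are the ones pickSoftThreshold returns.

```r
# Hypothetical helper: extract the two quantities usually plotted from
# the fitIndices table returned by WGCNA::pickSoftThreshold().
sft_diagnostics <- function(fit_indices) {
  data.frame(
    power = fit_indices$Power,
    # Signed R^2: positive only when the fitted slope is negative
    signed_r2 = -sign(fit_indices$slope) * fit_indices$SFT.R.sq,
    mean_k = fit_indices$mean.k.
  )
}

# Hypothetical helper: draw the scale-free fit and mean connectivity panels.
# r2_cutoff is an assumed reference line, not a value from this thread.
plot_sft_diagnostics <- function(fit_indices, r2_cutoff = 0.85) {
  d <- sft_diagnostics(fit_indices)
  old_par <- par(mfrow = c(1, 2))
  on.exit(par(old_par))
  plot(d$power, d$signed_r2, type = "n",
       xlab = "Soft threshold (power)",
       ylab = "Scale-free topology fit, signed R^2",
       main = "Scale independence")
  text(d$power, d$signed_r2, labels = d$power, col = "red")
  abline(h = r2_cutoff, col = "red")
  plot(d$power, d$mean_k, type = "n",
       xlab = "Soft threshold (power)",
       ylab = "Mean connectivity",
       main = "Mean connectivity")
  text(d$power, d$mean_k, labels = d$power, col = "red")
  invisible(d)
}

# Toy table with made-up numbers, only to show the expected column names
toy_fit <- data.frame(Power = c(1, 6, 12),
                      SFT.R.sq = c(0.10, 0.60, 0.85),
                      slope = c(0.5, -1.0, -1.5),
                      mean.k. = c(4000, 300, 40))
toy_diag <- sft_diagnostics(toy_fit)
```

With the real data this would be called as plot_sft_diagnostics(sft$fitIndices). The signed R^2 is used so that powers where the slope has the wrong sign show up below zero instead of looking like a good fit.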
That's what a scale-free topology network should look like.
However, it is a little strange that every module has a scale-free topology, because RNA-seq data are not typically scale-free. Perhaps you should provide the chunk of code used to export the WGCNA modules to Cytoscape, and say how many samples you used for the analysis.
Thanks for getting back to me Andres, here is the code I used to export the network. If you think this chunk of code could be useful let me know and I will upload the rest of it.
How did you get TOM_matrix and module_colors? Please, show me the code. Also, can you show me the Scale Free Topology plot and the Mean Connectivity plot, and tell me on how many samples did you run WGCNA?
Okay, in order:
I picked 22 according to the function I mentioned earlier.
How many samples: 24 samples, as I have 8 timepoints (growth stages), each with 3 biological replicates.
How did I obtain TOM_matrix and module colors:
For the chunk above, I saved the TOM using saveTOMs = T and then loaded it from the RData file that the function outputs, using load(etc..). That's how I obtained the TOM matrix. What was odd about it is the diagonal: the similarity of a gene with itself should be 1, as the TOMsimilarityFromExpr function gives out, but the TOM I loaded from blockwiseModules has these values set to 0. I think it is because it views them as distances? The distance between me and myself is 0? I'm not sure.
I obtained the colors from $colors in flower_network (the blockwiseModules output) and then converted the labels into colors. That's how I obtained module_colors.
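One possible explanation for the zero diagonal, sketched in base R under an assumption: if the TOM saved by blockwiseModules is stored as a "dist"-class object (worth confirming with class() on the loaded object), then converting it back with as.matrix() fills the diagonal with 0, regardless of what the original self-similarity was. The matrix values below are made up.

```r
# Toy 3x3 similarity matrix with 1 on the diagonal (made-up values)
genes <- c("g1", "g2", "g3")
sim <- matrix(c(1.0, 0.4, 0.2,
                0.4, 1.0, 0.7,
                0.2, 0.7, 1.0),
              nrow = 3, dimnames = list(genes, genes))

# A "dist" object keeps only the lower triangle; the diagonal is dropped
sim_as_dist <- as.dist(sim)

# Converting back to a full matrix fills the diagonal with 0, not 1
roundtrip <- as.matrix(sim_as_dist)

# Put the self-similarity back if downstream code expects it
restored <- roundtrip
diag(restored) <- 1
```

If that is what is happening, the off-diagonal values are still similarities and not distances; only the diagonal is affected by the storage format.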
Your modules are already labeled as colors so you do not need
By the way, I don't think this is the problem. Would you mind sharing the normalized_reads_transposed file?
Of course. The normalized reads file is the one where I normalized the raw data using VST, and there is another one that was pre-normalized for me on a log2 scale.
I will link both in a google drive below: https://drive.google.com/drive/folders/1fde_vSnInrISZc-TUm5YsJBk2A1P0C2w?usp=sharing
DEG: normalized log2 differentially expressed genes
Normalized reads: VST normalized genes.
Both come from the same raw file, just with a different normalization attempt, based on suggestions I got from Biostars members that VST is preferred for WGCNA.
Note: the original raw count was ~29K.
Here is what I did to obtain the genes you see in normalized reads
Thanks for the file.
I will try to replicate the analysis during the weekend.
Thank you Andres.