Query regarding WGCNA module
3 months ago
abhisek061 ▴ 30

What is needed to produce a WGCNA module? Is a series matrix of 15 samples enough to create one? I have repeatedly failed to create a module; could anyone with experience please clarify? Another question: which type of gene ID does WGCNA accept, and which gives meaningful insight? Thanks!

WGCNA R RNA-seq • 510 views
3 months ago

I have repeatedly failed to create a module; could anyone with experience please clarify?

Please explain what you have tried and what you mean by 'failed'.

I suggest that you first go through all of the WGCNA tutorials before then proceeding to analyse your own data.


Following the WGCNA tutorial, I tried to create a WGCNA module from the series matrix of an RNA-seq experiment that was designed specifically for differential gene expression analysis. I failed right at the data-loading step. The dataset had 9 samples. I was only trying to learn the process rather than to obtain accurate biological interpretations.


Okay, cool, what are the errors that you are receiving? Please show code, where appropriate.


As a beginner, I am following the WGCNA tutorial to understand the method better. I loaded my data for analysis and am now facing a small issue.

workingData = read.csv("final_series_matrix_wgcna.csv")


Now I need to remove the auxiliary data and transpose the expression data for further analysis in WGCNA. What code should I use for this? Please help me. The first few lines of my series matrix file are as follows:

Geneid      CK1     CK2     CK3     LD1     LD2     LD3     MD1     MD2     MD3
TEA_006622  206070  179461  160381  184100  173045  298960  133917  127452  137454
TEA_015735  914505  594745  872268  758049  656383  758961  675285  467605  744591
TEA_008699  349843  211260  312973  253565  242614  313117  290451  184362  256464
TEA_024572  55202   20844   23774   50355   30379   85808   23103   20746   38982


As additional information: I created my series matrix by selecting the 4,000 genes with the highest standard deviation out of the 30,000 genes in the experiment. Will this filtering cause any problems? Thanks!


Now I need to remove the auxiliary data and transpose the expression data for further analysis in WGCNA.

These are just standard data operations. Transposing can be done with t(), while removing 'auxiliary' data can be done via data-frame indexing. If in doubt, please search online for simple things like subsetting data in R and transposing data in R.
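As a rough sketch of those two operations, assuming the file is comma-separated and that the first column (Geneid) is the only non-expression column: the inline text below stands in for final_series_matrix_wgcna.csv and reuses the rows shown in the question, and the name datExpr is just an illustrative choice.

```r
# Stand-in for read.csv("final_series_matrix_wgcna.csv"); values copied from the post.
workingData <- read.csv(text = "Geneid,CK1,CK2,CK3,LD1,LD2,LD3,MD1,MD2,MD3
TEA_006622,206070,179461,160381,184100,173045,298960,133917,127452,137454
TEA_015735,914505,594745,872268,758049,656383,758961,675285,467605,744591
TEA_008699,349843,211260,312973,253565,242614,313117,290451,184362,256464
TEA_024572,55202,20844,23774,50355,30379,85808,23103,20746,38982",
  stringsAsFactors = FALSE)

# Drop the gene ID column by negative indexing, then transpose with t()
# so that rows are samples and columns are genes.
datExpr <- as.data.frame(t(workingData[, -1]))

# Carry the gene IDs over as column names.
colnames(datExpr) <- workingData$Geneid

dim(datExpr)  # samples x genes
```

If your file has further annotation columns, drop those as well in the indexing step, e.g. `workingData[, -c(1, 2)]`.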

As additional information: I created my series matrix by selecting the 4,000 genes with the highest standard deviation out of the 30,000 genes in the experiment. Will this filtering cause any problems? Thanks!


I am trying to create a WGCNA module, but I am not getting any colors in the cluster plot (it is only black), and the heatmap and gene names are missing as well. I provided only one series matrix. A small piece of my code is below; please take a look. Thanks.

plot(sampleTree, main = "Sample clustering to detect outliers", sub="", xlab="", cex.lab = 1.5, cex.axis = 1.5, cex.main = 2)
traitColors = numbers2colors(datExpr1, signed = TRUE)
plotDendroAndColors(sampleTree, traitColors, groupLabels = names(datExpr1), main = "Sample dendrogram and trait heatmap")


Your data may be too 'flat'; thus, WGCNA only identifies a single cluster. Please review all diagnostic plots along the way, such as soft threshold, tree cut height, etc.


Thanks. I am facing a problem with data filtering. The WGCNA tutorial advises: "Probesets or genes may be filtered by mean expression or variance (or their robust analogs such as median and median absolute deviation, MAD) since low-expressed or non-varying genes usually represent noise."

My question is: which should I apply? Is "mean-variance" calculated in a similar way to the mean? And if I calculate it for each gene, which genes should I keep to get better results: those with low values or those with high values?

If I use the median absolute deviation instead, which values should I keep in that case? Thanks!


Hi, they literally mean mean expression (mean()) or variance (var()). In R, you'd calculate these per row (gene), via:

apply(mat, 1, function(x) mean(x, na.rm = TRUE))
apply(mat, 1, function(x) var(x, na.rm = TRUE))


Then choose a cut-off to use.
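As a minimal sketch of that last step, assuming (per the tutorial text quoted above) that the low or non-varying genes are the ones to drop: the toy matrix, the name nTop and the value 3 are all made up for illustration, and the robust analog uses base R's mad().

```r
# Toy matrix: rows are genes, columns are samples (random values, illustration only).
set.seed(1)
mat <- matrix(rnorm(60, mean = 10), nrow = 6,
              dimnames = list(paste0("gene", 1:6), paste0("S", 1:10)))
mat["gene1", ] <- 10  # a constant, non-varying gene

# Per-gene variance and its robust analog, the median absolute deviation.
geneVar <- apply(mat, 1, function(x) var(x, na.rm = TRUE))
geneMad <- apply(mat, 1, function(x) mad(x, na.rm = TRUE))

# Keep the nTop most variable genes; nTop is an arbitrary, hypothetical cut-off.
nTop <- 3
keep <- names(sort(geneVar, decreasing = TRUE))[seq_len(nTop)]
matFiltered <- mat[keep, , drop = FALSE]

dim(matFiltered)  # nTop genes x samples
```

Ranking on geneMad instead of geneVar works the same way. A fixed threshold, e.g. `mat[geneVar > someValue, ]`, is the other common option; either way, the cut-off itself is a judgment call for your data.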