Question

WGCNA eigengenes and sample number reduction

4.1 years ago

AJ ▴ 20

Hi,

When I call moduleEigengenes and look at its $eigengenes element, I get the following matrix:

          MEmagenta     MEgreen MElightyellow MEroyalblue    MEyellow  MEturquoise   MElightcyan
 F2_15  -0.02011839  0.10710884  -0.016934090  0.32672467  0.16785401 -0.063904392  0.3517906757
 F2_66  -0.05859031  0.02186919  -0.426863142 -0.26950299 -0.29492186  0.559808442  0.3414648550
 F2_78  -0.03126631  0.10380786   0.019135631 -0.02660238 -0.38110715 -0.318314637 -0.0483743722
 F2_86   0.38744425  0.17416983   0.261450692  0.23142830  0.11424128  0.100530667 -0.1398993598
 F2_88  -0.01556975 -0.16161807  -0.230524538  0.40320534  0.07030380  0.090658505 -0.2378646388

etc. What does this matrix mean? How can I get just the single Eigengene for each module?

Also, how do I reduce the number of samples in the following line:

datExpr = datExpr0[keepSamples, ]

wgcna gene correlation network gene genome • 3.2k views

ADD COMMENT • link 4.1 years ago by AJ ▴ 20

"Could you explain better"

I want all the genes in the 1st Principal Component which contribute to the Eigengene of a specific module, say black module.

ADD REPLY • link 4.1 years ago by AJ ▴ 20

The genes in a given module that contribute most to the 1st PC are the ones with the highest module membership (kME), i.e. the hub genes. To calculate kME you could use the function signedKME. Alternatively, you can use intramodularConnectivity to find out which genes have the highest intramodular connectivity (another way of identifying hub genes).
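To make the idea concrete, here is a minimal base-R sketch of what kME measures: the correlation of each gene's expression with a module eigengene. The toy data and the names `expr`, `module_genes` and `ME` are made up for illustration; in a real analysis the signedKME function mentioned above would be applied to datExpr and the eigengene data frame instead.

```r
# Toy data (hypothetical): 20 samples, 3 co-expressed genes and 1 unrelated gene
set.seed(1)
n_samples <- 20
signal <- rnorm(n_samples)                       # shared module signal
expr <- cbind(
  g1 = signal + rnorm(n_samples, sd = 0.2),      # strongly co-expressed
  g2 = signal + rnorm(n_samples, sd = 0.2),
  g3 = signal + rnorm(n_samples, sd = 0.5),
  g4 = rnorm(n_samples)                          # unrelated gene
)
module_genes <- c("g1", "g2", "g3")

# Eigengene: first right-singular vector of the scaled genes x samples matrix
scaled <- t(scale(expr[, module_genes]))         # genes in rows, samples in columns
ME <- svd(scaled)$v[, 1]
# Orient the eigengene so it correlates positively with the mean module expression
if (cor(colMeans(scaled), ME) < 0) ME <- -ME

# kME: correlation of every gene (module member or not) with the eigengene
kME <- cor(expr, ME)[, 1]
print(sort(kME, decreasing = TRUE))
```

Genes with kME close to 1 track the eigengene closely, which is why they are treated as hub candidates.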

ADD REPLY • link 4.1 years ago by andres.firrincieli 3.6k

Thank you! So as per the documentation of signedKME, the input value (datExpr) would have to be the original expression matrix? So how would this be different from the datME?

ADD REPLY • link 4.1 years ago by AJ ▴ 20

"So how would this be different from the datME?"

In signedKME you correlate the expression of each gene with the 1st PC (ME) of every module. The inputs for signedKME are the expression matrix (datExpr) and the data frame containing the module eigengenes (datME).

ADD REPLY • link 4.1 years ago by andres.firrincieli 3.6k

"data frame containing module eigengenes"

As I understand it, no single gene is the module eigengene; it is an abstract numerical value. So this data frame would contain just the module eigengene values as columns and the samples as rows? If so, wouldn't there be fewer samples in datME than in the original expression matrix?

ADD REPLY • link 4.1 years ago by AJ ▴ 20

In WGCNA, genes are clustered according to their expression profiles across all your samples. Therefore, the number of samples in datME is the same as in datExpr.

ADD REPLY • link 4.1 years ago by andres.firrincieli 3.6k

below is what part of my datExpr looks like:

        MMT00000044 MMT00000046 MMT00000051 MMT00000076 MMT00000080 MMT00000102 MMT00000149
F2_15        -0.058     -0.0589      0.0871     -0.0439     -0.0371    -0.00846        0.12
F2_66       -0.0261      0.0731      0.0419     -0.0126     -0.0206       -0.04       0.149
F2_78       -0.0282      0.0187      0.0103     0.00166      0.0119     -0.0304    -0.00177
F2_86       -0.0249     -0.0215     -0.0285       0.188      -0.109      0.0127     0.00634
F2_88       -0.0205     -0.0193     -0.0135    -0.00652     -0.0137     0.00311       0.181
F2_143      -0.0217      0.0592     -0.0125       0.102    -0.00317     -0.0422      0.0213
F2_180       0.0321    -0.00677      -0.099      0.0379     -0.0063      0.0277      0.0778
F2_187       0.0112     0.00304     -0.0516      0.0418       0.108      0.0296     -0.0925

below is what part of my datME looks like:

             MEblack       MEblue      MEbrown
F2_15   -0.351464671  0.360194404 -0.086317519
F2_66    0.110635569  0.379714402  0.097380765
F2_78    0.220924856 -0.172952948   0.16290951
F2_86    0.143071929 -0.141407301   0.02424528
F2_88   -0.116867591  0.306669709 -0.105978168

is this correct?

ADD REPLY • link 4.1 years ago by AJ ▴ 20

Yes, it is. Does it look wrong to you?

ADD REPLY • link 4.1 years ago by andres.firrincieli 3.6k

thank you, it looks about right. Is there a threshold for the kME or intramodularConnectivity to select the genes that contribute to the 1st PC?

ADD REPLY • link 4.1 years ago by AJ ▴ 20

Unfortunately, no. I usually detect the interesting modules by correlating the 1st PC with my experimental variables. Once I have found the 'interesting' modules, I perform an enrichment analysis and then look at the hub genes to check whether the highly connected genes are part of the enriched terms (KEGG pathways or Gene Ontology).
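Since there is no fixed cutoff, one option is simply to rank the genes of a module by kME and inspect the top of the list. The sketch below assumes a named vector of kME values for one module; the gene names, the values and the `top_n` setting are all hypothetical, and `top_n` is an arbitrary choice rather than a WGCNA default.

```r
# Hypothetical kME values for genes assigned to one module
kME_black <- c(geneA = 0.95, geneB = 0.62, geneC = 0.88, geneD = 0.31, geneE = 0.79)

top_n <- 3                                        # arbitrary, chosen for illustration
ranked <- sort(abs(kME_black), decreasing = TRUE) # rank by absolute kME
hub_candidates <- names(ranked)[seq_len(min(top_n, length(ranked)))]
print(hub_candidates)
```

The resulting list is only a set of candidates to compare against the enrichment results, not a statistically justified selection.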

ADD REPLY • link 4.1 years ago by andres.firrincieli 3.6k

"Unfortunately, no."

In that case, on what basis would I select the genes that contribute to my 1st PC?

Also, the line below is part of the moduleEigengenes function, and I'm trying to see whether I can use it to find the genes that contribute to the eigengene. Could you shed more light on it? My datModule would consist of samples as rows and eigengenes belonging to a particular module as columns:

svd1 = svd(datModule, nu = min(n, p, nPC), nv = min(n, p, nPC))
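A rough base-R sketch of what that svd call yields, for anyone following along. It assumes, as far as I can tell from the moduleEigengenes source (worth checking against your installed WGCNA version), that datModule holds the standardized expression of one module's genes with genes in rows and samples in columns. Under that assumption the first column of `v` is the eigengene (one value per sample) and the first column of `u` holds the per-gene weights. The toy data and the names `module_expr`, `loadings` and `var_explained` are hypothetical.

```r
set.seed(42)
n_samples <- 12
signal <- rnorm(n_samples)
# Hypothetical module: 5 genes (columns) measured in 12 samples (rows)
module_expr <- sapply(1:5, function(i) signal + rnorm(n_samples, sd = 0.3 * i))
colnames(module_expr) <- paste0("gene", 1:5)

datModule <- t(scale(module_expr))      # genes x samples, each gene standardized
nPC <- 1
n <- nrow(datModule)
p <- ncol(datModule)
svd1 <- svd(datModule, nu = min(n, p, nPC), nv = min(n, p, nPC))

eigengene <- svd1$v[, 1]                # one value per sample
loadings <- svd1$u[, 1]                 # one weight per gene
names(loadings) <- rownames(datModule)
var_explained <- svd1$d[1]^2 / sum(svd1$d^2)  # share of variance captured by the 1st PC

print(sort(abs(loadings), decreasing = TRUE))
```

Genes with the largest absolute loading contribute most to the first PC, which is closely related to ranking them by kME.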

ADD REPLY • link 4.1 years ago by AJ ▴ 20

Tried the signedKME and looked at all the data/info but keep getting:

Error in colVars(datExpr, na.rm = TRUE) : Argument 'x' must be of type logical, integer or numeric, not 'character'.

The documentation and other sources don't seem to help with this particular error.

ADD REPLY • link 4.1 years ago by AJ ▴ 20

There is a problem with datExpr: your matrix is not numeric. Are you following the tutorial?

ADD REPLY • link 4.1 years ago by andres.firrincieli 3.6k

Yes. My datExpr and datME are what I've posted above; that's what they look like in the tutorial. Do I need to remove the sample names and any other character data so that both datExpr and datME contain only numeric values?

ADD REPLY • link 4.1 years ago by AJ ▴ 20

Hi, I was not able to replicate the error using the tutorial, so I guess you missed a step. I cannot resolve it from here. You could create a new post.

ADD REPLY • link 4.1 years ago by andres.firrincieli 3.6k

Could you please show me what your code and data matrices look like for calculating signedKME and intramodularConnectivity?

ADD REPLY • link 4.1 years ago by AJ ▴ 20

I was able to resolve this issue by removing the index/sample names at the beginning of each row. They were stored as character data, which is what caused the error.
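For anyone hitting the same error, a minimal sketch of one way to do this without losing the sample names: move the ID column into the row names so the matrix itself is purely numeric. The data frame `raw` and its `Sample` column are assumptions about how the table was read in; the values are copied from the datExpr excerpt above.

```r
# Hypothetical table as it might look after read.table():
# the first column holds sample IDs, so the table contains character data
raw <- data.frame(
  Sample = c("F2_15", "F2_66", "F2_78"),
  MMT00000044 = c(-0.058, -0.0261, -0.0282),
  MMT00000046 = c(-0.0589, 0.0731, 0.0187),
  stringsAsFactors = FALSE
)

datExpr <- as.matrix(raw[, -1])   # keep only the numeric gene columns
rownames(datExpr) <- raw$Sample   # sample IDs become row names, not data

stopifnot(is.numeric(datExpr))    # would fail if any character column were left
print(datExpr)
```

Keeping the IDs as row names matters later, because matching samples to traits relies on rownames(datExpr).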

ADD REPLY • link 4.1 years ago by AJ ▴ 20

Could you explain in more detail how to correlate the eigengenes with experimental variables?

ADD REPLY • link 4.1 years ago by AJ ▴ 20

library(WGCNA)
MEs = moduleEigengenes(datExpr, moduleColors)$eigengenes # module eigengenes (MEs): samples in rows, modules in columns
moduleTraitCor = cor(MEs, datTraits, use = "p") # correlate MEs with the experimental variables (pairwise complete observations)
nSamples = nrow(datExpr) # number of samples in datExpr
moduleTraitPvalue = corPvalueStudent(moduleTraitCor, nSamples) # Student p-values of the correlations
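As a sanity check on the p-value step, here is a base-R sketch of the Student t-test for a Pearson correlation, which is my understanding of what corPvalueStudent computes. The toy eigengene `ME` and binary `trait` are hypothetical; the result is compared against cor.test from the stats package.

```r
set.seed(7)
n <- 15
ME <- rnorm(n)                              # hypothetical module eigengene
trait <- rep(c(0, 1), length.out = n)       # hypothetical binary trait

r <- cor(ME, trait)
t_stat <- sqrt(n - 2) * r / sqrt(1 - r^2)   # t statistic for a Pearson correlation
p_manual <- 2 * pt(abs(t_stat), df = n - 2, lower.tail = FALSE)

p_cortest <- cor.test(ME, trait)$p.value    # reference value from stats::cor.test
print(c(r = r, p_manual = p_manual, p_cortest = p_cortest))
```

Note that these p-values are not corrected for the number of module-trait pairs being tested.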

Plot

sizeGrWindow(10,6)
textMatrix = paste(signif(moduleTraitCor, 2), "\n(",signif(moduleTraitPvalue, 1), ")", sep = "");
dim(textMatrix) = dim(moduleTraitCor)
par(mar = c(6, 8.5, 3, 3));
labeledHeatmap(Matrix = moduleTraitCor,
     xLabels = names(datTraits),
     yLabels = names(MEs),
     ySymbols = names(MEs),
     colorLabels = FALSE,
     colors = blueWhiteRed(50),
     textMatrix = textMatrix,
     setStdMargins = FALSE,
     cex.text = 0.3,
     zlim = c(-1,1),
     main = paste("Module-trait relationships"))

datTraits is a binary matrix of your experimental variables. The input matrix used to create datTraits should look like this:

Sample  Condition_1 Condition_2 Condition_3 Condition_4 Condition_5 Condition_6 Condition_7 Condition_8 Condition_9
Sample_1    1   0   0   0   0   0   0   0   0
Sample_2    1   0   0   0   0   0   0   0   0
Sample_3    1   0   0   0   0   0   0   0   0
Sample_4    0   1   0   0   0   0   0   0   0
Sample_5    0   1   0   0   0   0   0   0   0
Sample_6    0   1   0   0   0   0   0   0   0
Sample_7    0   0   1   0   0   0   0   0   0
Sample_8    0   0   1   0   0   0   0   0   0
Sample_9    0   0   1   0   0   0   0   0   0
Sample_10   0   0   0   1   0   0   0   0   0
Sample_11   0   0   0   1   0   0   0   0   0
Sample_12   0   0   0   1   0   0   0   0   0
Sample_13   0   0   0   0   1   0   0   0   0
Sample_14   0   0   0   0   1   0   0   0   0
Sample_15   0   0   0   0   1   0   0   0   0
Sample_16   0   0   0   0   0   1   0   0   0
Sample_17   0   0   0   0   0   1   0   0   0
Sample_18   0   0   0   0   0   1   0   0   0
Sample_19   0   0   0   0   0   0   1   0   0
Sample_20   0   0   0   0   0   0   1   0   0
Sample_21   0   0   0   0   0   0   1   0   0
Sample_22   0   0   0   0   0   0   0   1   0
Sample_23   0   0   0   0   0   0   0   1   0
Sample_24   0   0   0   0   0   0   0   1   0
Sample_25   0   0   0   0   0   0   0   0   1
Sample_26   0   0   0   0   0   0   0   0   1
Sample_27   0   0   0   0   0   0   0   0   1

How to create datTraits

trait = read.table("THP_Trait.txt", sep = "\t", header = TRUE);
Traits = trait[, c(1:10)];
Samples = rownames(datExpr);
traitRows = match(Samples, Traits$Sample);
datTraits = Traits[traitRows, -1];
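If the conditions are stored as a single label column rather than as 0/1 columns, the binary matrix can be derived instead of typed by hand. This is a sketch under that assumption; `sample_info` and its `Sample` and `Condition` columns are hypothetical names, and model.matrix is used here as one convenient way to build the indicator columns.

```r
# Hypothetical sample sheet with one condition label per sample
sample_info <- data.frame(
  Sample = paste0("Sample_", 1:6),
  Condition = rep(c("Condition_1", "Condition_2"), each = 3)
)

# One indicator (0/1) column per condition, no intercept
datTraits <- as.data.frame(model.matrix(~ 0 + Condition, data = sample_info))
colnames(datTraits) <- sub("^Condition", "", colnames(datTraits))  # drop the variable-name prefix
rownames(datTraits) <- sample_info$Sample
print(datTraits)
```

The rows still need to be in the same order as rownames(datExpr), which is what the match() call above takes care of.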

ADD REPLY • link 4.1 years ago by andres.firrincieli 3.6k

Thank you so much for your help. Really appreciate it!

ADD REPLY • link 4.1 years ago by AJ ▴ 20

score 0 · Answer 1 · 2020-03-10

Hi AJ,

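A module eigengene is the 1st principal component of the expression matrix of the corresponding module and is used to summarize the module's expression profile. The matrix you posted has one row per sample and one column per module, so each column is the eigengene of one module.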

"How can I get just the single Eigengene for each module?"

Could you explain better?

"Also, how do I reduce the number of samples"

You should do that only for outliers. If you have just one or a few samples to remove, you could simply use %in%:

datExpr<-datExpr0[!(row.names(datExpr0) %in% c("F2_15", "F2_66")), ]

If you have a lot of outliers, just follow the tutorial, which builds keepSamples by clustering the samples and cutting the tree.
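A base-R approximation of that clustering approach, as a sketch only: the tutorial, as I remember it, uses WGCNA's cutreeStatic, whereas this uses stats::cutree and keeps the largest cluster. The toy matrix and the `cutHeight` value are hypothetical; on real data the height should be chosen by inspecting plot(sampleTree).

```r
set.seed(3)
# Hypothetical expression matrix: 10 samples x 50 genes
datExpr0 <- matrix(rnorm(10 * 50, sd = 0.1), nrow = 10,
                   dimnames = list(paste0("S", 1:10), paste0("gene", 1:50)))
datExpr0["S10", ] <- datExpr0["S10", ] + 2     # make one sample an obvious outlier

sampleTree <- hclust(dist(datExpr0), method = "average")
cutHeight <- 5                                  # hypothetical cut height
clust <- cutree(sampleTree, h = cutHeight)      # cluster label per sample

# Keep the samples that fall in the largest cluster
largest <- as.integer(names(which.max(table(clust))))
keepSamples <- clust == largest
datExpr <- datExpr0[keepSamples, ]
print(rownames(datExpr))
```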