Question

WGCNA: Heterogeneity Within A Module Containing Co-Expressed Genes

1

11.3 years ago

mohan ▴ 10

Hello dear people,

I have been using the WGCNA package to build a coexpression network for a large microarray dataset. I have completed the initial network construction and am now trying to characterize the genes within each module for enrichment in biological processes. (http://labs.genetics.ucla.edu/horvath/CoexpressionNetwork/Rpackages/WGCNA/)

I retrieved all the genes for a module and first built a heatmap to check visually whether coexpression is apparent. I also plotted the eigengene expression values as a quantification. Here lies the issue.

Coexpression is observed for about 65 percent of the genes. But the rest behave in just the opposite fashion across samples (when their expression values are plotted), yet they are still considered coexpressed by the program. Since coexpression is based on correlations, I expected genes behaving in opposite fashion to fall into different modules rather than the same one. I have also gone through the literature and do not see this problem reported. Is there something I am missing here? Would anyone have a clue from their own experience? I have tried this on two different datasets and see the same problem.

If I may add, I have mostly been using the default parameters set by the authors, except for increasing the soft-thresholding power to 9 (from 6) and lowering mergeCutHeight to 0.1 from the default 0.25.

Thanks a lot for your kind replies.

best regards, mohan

• 4.4k views

ADD COMMENT • link updated 22 months ago by Ram 43k • written 11.3 years ago by mohan ▴ 10

0

"....there are the rest that behave in just the opposite fashion across samples (when plotted for their expression values) but still considered as coexpressing by the program..." - when you say they behave in the opposite fashion - do you mean that they are negatively correlated? I am not clear on what you mean.

ADD REPLY • link 11.3 years ago by Darren J. Fitzpatrick ★ 1.1k

0

Many thanks. With my limited knowledge of statistics, my naive thoughts are as follows.

WGCNA is based on Pearson correlations, so any two genes are either negatively correlated, positively correlated, or not correlated at all. In this particular experiment we analyse gene expression over a time course of 0-4 days in a growing tissue. I expected the best-correlated genes to be segregated into modules; that is, over the time course, the coexpressed genes would go up or down in expression in the same fashion. However, when I take the genes of a module after running WGCNA and plot a heatmap, I find this is not so. Around 65 percent of the genes go up and come down as expected per time point per sample. But the remaining 30% have expression levels that are just the opposite. I expected these to be negatively correlated and therefore to be in a different module. Why are they a mixed population? I hope I am clear. Thanks for the replies.

ADD REPLY • link 11.3 years ago by mohan ▴ 10

Ram · Answer 1 · 2014-07-14

2

9.8 years ago

antass ▴ 20

This is long overdue, but maybe it will help someone else with a similar problem.

For an unsigned network (the default), WGCNA defines its adjacency matrix as

aij = |cor(xi, xj)|^beta

The absolute value removes the direction of the correlation between two gene expression profiles. That explains why genes going in both directions are included in the same module - the adjacency between two negatively correlated genes can be the same as that between two positively correlated ones, so they are all equally close and end up in the same bag.
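To make the arithmetic concrete, here is a minimal sketch in plain Python (not R, and not WGCNA's own code). The three expression profiles are made up for illustration, and the helper names are hypothetical. It compares the unsigned adjacency above with the signed form ((1 + cor)/2)^beta, which WGCNA uses when a signed network type is requested (for example `type = "signed"` in `adjacency()`).

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(sxx * syy)

def unsigned_adjacency(x, y, beta):
    """a_ij = |cor(x_i, x_j)|^beta: the sign of the correlation is discarded."""
    return abs(pearson(x, y)) ** beta

def signed_adjacency(x, y, beta):
    """a_ij = ((1 + cor(x_i, x_j)) / 2)^beta: negative correlation maps towards 0."""
    return ((1 + pearson(x, y)) / 2) ** beta

# Made-up expression profiles over five time points (illustration only).
gene_up = [1.0, 2.0, 3.0, 4.0, 5.0]
gene_up2 = [1.1, 2.3, 2.9, 4.2, 5.1]
gene_down = [5.0, 4.0, 3.0, 2.0, 1.0]

beta = 9  # soft-thresholding power mentioned in the question

for name, other in [("up vs up2", gene_up2), ("up vs down", gene_down)]:
    print(name,
          "unsigned:", round(unsigned_adjacency(gene_up, other, beta), 4),
          "signed:", round(signed_adjacency(gene_up, other, beta), 4))
```

With the unsigned formula, the perfectly anti-correlated pair is as strongly connected as a positively correlated pair, which is why both kinds of genes can land in one module. With the signed formula, the anti-correlated pair gets an adjacency near zero, and every value stays between 0 and 1.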

ADD COMMENT • link updated 2.4 years ago by Ram 43k • written 9.8 years ago by antass ▴ 20

0

Hi Antass,

Thanks for your response, it was very helpful.

Is there a way to use WGCNA without including these "opposites" in the same module? I tried simply removing the abs() in the command from the tutorial, but then I just get the error message "some entries are not between 0 and 1". If that is not possible, are there downstream commands I could use to separate the genes within a module that have opposite expression profiles? Thanks!

ADD REPLY • link updated 22 months ago by Ram 43k • written 9.0 years ago by whatle • 0