Expression of genes in a module very different from the representative module eigengene expression (WGCNA)
2
2
8.0 years ago

I am using WGCNA for the first time to identify gene co-expression modules across a time course with 40+ samples (RNA-Seq). I've removed lowly expressed genes and focused the analysis on the 15,000 most dynamically expressed genes. The scale-free topology fit and other indicators (sample clustering, etc.) look good and similar to other example datasets. The network is signed.

When I plot the "representative" merged module eigengenes, I get eigengene profiles for each module that agree well with my biological expectations. Nevertheless, when I look more closely at specific genes of interest in any particular module, their expression pattern is completely discordant with, and sometimes entirely opposite to, that of the "representative" eigengene.

Is this expected and, if so, to what extent? What could be causing this discordance? What parameters can I modify to make the co-expression clusters "tighter"? Is that necessary?

I suspect one of the factors that might be resulting in this pattern is module merging. I used a merging distance threshold of 0.2 to merge modules, but that might be a bit too permissive, although I'm not sure if there is a better way to choose this threshold. I am in fact expecting a large number of modules, as the conditions I'm comparing are fairly different biologically.

Find the module dendrogram below, as well as the merging cut-off (in red).

Any insights/pointers/suggestions would be greatly appreciated.

Thanks!

rna-seq Gene expression WGCNA network • 10k views
0

Hi Liz, and Devon,

I would be interested in hearing how this experience ended up, since I came across a similar problem. I would find it very useful to share experiences with WGCNA, since many of the functions in the package have options that are not very thoroughly documented.

My experience: I constructed a signed network, extracted modules with dynamicTreeCutting, then wanted to merge closely related modules (since typically dynamicTreeCutting tends to give a large number of highly correlated modules).

I proceeded with the default merging approach, but in subsequent analyses on the modules (sanity checks, let's call them) I figured that the merged modules were a bit odd; in particular what I found worrying was a very poor correlation between gene significances and module memberships for given trait associations, which is what I was ultimately interested in (in addition, global in-module expression patterns weren't very consistent, as I think I understand you experienced as well).

Going back and forth with distinct parameters and reading through all function descriptions, I discovered the possibility of merging modules using 1 - abs(correlation) as distance measure, and this seems to give results that are much more robust (at least in terms of expression patterns and gene significance/module membership correlations).
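To make the difference between the two merging distances concrete, here is a minimal sketch in plain Python (not WGCNA code). The eigengene profiles, the function names and the 0.2 cut are made up for illustration; the only thing taken from the discussion is the idea of comparing 1 - correlation with 1 - abs(correlation) between module eigengenes.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def merge_distance(me_a, me_b, use_abs=False):
    """Eigengene dissimilarity: 1 - cor, or 1 - |cor| when use_abs is True."""
    r = pearson(me_a, me_b)
    return 1 - abs(r) if use_abs else 1 - r

# Hypothetical eigengene profiles over six time points (illustration only)
me_up = [-2.0, -1.0, 0.0, 0.5, 1.0, 1.5]
me_up_noisy = [-1.8, -1.2, 0.1, 0.4, 1.1, 1.4]   # similar trend
me_down = [-v for v in me_up]                     # exactly opposite trend

cut_height = 0.2  # assumed threshold: pairs closer than this would be merged

for label, other in [("similar", me_up_noisy), ("opposite", me_down)]:
    for use_abs in (False, True):
        d = merge_distance(me_up, other, use_abs)
        print(f"{label:8s} use_abs={use_abs!s:5s} "
              f"distance={d:.3f} merged={d < cut_height}")
```

With these toy profiles the two measures agree for the similar pair. For the anti-correlated pair they differ: 1 - correlation gives the largest possible distance (2), while 1 - abs(correlation) gives 0, so that pair would fall under the cut. It is worth checking which behaviour you want before choosing between them.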

In my mind this makes sense, given that we are working with signed networks, but I never really clarified whether this makes sense only to me or it's truly a valid option.

Marge

0

Hi Marge, hopefully Liz will reply with how this worked for her. For my part, I too have needed to monkey around with things a bit to get seemingly meaningful results (and then the modules that correlate with what I'm interested in generally end up being just a superset of DE genes found with DESeq2, so I don't really use WGCNA much anymore). Intuitively, at least, if a module is correlated with a trait, then I too would generally expect to see a rough correlation between module membership and significance. I think that I had previously used 1-abs(correlation) as the distance measure myself, so I guess that made sense to me too (for whatever that's worth!).

1

Hi Marge,

As I said in my response to Devon, I ended up merging at a much lower threshold (0.1), which effectively resulted in only three eigengenes merging (tan, brown and salmon in the dendrogram above). In fact I get very meaningful results with this approach. I do not have correlations to traits (as this is a time course), so I'm more interested in eigengenes that are overexpressed in any particular time frame.

In my hands the eigengene expression patterns correlate beautifully with the biological expectations and show differential gene enrichments that make sense in our biological framework. To get a handle on what to focus on in the network, I'm now combining hub gene identification with differential gene expression layered on top of each module, to see if there are any particular genes that we can focus on as drivers of these gene expression changes.

I have also done some testing in a different system where we have far fewer samples (~30), and the pattern does not make much sense. It seems WGCNA might perform much better the more samples you have, although the gene expression measurements were done on two different platforms, so that might also be affecting the results.

Hope this helps,

Liz

0

Dear Liz,

Thanks a lot for the feedback, it definitely helps!

Marge

0

Hi Devon,

Thanks for the feedback. To start with, it's great to hear that I am not the only one that stumbled upon the abs option ;-)

The appeal of WGCNA over differential expression is (at least for me) the possibility (in principle) of extrapolating suggestive regulatory links from the connectivity properties of the modules. Also, the module seems the perfect context in which to test pathway enrichments and assign function with a guilt-by-association approach. That is the theory; in practice, I have to say honestly that I've not so far seen very clean and clear results (and it's been a while, both in terms of time and in terms of data).

3
8.0 years ago

Some degree of discordance with a module's eigengene is expected. Have a look at the module membership values for these genes; it may well be that they're only borderline members. Regarding tightening up membership in a module, one simple method would be to cut the dendrogram at a lower value (exactly as you suspected). This will, of course, increase your overall module number, but since you're expecting that anyway it's unlikely to be an issue.
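As a rough illustration of the module membership check, the sketch below (plain Python, not WGCNA code) treats membership as the Pearson correlation between each gene's profile and the module eigengene. The eigengene, the gene profiles and the 0.7 cutoff are all invented for the example; pick a cutoff that suits your own data.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def module_membership(gene_profiles, eigengene):
    """Correlation of every gene in the module with the module eigengene."""
    return {gene: pearson(profile, eigengene)
            for gene, profile in gene_profiles.items()}

# Hypothetical eigengene and gene profiles over five samples
eigengene = [-0.5, -0.3, 0.0, 0.2, 0.6]
genes = {
    "geneA": [1.0, 2.1, 3.0, 3.9, 6.2],   # follows the eigengene closely
    "geneB": [5.0, 4.0, 4.5, 5.5, 4.8],   # only loosely related
    "geneC": [6.0, 5.0, 3.5, 3.0, 1.0],   # runs opposite to the eigengene
}

kme = module_membership(genes, eigengene)
threshold = 0.7  # arbitrary cutoff chosen for this illustration

borderline = sorted(g for g, v in kme.items() if v < threshold)
for gene, value in sorted(kme.items()):
    print(f"{gene}: membership = {value:.2f}")
print("weak or discordant members:", borderline)
```

Genes that land in the flagged list are the ones whose individual profiles are least likely to resemble the eigengene plot.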

BTW, you can also look at some gene correlation heatmaps, which might help you determine a better cutting threshold.

0

Thanks for the suggestion! I tried lowering the threshold to 0.1 and, although the eigengene expression is very similar between clusters that were previously merged (as expected), I get much more informative Gene Ontology enrichments for each module and concordant gene expression heatmaps.

I was curious mainly because in Figure 1C (http://www.biomedcentral.com/1752-0509/1/54/figure/F1) of the "Eigengene networks for studying the relationships between co-expression modules" paper the expression seems very tightly related, but I guess they are only showing the best example.

2

The "representative example" in papers is never really representative :P

I wouldn't be surprised if the optimal threshold for cutting varies a bit between experiments and depending on what one wants to do downstream. Anyway, glad you're getting better results now.

0
4.7 years ago
BrunoGiotti ▴ 110

Hi there, I realise this post is 3 years old. Nonetheless, I feel your pain and can't help trying to help with your ancient problem. Here it is: try using the argument networkType = 'signed hybrid' in the function blockwiseModules. I ran into the same problem without it, as apparently even in a signed network you may find negatively correlated genes within a module.
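Here is a small sketch (plain Python, not WGCNA code) of why that can happen, based on my understanding of how the two adjacencies are defined: "signed" uses ((1 + cor) / 2)^power, while "signed hybrid" uses cor^power for positive correlations and 0 otherwise. The function names and the power of 12 are just assumptions for the illustration.

```python
def signed_adjacency(r, power):
    """'signed' network: negative correlations map to small but nonzero values."""
    return ((1 + r) / 2) ** power

def signed_hybrid_adjacency(r, power):
    """'signed hybrid' network: negative correlations are set to exactly zero."""
    return r ** power if r > 0 else 0.0

power = 12  # hypothetical soft-thresholding power

for r in (0.9, 0.3, -0.3, -0.9):
    print(f"cor={r:+.1f}  signed={signed_adjacency(r, power):.2e}  "
          f"signed hybrid={signed_hybrid_adjacency(r, power):.2e}")
```

In the signed case a negatively correlated pair still gets a (tiny) nonzero connection strength, whereas in the signed hybrid case the connection is removed entirely.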

Cheers!