Question: WGCNA - How to get multiple clusters
15 months ago, emilaves94iv wrote:

I have been following the WGCNA tutorial by Peter Langfelder and Steve Horvath (https://horvath.genetics.ucla.edu/html/CoexpressionNetwork/Rpackages/WGCNA/Tutorials/), but with my own dataset. However, my final dendrogram shows essentially ALL of the genes in my dataset clustering together. What should I do, and which parameters affect this?

rna-seq wgcna

15 months ago, Kevin Blighe wrote:

I have decided to post an answer, as this thread will likely attract hits from search engines (based on your question title).

It frequently happens that most or all genes are assigned to a single module, and you will have to go back through each step to understand why. We cannot see any of your code, your input, or your output, so we cannot know precisely where the issue lies.

Some things at which to look and on which to ponder:

  1. what is your input data? - input data should be normalised and, preferably, log-transformed (natural or base 2), regularised-log transformed, variance-stabilised, or converted to Z-scores
  2. is your input data too 'flat'? - check it in histograms, boxplots, and scatterplots. A person with OCPD (Obsessive Compulsive Personality Disorder, which is different from OCD) will want a very neat dataset with all 'lumps' removed; however, biology never works that way. In the act of making data too 'clean', one may inadvertently eliminate the very signal that one wishes to detect
  3. what is your sample n? - a low sample n will be problematic, because correlations estimated from only a few samples are noisy
  4. review the output of all of your WGCNA commands - do not just run the commands blindly from start to finish
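  5. ensure that you have chosen an appropriate soft-thresholding power - a power that is too low leaves almost every gene connected to every other gene
  6. review your tree cut height for merging modules - a merge height that is too high will collapse distinct modules into one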
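
To illustrate point 5, here is a minimal sketch of why the soft-thresholding power matters. It is written in plain Python rather than R, and it is not WGCNA itself: it only reproduces the unsigned adjacency, |cor|^power, on made-up toy data. The function names, the toy data, and the list of powers are all assumptions for illustration. In WGCNA, `pickSoftThreshold` reports the mean connectivity for each candidate power alongside the scale-free fit.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mean_connectivity(expr, power):
    """Mean connectivity under an unsigned adjacency, |cor|^power.

    expr is a list of genes; each gene is a list of per-sample values.
    A gene's connectivity is the sum of its adjacencies to all other genes.
    """
    n_genes = len(expr)
    connectivity = [0.0] * n_genes
    for i in range(n_genes):
        for j in range(i + 1, n_genes):
            adjacency = abs(pearson(expr[i], expr[j])) ** power
            connectivity[i] += adjacency
            connectivity[j] += adjacency
    return sum(connectivity) / n_genes


# Hypothetical toy data: 40 genes x 12 samples, all sharing one weak common
# signal (a batch effect or un-normalised library size can look like this).
random.seed(1)
n_samples = 12
shared = [random.gauss(0, 1) for _ in range(n_samples)]
expr = [[0.6 * s + random.gauss(0, 1) for s in shared] for _ in range(40)]

powers = (1, 2, 4, 6, 8, 12)
connectivity_by_power = {p: mean_connectivity(expr, p) for p in powers}
for p, k in connectivity_by_power.items():
    print(f"power={p:2d}  mean connectivity={k:.3f}")
```

Mean connectivity falls as the power rises, because every |cor| below 1 shrinks when raised to a higher power. If the power is too low, weak background correlations keep most genes connected to each other, which is one way to end up with a single giant module.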

I hope that these pointers help.

See also this previous answer, in which the user pointed to her input data and sample n as the source of the problem: C: WGCNA- Large number of genes clustering under one Module

Kevin
