Question: WGCNA and SC-RNA Seq data
4 days ago by
pennakiza50 wrote:


I have a dataset of single-cell expression data (at the moment working on CD4 cells only) from 4 patients. Would 4 patients be enough to get any significant results, considering that my sample number is essentially 1200 cells?

Thank you in advance!

modified 3 days ago • written 4 days ago by pennakiza50

Hi Kevin,

I have filtered my dataset for low counts, so I have ended up with ~850 genes, and WGCNA runs quite smoothly but the module-trait correlations that I see are quite weak. I was wondering if that is because I am working with so few genes or because all those cells come from only 4 patients.


written 3 days ago by pennakiza50

There could be a few reasons. So, you have 850 genes x ~1200 cells? I'm still not sure that WGCNA is best for scRNA-seq data, and I believe running WGCNA on PC eigenvectors would be better (as I explain in my answer, below). The cellular heterogeneity that comes with scRNA-seq datasets may be what is 'beating' WGCNA in this case, as may the fact that you are effectively dealing with 4 batches (4 samples). Or have you run it on the 'integrated' dataset after adjustment for batch?

You are literally the first person that I have ever heard of using WGCNA on scRNA-seq data.

written 3 days ago by Kevin Blighe59k

Please use ADD COMMENT/ADD REPLY when responding to existing posts to keep threads logically organized. This comment belongs under Kevin's answer.

SUBMIT ANSWER is for new answers to original question.

modified 3 days ago • written 3 days ago by genomax83k
4 days ago by
Kevin Blighe59k wrote:

To run WGCNA on such a dataset, you will require a lot of RAM, assuming that you want to run it over the entire transcriptome of each cell. Moreover, I question what exactly the results would mean when compared with the output of other methods such as t-SNE, UMAP, pseudotime analysis, etc.

None of us can stop you going ahead with this, but I just question what exactly it would mean. The aforementioned data reduction methods were designed specifically to reduce the computational burden of processing and interpreting scRNA-seq data. Thus, it may make more sense to run WGCNA on the top principal components that together account for an appreciable share of the explained variation, e.g. > 80%.
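
A minimal sketch of how one might pick that number of components, assuming the per-component variances have already been obtained from a PCA of the expression matrix. The function name `n_components_for_variance` and the example variances are hypothetical (made-up numbers, not from this dataset), and the cutoff is applied as "at least" the threshold:

```python
from itertools import accumulate


def n_components_for_variance(component_variances, threshold=0.80):
    """Return the smallest number of leading PCs whose cumulative
    share of the total variance reaches `threshold`."""
    if not 0 < threshold <= 1:
        raise ValueError("threshold must be in (0, 1]")
    total = sum(component_variances)
    if total <= 0:
        raise ValueError("component variances must sum to a positive value")
    # Sort descending so PCs are taken in order of variance explained.
    ordered = sorted(component_variances, reverse=True)
    for k, cumulative in enumerate(accumulate(ordered), start=1):
        if cumulative / total >= threshold:
            return k
    return len(ordered)


# Hypothetical per-PC variances, for illustration only.
example_variances = [40.0, 25.0, 10.0, 8.0, 7.0, 5.0, 3.0, 2.0]
k = n_components_for_variance(example_variances, threshold=0.80)
print(k)
```

The cell scores on the first `k` components, rather than the per-gene counts, would then be what is passed to WGCNA.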


written 4 days ago by Kevin Blighe59k