Hi!
I'm trying to analyse my RNA-seq data using the WGCNA package for R. The package manual says that you need a minimum of 15 samples, although I've been reading other posts and the FAQ, and it seems one should be able to run an analysis with fewer than 15 samples.
In my case, I have RNA-seq gene count data for two conditions, with three biological replicates per group. I wanted to build a WGCNA network for both my WT and experimental samples. I used DESeq2 for the differential gene expression (DGE) analysis, and I also obtained the normalised counts for each condition. In the end, my input data for WGCNA has this format:
         gene1  gene2  gene3  …  geneN
Sample1
Sample2
Sample3
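For reference, a layout like this (samples in rows, genes in columns) can be obtained by transposing a genes-by-samples matrix. The following is only a minimal sketch in base R: `norm_counts` and its values are made-up stand-ins for the DESeq2 normalised counts, not my real data.

```r
# Hypothetical stand-in for normalised counts: genes in rows, samples in columns.
set.seed(1)
norm_counts <- matrix(
  rpois(15, lambda = 50),
  nrow = 5, ncol = 3,
  dimnames = list(paste0("gene", 1:5), paste0("Sample", 1:3))
)

# Transpose so that each row is a sample and each column is a gene.
datExpr0 <- as.data.frame(t(norm_counts))

dim(datExpr0)       # number of samples, number of genes
rownames(datExpr0)  # sample names
```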
My problem comes when I want to run the function goodSamplesGenes, which returns this error in R:
gsg = goodSamplesGenes(datExpr0, verbose = 3)
 Flagging genes and samples with too many missing values...
  ..step 1
Error in goodGenes(datExpr, goodSamples, goodGenes, minFraction = minFraction, :
  Too few genes with valid expression levels in the required number of samples.
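One possible explanation, offered as an assumption rather than a confirmed diagnosis: the check seems to require each gene to have valid values in a minimum number of samples, and as far as I understand the default minimum is 4, which three samples can never satisfy. The base-R sketch below mimics that kind of check on a made-up table; `min_n_samples` and the toy values are hypothetical, and the real default should be verified in the goodSamplesGenes documentation.

```r
# Toy expression table: 3 samples (rows) x 4 genes (columns); values are made up.
datExpr0 <- data.frame(
  gene1 = c(10, 12, 11),
  gene2 = c(0, 0, 0),     # zero variance across samples
  gene3 = c(5, NA, 7),    # one missing value
  gene4 = c(3, 8, 6),
  row.names = paste0("Sample", 1:3)
)

# Assumed minimum number of samples with valid values (verify against the docs).
min_n_samples <- 4

# Number of non-missing samples per gene.
n_valid <- colSums(!is.na(datExpr0))

# Variance of each gene across samples, ignoring missing values.
gene_var <- sapply(datExpr0, var, na.rm = TRUE)

# A gene passes only if it has enough valid samples and non-zero variance.
passes <- n_valid >= min_n_samples & gene_var > 0

n_valid
sum(passes)  # number of genes that would be kept
```

With only three rows, no gene can reach four valid samples, so nothing passes regardless of the expression values.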
Also, if I look at the pickSoftThreshold output, the scale-free topology fit values (SFT.R.sq) I obtain are really low:
   Power SFT.R.sq    slope truncated.R.sq mean.k. median.k. max.k.
 1     1 3.55e-01  3.06000          0.935   17400     18400  19500
 2     2 2.44e-01  1.60000          0.927   14100     15300  16900
 3     3 1.74e-01  1.08000          0.921   12200     13300  15300
 4     4 1.24e-01  0.80000          0.920   11000     12000  14100
 5     5 9.11e-02  0.62800          0.920   10000     10900  13200
 6     6 6.64e-02  0.50100          0.919    9320     10100  12500
 7     7 4.64e-02  0.39800          0.918    8730      9450  11900
 8     8 3.45e-02  0.32800          0.915    8250      8890  11400
 9     9 2.45e-02  0.26800          0.918    7840      8420  11000
10    10 1.71e-02  0.21900          0.919    7490      8000  10600
11    12 7.63e-03  0.14000          0.914    6900      7330   9900
12    14 2.77e-03  0.08140          0.919    6440      6790   9360
13    16 4.23e-04  0.03090          0.913    6060      6350   8900
14    18 3.52e-06 -0.00277          0.914    5740      5980   8500
15    20 6.56e-04 -0.03730          0.916    5470      5670   8160
Am I just being too ambitious trying to use WGCNA with my number of samples, or am I missing some kind of data processing that I should include for my analysis?
Many thanks
Dan
Maybe you could just try standard clustering methods instead. With such a low number of samples, they would probably achieve what you are trying to do with WGCNA.
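As a rough illustration of the clustering suggestion, here is a minimal base-R sketch that groups genes by hierarchical clustering on a correlation-based distance. Everything in it is an assumption for the example: the simulated matrix `expr`, pooling the WT and experimental samples into one table, the average linkage, and the choice of `k = 2` clusters.

```r
set.seed(42)

# Hypothetical log-scale expression: 6 samples (3 WT, 3 experimental) x 20 genes.
expr <- matrix(
  rnorm(6 * 20),
  nrow = 6,
  dimnames = list(c(paste0("WT", 1:3), paste0("Exp", 1:3)), paste0("gene", 1:20))
)

# Simulate a shift of the first 10 genes in the experimental samples.
expr[4:6, 1:10] <- expr[4:6, 1:10] + 3

# Gene-to-gene distance: 1 minus the Pearson correlation across samples.
gene_dist <- as.dist(1 - cor(expr))

# Average-linkage hierarchical clustering of the genes.
tree <- hclust(gene_dist, method = "average")

# Cut the tree into an arbitrary number of clusters (k is a free choice here).
clusters <- cutree(tree, k = 2)

table(clusters)
```

Calling `plot(tree)` draws the dendrogram, which can help when choosing where to cut. Correlations estimated from so few samples are noisy, so the resulting groups are best treated as exploratory.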