I am getting a "list" error when I try to input my values in WGCNA. I have FPKM values in my expression matrix, some of which are negative (with a negative sign in front). I am unsure how to resolve this issue. Any help is appreciated.
A negative FPKM value makes no sense, so there is something wrong with your data processing steps prior to WGCNA. Perhaps you have [erroneously] tried to run ComBat on your FPKM data to correct for one or more batch effects that you perceive to exist in your data?
Kevin
Sorry, just to clarify: these are post-normalization values. Do I take it that WGCNA cannot handle negative values?
It can handle negative values, but, what I was implying was that there is perhaps something else 'wrong' with your data based on the fact that a negative FPKM expression value makes no sense, unless these are logged FPKM expression levels.
The error itself also alludes to a data structure issue. Can you show the code that produces that error, and also the output of the str() function run on your input expression matrix?
Yes, I see what you mean. Here is what str() gave:
'data.frame': 2612 obs. of 3228 variables:
$ V1 : Factor w/ 2445 levels " 0.000302164",..: 2445 2257 2068 2183 2204 1494 148 1261 290 2166 ...
..- attr(*, "names")= chr "X" "X10010J" "X10025W" "X10052Z" ...
$ V2 : Factor w/ 1765 levels " 0.002960904",..: 1765 750 275 1605 1298 1081 1241 455 620 1241 ...
(However, even the default WGCNA data has MMT00000044, so I'm not sure what is happening with my data.)
Here is a photo of what my expression matrix actually looks like-
Well, there may be the problem. Your data frame is encoded categorically, i.e., as factors. You need to convert it to a data matrix or to keep it as a data frame but with everything encoded numerically.
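As a minimal sketch of the numeric re-encoding described above (the toy data frame and its values are made up for illustration, not taken from the poster's data), each factor column can be passed through as.character() before as.numeric():

```r
# Toy data frame whose columns were read in as factors (values are made up).
df <- data.frame(
  V1 = factor(c("0.5", "-0.482", "1.2")),
  V2 = factor(c("2.1", "0.003", "-0.188"))
)

# as.numeric() on a factor returns the internal level codes, not the values,
# so convert to character first.
num <- as.data.frame(lapply(df, function(x) as.numeric(as.character(x))))

str(num)
```

Any entry that is not a valid number (for example a gene ID or a sample name) becomes NA with a warning, which is a useful hint that non-numeric content is still present.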
So I tried to convert it to a numeric data frame, but this is what I got:
datExpr0 = as.data.frame(t(femData))
datExpr0 = data.matrix(datExpr0)
There were 50 or more warnings (use warnings() to see the first 50)
str(datExpr0)
num [1:2612, 1:3228] NA -0.482 -0.188 -0.371 -0.401 ...
- attr(*, "dimnames")=List of 2
..$ : chr [1:2612] "X" "X10010J" "X10025W" "X10052Z" ...
..$ : chr [1:3228] "V1" "V2" "V3" "V4" ...
Also, how do I get rid of the V1, V2 that R automatically seems to insert when making a data frame?
Evidently, your object, femData, contains data that is non-numerical. You need to remove these.
Can you please confirm that you have first completed the WGCNA tutorial? Which part of the tutorial is this?
Yes, I have completed the tutorial. This is the part I'm having trouble with:
datExpr0 = as.data.frame(t(femData[, -c(1:8)]));
names(datExpr0) = femData$substanceBXH;
rownames(datExpr0) = names(femData)[-c(1:8)];
I see, but, if you look at the tutorial code, columns 1 to 8 are being removed via -c(1:8). These are likely non-numerical columns.
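To illustrate why the annotation columns have to go before transposing, here is a small sketch on a hypothetical table with a single ID column (femToy, geneID, and all values are invented for the example; the real data may have more than one non-numeric column):

```r
# Hypothetical table: one ID column followed by one column per sample.
femToy <- data.frame(
  geneID  = c("MMT00000044", "MMT00000046", "MMT00000051"),
  X10010J = c(-0.482, 0.12, 1.05),
  X10025W = c(-0.188, 0.33, 0.97),
  stringsAsFactors = FALSE
)

# Drop the non-numeric ID column before transposing, so t() yields a numeric matrix.
datToy <- as.data.frame(t(femToy[, -1]))
names(datToy) <- femToy$geneID          # genes become column names
rownames(datToy) <- names(femToy)[-1]   # samples become row names

str(datToy)
```

If the ID column is left in, t() has to coerce everything to character, and the transposed data frame gets the default V1, V2, ... column names with the gene IDs sitting in the first row as data.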
Yes, the issue appears to be the headers (V1, V2, ...) that get added when you make a data frame: they push the gene IDs into the data frame itself as a non-numeric component. I was just trying out different ways to do this, but it appears the tutorial approach is the only/best way to avoid this issue.
I have a pheno/trait file with only the fields I need, but when I build datTraits I still get NA values in my table. Could this be a similar issue to the one above?
traitData = read.csv("pheno_tmm_lc_cbc_subset_freeze3_reqd_fields.csv");
dim(traitData)
names(traitData)
# Remove columns that hold information we do not need.
allTraits = traitData;
allTraits = allTraits[,];
dim(allTraits)
names(allTraits)
# Form a data frame analogous to expression data that will hold the clinical traits.
femaleSamples = rownames(datExpr0);
traitRows = match(femaleSamples, allTraits$sid);
datTraits = allTraits[traitRows, -1];
rownames(datTraits) = allTraits[traitRows, 1];
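One possible cause, offered as an assumption because the trait file itself is not shown: match() returns NA for every sample name it cannot find in allTraits$sid, and indexing with NA yields all-NA rows. The row names above carry an "X" prefix ("X10010J"), which R adds to column names that start with a digit, so they may not match the raw IDs in the trait file. A sketch with made-up IDs and values (exprSamples, toyTraits, and age are hypothetical):

```r
# Hypothetical sample names as they appear in the expression data
# (R prepends "X" to column names that start with a digit).
exprSamples <- c("X10010J", "X10025W", "X10052Z")

# Hypothetical trait table keyed by the original IDs.
toyTraits <- data.frame(
  sid = c("10025W", "10010J", "10052Z"),
  age = c(61, 58, 70),
  stringsAsFactors = FALSE
)

# No name matches, so every index is NA and the indexed rows would be all NA.
match(exprSamples, toyTraits$sid)

# Strip the leading "X" before matching.
traitRows <- match(sub("^X", "", exprSamples), toyTraits$sid)
toyDatTraits <- toyTraits[traitRows, -1, drop = FALSE]
rownames(toyDatTraits) <- exprSamples
toyDatTraits
```

Running table(is.na(traitRows)) on the real data would show how many samples fail to match. Reading the expression file with check.names = FALSE in read.csv() is another way to keep the original sample names, if that is indeed where the prefix comes from.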