Question: WGCNA negative values
reara wrote:

I am getting a "list" error when I try to input my values into WGCNA. My expression matrix contains FPKM values, some of which are negative (with a minus sign in front). I am unsure how to resolve this issue. Any help is appreciated.

modified 6 weeks ago by Kevin Blighe • written 6 weeks ago by reara
Kevin Blighe wrote:

A negative FPKM value makes no sense, so, there is something wrong with your data processing steps prior to WGCNA. Perhaps you have [erroneously] tried to run ComBat on your FPKM data to correct for one or more batch effects that you perceive exist(s) in your data?

Kevin

modified 6 weeks ago • written 6 weeks ago by Kevin Blighe

Sorry, just to clarify: these are post-normalization values. Do I take it that WGCNA cannot handle negative values?

written 6 weeks ago by reara

It can handle negative values, but, what I was implying was that there is perhaps something else 'wrong' with your data based on the fact that a negative FPKM expression value makes no sense, unless these are logged FPKM expression levels.

The error itself also alludes to a data structure issue. Can you show your code that produces that error, and also the output of the str() function run on your input expression matrix?

written 6 weeks ago by Kevin Blighe
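
As a side note on the point about logged values, here is a minimal sketch of why log-transformed FPKM values can be negative even though raw FPKM values cannot. The gene names and numbers are made up for illustration, and nothing in the thread confirms that the poster's data were actually log-transformed.

```r
# Hypothetical FPKM values (not taken from the thread).
# Raw FPKM is never negative, but any value below 1 has a negative log2.
fpkm <- c(geneA = 0.25, geneB = 1, geneC = 8)

log_fpkm <- log2(fpkm)
print(log_fpkm)

# A common variant adds a pseudocount of 1 before logging, which keeps
# the transformed values non-negative and keeps zeros finite.
log_fpkm_pc <- log2(fpkm + 1)
print(log_fpkm_pc)

# str() reports the type and shape of an object, which is the quickest
# way to see whether it is numeric, character, or factor.
str(log_fpkm)
```

If the values were logged without a pseudocount, negative entries are expected and are not a problem in themselves.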

Yes, I see what you mean. Here is what str() gave:

'data.frame':    2612 obs. of  3228 variables:
 $ V1   : Factor w/ 2445 levels " 0.000302164",..: 2445 2257 2068 2183 2204 1494 148 1261 290 2166 ...
  ..- attr(*, "names")= chr  "X" "X10010J" "X10025W" "X10052Z" ...
 $ V2   : Factor w/ 1765 levels " 0.002960904",..: 1765 750 275 1605 1298 1081 1241 455 620 1241 ...

(However, even the default WGCNA data has IDs such as MMT00000044, so I'm not sure what is happening with my data.)

Here is a photo of what my expression matrix actually looks like:

https://ibb.co/v3KdP2K

modified 6 weeks ago by Kevin Blighe • written 6 weeks ago by reara

Well, that may be the problem. Your data frame is encoded categorically, i.e., as factors. You need to either convert it to a numeric data matrix or keep it as a data frame with every column encoded numerically.

written 6 weeks ago by Kevin Blighe
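
For readers hitting the same problem, a minimal sketch of the factor-to-numeric conversion described above. The small data frame is a made-up stand-in for the poster's object, which is not available here. The key point is that calling as.numeric() directly on a factor returns its internal level codes, so the values have to go through as.character() first.

```r
# Hypothetical stand-in for a data frame whose columns were read as factors.
df <- data.frame(
  V1 = factor(c("0.000302164", "-0.482", "-0.188")),
  V2 = factor(c("0.002960904", "1.25", "-0.371"))
)

# Pitfall: this yields the integer level codes, not the original values.
codes <- as.numeric(df$V1)

# Convert each column via character so the printed values are recovered.
num_df <- as.data.frame(
  lapply(df, function(col) as.numeric(as.character(col)))
)

# A numeric matrix, if a matrix is preferred over a data frame.
expr_mat <- as.matrix(num_df)
str(expr_mat)
```

Any entry that is not a number (a gene ID or a header that ended up among the values, for example) becomes NA with a warning during this conversion, which is a useful signal that non-numeric content is still present.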

So I tried to convert it to a numeric data frame, but this is what I got:

datExpr0 = as.data.frame(t(femData))
datExpr0 = data.matrix(datExpr0)

There were 50 or more warnings (use warnings() to see the first 50)

str(datExpr0)

 num [1:2612, 1:3228] NA -0.482 -0.188 -0.371 -0.401 ...
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:2612] "X" "X10010J" "X10025W" "X10052Z" ...
  ..$ : chr [1:3228] "V1" "V2" "V3" "V4" ...

Also, how do I get rid of the V1, V2, ... that R automatically seems to insert when making a data frame?

modified 25 days ago • written 25 days ago by reara

Evidently, your object, femData, contains data that is non-numerical. You need to remove it.

Can you please confirm that you have first completed the WGCNA tutorial? Which part of the tutorial is this?

modified 25 days ago • written 25 days ago by Kevin Blighe

Yes, I have completed the tutorial. This is the part I'm having trouble with:

datExpr0 = as.data.frame(t(femData[, -c(1:8)]));

names(datExpr0) = femData$substanceBXH;

rownames(datExpr0) = names(femData)[-c(1:8)];

modified 25 days ago • written 25 days ago by reara

I see, but, if you look at the tutorial code, columns 1 to 8 are being removed via -c(1:8). These are likely non-numerical columns.

written 25 days ago by Kevin Blighe
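
One way to keep identifier columns out of the numeric data, sketched here under assumptions: the file layout (gene IDs in the first column, one column per sample) and the gene names are hypothetical and may not match the poster's file. Reading the ID column as row names means that transposing yields an all-numeric table whose columns are named by gene rather than V1, V2, and so on.

```r
# Hypothetical file contents: first column holds gene IDs, the rest are samples.
csv_lines <- c(
  "gene,10010J,10025W,10052Z",
  "geneA,-0.482,-0.188,-0.371",
  "geneB,0.120,0.305,-0.401"
)

# row.names = 1 turns the ID column into row names instead of data.
# check.names = FALSE stops R from prefixing "X" to names that start with a digit.
expr <- read.csv(text = paste(csv_lines, collapse = "\n"),
                 header = TRUE, row.names = 1, check.names = FALSE)

# Transpose so that rows are samples and columns are genes, as WGCNA expects.
datExpr0 <- as.data.frame(t(expr))
str(datExpr0)
```

This plays the same role as the tutorial's femData[, -c(1:8)] followed by the names() and rownames() assignments: the annotation is removed from the values before transposing and reattached as names afterwards.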

Yes, the issue appears to be the headers (V1, V2, ...) that get added when you make a data frame: they turn the gene IDs into a non-numeric part of the data frame itself. I was just trying out different ways to do this, but it appears that the tutorial approach is the only (or best) way to avoid this issue.

written 24 days ago by reara

I have a pheno/trait file with only the fields I need, but when I build datTraits I still get NA values in my table. Could this be a similar issue to the one above?

traitData = read.csv("pheno_tmm_lc_cbc_subset_freeze3_reqd_fields.csv");
dim(traitData)
names(traitData)

# Remove columns that hold information we do not need.
# (No columns are dropped here: the file already holds only the required fields.)
allTraits = traitData;
allTraits = allTraits[,];
dim(allTraits)
names(allTraits)

# Form a data frame analogous to expression data that will hold the clinical traits.
femaleSamples = rownames(datExpr0);
traitRows = match(femaleSamples, allTraits$sid);
datTraits = allTraits[traitRows, -1];
rownames(datTraits) = allTraits[traitRows, 1];

modified 23 days ago by Kevin Blighe • written 23 days ago by reara

Without seeing the input and output for each step, I am limited in what I can do. All I can suggest is to make sure that your input data has the same format as that used in the tutorial, which will also help to avoid issues in later steps of the tutorial.

written 23 days ago by Kevin Blighe
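
A minimal sketch of one possible cause of the NA rows, which cannot be confirmed from the thread: match() returns NA for every sample name that has no exact counterpart in the trait table. The sample IDs and the weight column below are made up, and the assumption that the expression row names carry an "X" prefix while the sid column does not is only a guess based on the str() output shown earlier.

```r
# Hypothetical sample names, as they might appear in rownames(datExpr0).
sample_ids <- c("X10010J", "X10025W", "X10052Z")

# Hypothetical trait table; the sid values lack the "X" prefix.
allTraits <- data.frame(
  sid    = c("10010J", "10025W", "10052Z"),
  weight = c(21.3, 24.8, 19.9)
)

# Every lookup fails, so indexing allTraits with this gives rows of NA.
traitRows_raw <- match(sample_ids, allTraits$sid)

# setdiff() lists the sample names that have no match in the trait table.
unmatched <- setdiff(sample_ids, allTraits$sid)
print(unmatched)

# Harmonise the names (here: strip a leading "X"), then match again.
clean_ids <- sub("^X", "", sample_ids)
traitRows <- match(clean_ids, allTraits$sid)

datTraits <- allTraits[traitRows, -1, drop = FALSE]
rownames(datTraits) <- as.character(allTraits$sid[traitRows])
print(datTraits)
```

If setdiff() returns nothing on the real data, the names already agree and the NA values come from somewhere else, such as missing entries in the trait file itself.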