Question: Tale as old as time: lmFit dimension issues
leonmcswain10 wrote:

I have seen some other posts on here, but after going through them I'm not finding a solution. I am using limma to do DE analysis on 763 patient microarrays in 4 groups. My expression object is a matrix with genes as row names and patient IDs as column names.

When I try to run the code I get the error:

> fit <- lmFit(as.numeric(Merge_DF_Avg2), design)
Error in lmFit(as.numeric(Merge_DF_Avg2), design) : 
  row dimension of design doesn't match column dimension of data object

The dimensions seem correct:

> dim(design)
[1] 763   4

> dim(Merge_DF_Avg2)
[1] 20341   763

> class(Merge_DF_Avg2)
[1] "matrix" "array"

Here is the code:

# Data wrangle for limma
> Merge_DF_Avg <- readRDS("C:/Users/12298/Desktop/Data_Analytics/Taylor_2017/Finalized_Expression_Array_avgs.rds")
> Merge_DF_Avg2 <- Merge_DF_Avg %>% na.omit() %>% pivot_wider(values_from = avg, names_from = Gene_Name) %>% t() %>% janitor::row_to_names(row_number = 1)
> Patient_Cat <- as.vector(as.numeric(Merge_DF_Avg2[1, ]))
> Merge_DF_Avg2 <- Merge_DF_Avg2[-c(1, 2), ] # Taking out patient cat and unidentified gene rows

# limma design
> design <- model.matrix(~ 0 + factor(Patient_Cat))
> colnames(design) <- c("SHH", "Group3", "Group4", "WNT")
> fit <- lmFit(as.numeric(Merge_DF_Avg2), design)

I am using the example code provided by the limma package pdf to guide me.

modified 4 months ago by dariober11k • written 4 months ago by leonmcswain10

In general, please make it a habit in the future to provide the data using dput (at least a small chunk of it) to allow reproduction of the problem.

written 4 months ago by ATpoint46k

I already have an answer to the post, but I would like to know more about this. When I post, should I just copy and paste the output from dput? For this matrix it's a bit chaotic since there are 763 columns. The output gets cut off in the console because it's so long.

written 4 months ago by leonmcswain10

A small subset of the data is a good idea, so that one can actually run the code you provide. Yes, just copy and paste the dput output; one can then easily paste it back into R.
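As a rough sketch of what that could look like: the toy matrix below only stands in for the real data (the object name expr_chr, the gene and patient labels, and all values are made up for illustration). The idea is to take a corner of the matrix and dput that, so the output stays short enough to paste into a post.

```r
# Toy character matrix standing in for the real expression object.
# expr_chr, the labels, and the values are all hypothetical.
expr_chr <- matrix(
  c("1.5", "2.0", "3.25", "4.0", "5.5", "6.0", "7.5", "8.0"),
  nrow = 2,
  dimnames = list(c("geneA", "geneB"), c("P1", "P2", "P3", "P4"))
)

# Keep only a small corner so the dput output fits in a post.
small <- expr_chr[1:2, 1:3]
dput(small)

# Round trip: the text printed by dput() can be evaluated to rebuild the object.
txt <- paste(capture.output(dput(small)), collapse = "\n")
restored <- eval(parse(text = txt))
identical(restored, small)
```

Anyone reading the post can then assign the pasted structure(...) text to a variable and run the rest of the code against it.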

written 4 months ago by ATpoint46k
dariober11k (WCIP | Glasgow | UK) wrote:

I think as.numeric(Merge_DF_Avg2) in lmFit converts the character matrix to a plain vector, which drops its dimensions. To convert the numeric vector back to a matrix you could do (check it's ok!):

matrix(as.numeric(as.matrix(Merge_DF_Avg2)), ncol= ncol(Merge_DF_Avg2))
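A minimal sketch of what is going on, using a made-up character matrix in place of Merge_DF_Avg2 (the name expr_chr, the labels, and the values are assumptions for illustration only). Note that rebuilding with matrix(..., ncol = ...) alone drops the row and column names, so the sketch passes dimnames back in explicitly; changing the storage mode in place is an alternative that keeps all attributes.

```r
# Toy character matrix standing in for Merge_DF_Avg2 (hypothetical values).
expr_chr <- matrix(
  c("1.5", "2.0", "3.25", "4.0", "5.5", "6.0"),
  nrow = 2,
  dimnames = list(c("geneA", "geneB"), c("P1", "P2", "P3"))
)

# as.numeric() strips attributes, so the result is a plain vector with no dim.
flat <- as.numeric(expr_chr)
is.null(dim(flat))

# Rebuild the matrix, restoring the gene and patient names as well.
expr_num <- matrix(
  as.numeric(expr_chr),
  nrow = nrow(expr_chr),
  dimnames = dimnames(expr_chr)
)

# Alternative: convert the storage mode in place, which keeps dim and dimnames.
expr_num2 <- expr_chr
storage.mode(expr_num2) <- "double"

identical(expr_num, expr_num2)
```

Either numeric matrix has one column per patient, which is what lmFit compares against the rows of the design matrix.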
written 4 months ago by dariober11k

This worked perfectly! Thank you! You were also correct about as.vector: it doesn't matter whether I include it, because that vector goes into the design matrix, not directly into lmFit. -Leon

written 4 months ago by leonmcswain10
Gordon Smyth2.3k (Australia) wrote:

Edit

I'm rewriting my answer from yesterday because I've had a closer look at your data wrangling code. Initially, I was unclear why you would not just run lmFit in the usual way with:

fit <- lmFit(Merge_DF_Avg2, design)

On closer inspection, I see now that the values in your Merge_DF_Avg2 object are almost certainly stored as character strings (a character matrix, going by your class() output), and that presumably is why you're running as.numeric. My original answer was to point out that as.numeric returns a dimensionless vector, which obviously causes the dimension error.

If it were me, I would revisit the earlier code that stores and wrangles the expression data, so that the conversion to character doesn't occur in the first place. That should be easy to avoid, for example by not making the column names the first row of the data.frame. Character strings are a really poor way to store numeric expression values. But it's up to you. The code from dariober will work, but to me it's repairing a problem that shouldn't have been introduced in the first place.
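One way this could look, as a sketch only: the layout of the original RDS file isn't shown in the thread, so the long-format table below is an assumption. Gene_Name and avg are taken from the posted code; Patient_ID and Patient_Cat are hypothetical column names, and all values are made up. The point is that the group labels are kept apart from the expression values, so nothing is ever coerced to character.

```r
# Hypothetical long-format input: one row per (patient, gene) pair.
# Patient_ID and Patient_Cat are assumed names; values are invented.
long <- data.frame(
  Patient_ID  = rep(c("P1", "P2", "P3", "P4"), each = 3),
  Patient_Cat = rep(c(1, 1, 2, 2), each = 3),
  Gene_Name   = rep(c("geneA", "geneB", "geneC"), times = 4),
  avg         = c(5.1, 7.2, 6.3, 5.4, 7.0, 6.1, 8.2, 7.1, 6.0, 8.5, 7.3, 6.2)
)

# Genes in rows, patients in columns; the values stay numeric throughout.
expr <- tapply(long$avg, list(long$Gene_Name, long$Patient_ID), mean)

# One group label per patient, in the same order as the columns of expr.
pheno <- unique(long[, c("Patient_ID", "Patient_Cat")])
group <- factor(pheno$Patient_Cat[match(colnames(expr), pheno$Patient_ID)])

design <- model.matrix(~ 0 + group)

# The check lmFit performs: one design row per expression column.
stopifnot(is.numeric(expr), nrow(design) == ncol(expr))

# Fit only if limma is installed; no as.numeric() is needed at this point.
if (requireNamespace("limma", quietly = TRUE)) {
  fit <- limma::lmFit(expr, design)
}
```

With the real data, the same shape could equally be reached with pivot_wider, as long as the patient category column is pulled out before transposing rather than carried along as a row of the matrix.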

modified 4 months ago • written 4 months ago by Gordon Smyth2.3k

I think the as.vector bit works as it should. The problem should be with as.numeric(Merge_DF_Avg2)

written 4 months ago by dariober11k

dariober was correct about this.

written 4 months ago by leonmcswain10

OK, my original answer was confusing because I wrote as.vector where I meant to write as.numeric. It was as.numeric(Merge_DF_Avg2) that I was referring to.

modified 4 months ago • written 4 months ago by Gordon Smyth2.3k

I am a cancer biologist by training, so I still have a bit of work to do on R basics (that is, on not introducing these issues to begin with).

written 4 months ago by leonmcswain10