Question

Getting error when using lmfit (LIMMA) for Microarray data analysis

0

Entering edit mode

3.6 years ago

biotrekker ▴ 100

This is a code i am running:

fit <- lmFit(eset,design)

This is the error I am getting

Error in rowMeans(y$exprs, na.rm = TRUE) : 'x' must be an array of at least two dimensions

This is some relevant information:

> dim(eset)

Features Samples

54675 161

> dim(design)

[1] 161 12

> typeof(eset)

[1] "S4"

> typeof(design)

[1] "double"

Thanks!

R Microarray LIMMA • 2.6k views

ADD COMMENT • link 3.6 years ago by biotrekker ▴ 100

0

Entering edit mode

What is the output of:

str(eset)
eset
dim(exprs(eset))

ADD REPLY • link 3.6 years ago by Kevin Blighe 87k

0

Entering edit mode

    > str(eset)
Formal class 'ExpressionSet' [package "Biobase"] with 7 slots
  ..@ experimentData   :Formal class 'MIAME' [package "Biobase"] with 13 slots
  .. .. ..@ name             : chr ""
  .. .. ..@ lab              : chr ""
  .. .. ..@ contact          : chr ""
  .. .. ..@ title            : chr ""
  .. .. ..@ abstract         : chr ""
  .. .. ..@ url              : chr ""
  .. .. ..@ pubMedIds        : chr ""
  .. .. ..@ samples          : list()
  .. .. ..@ hybridizations   : list()
  .. .. ..@ normControls     : list()
  .. .. ..@ preprocessing    : list()
  .. .. ..@ other            : list()
  .. .. ..@ .__classVersion__:Formal class 'Versions' [package "Biobase"] with 1 slot
  .. .. .. .. ..@ .Data:List of 2
  .. .. .. .. .. ..$ : int [1:3] 1 0 0
  .. .. .. .. .. ..$ : int [1:3] 1 1 0
  ..@ assayData        :<environment: 0x7fadf5527510> 
  ..@ phenoData        :Formal class 'AnnotatedDataFrame' [package "Biobase"] with 4 slots
  .. .. ..@ varMetadata      :'data.frame': 1 obs. of  2 variables:
  .. .. .. ..$ labelDescription: chr "Index"
  .. .. .. ..$ channel         : Factor w/ 2 levels "exprs","_ALL_": 2
  .. .. ..@ data             :'data.frame': 161 obs. of  1 variable:
  .. .. .. ..$ index: int [1:161] 1 2 3 4 5 6 7 8 9 10 ...
  .. .. ..@ dimLabels        : chr [1:2] "rowNames" "columnNames"
  .. .. ..@ .__classVersion__:Formal class 'Versions' [package "Biobase"] with 1 slot
  .. .. .. .. ..@ .Data:List of 1
  .. .. .. .. .. ..$ : int [1:3] 1 1 0
  ..@ featureData      :Formal class 'AnnotatedDataFrame' [package "Biobase"] with 4 slots
  .. .. ..@ varMetadata      :'data.frame': 1 obs. of  1 variable:
  .. .. .. ..$ labelDescription: chr NA
  .. .. ..@ data             :'data.frame': 54675 obs. of  1 variable:
  .. .. .. ..$ Symbol: chr [1:54675] NA "RFC2" "HSPA6" "PAX8" ...
  .. .. ..@ dimLabels        : chr [1:2] "featureNames" "featureColumns"
  .. .. ..@ .__classVersion__:Formal class 'Versions' [package "Biobase"] with 1 slot
  .. .. .. .. ..@ .Data:List of 1
  .. .. .. .. .. ..$ : int [1:3] 1 1 0
  ..@ annotation       : chr "pd.hg.u133.plus.2"
  ..@ protocolData     :Formal class 'AnnotatedDataFrame' [package "Biobase"] with 4 slots
  .. .. ..@ varMetadata      :'data.frame': 2 obs. of  2 variables:
  .. .. .. ..$ labelDescription: chr [1:2] "Names of files used in 'exprs'" "Run dates for files used in 'exprs'"
  .. .. .. ..$ channel         : Factor w/ 2 levels "exprs","_ALL_": 2 2
  .. .. ..@ data             :'data.frame': 161 obs. of  2 variables:
  .. .. .. ..$ exprs: chr [1:161] "/Users/ashaypatel/Desktop/Research/GSE5281/GSM119615.CEL" "/Users/ashaypatel/Desktop/Research/GSE5281/GSM119616.CEL" "/Users/ashaypatel/Desktop/Research/GSE5281/GSM119617.CEL" "/Users/ashaypatel/Desktop/Research/GSE5281/GSM119618.CEL" ...
  .. .. .. ..$ dates: chr [1:161] "01/25/06 15:55:07" "07/28/04 11:26:10" "07/28/04 12:53:35" "07/28/04 12:16:03" ...
  .. .. ..@ dimLabels        : chr [1:2] "rowNames" "columnNames"
  .. .. ..@ .__classVersion__:Formal class 'Versions' [package "Biobase"] with 1 slot
  .. .. .. .. ..@ .Data:List of 1
  .. .. .. .. .. ..$ : int [1:3] 1 1 0
  ..@ .__classVersion__:Formal class 'Versions' [package "Biobase"] with 1 slot
  .. .. ..@ .Data:List of 4
  .. .. .. ..$ : int [1:3] 4 0 0
  .. .. .. ..$ : int [1:3] 2 48 0
  .. .. .. ..$ : int [1:3] 1 3 0
  .. .. .. ..$ : int [1:3] 1 0 0

    > eset
ExpressionSet (storageMode: lockedEnvironment)
assayData: 54675 features, 161 samples 
  element names: exprs 
protocolData
  rowNames: GSM119615.CEL GSM119616.CEL ... GSM238963.CEL (161 total)
  varLabels: exprs dates
  varMetadata: labelDescription channel
phenoData
  rowNames: GSM119615.CEL GSM119616.CEL ... GSM238963.CEL (161 total)
  varLabels: index
  varMetadata: labelDescription channel
featureData
  featureNames: 1007_s_at 1053_at ... AFFX-TrpnX-M_at (54675 total)
  fvarLabels: Symbol
  fvarMetadata: labelDescription
experimentData: use 'experimentData(object)'
Annotation: pd.hg.u133.plus.2 

    > dim(exprs(eset))
[1] 54675   161

ADD REPLY • link updated 3.6 years ago by Kevin Blighe 87k • written 3.6 years ago by biotrekker ▴ 100

0

Entering edit mode

Thank you. So, it is an ExpressionSet. How was it produced (show all code, please)? How was design produced?

ADD REPLY • link 3.6 years ago by Kevin Blighe 87k

0

Entering edit mode

CELFiles <- list.files(celpath,full.names=TRUE, pattern = "CEL")

RawData <- read.celfiles(CELFiles)

sns <- sampleNames(RawData)

PreProcessedData<- oligo::rma(RawData)

exprs(PreProcessedData)[1:10,1:3]

# Move the normalized expression data
PreProcessedData_exprs <- exprs(PreProcessedData)

# Make the normalized expression data a data frame
PreProcessedData_dataframe <- data.frame(as.ffdf(PreProcessedData_exprs))
write.table(PreProcessedData_dataframe, file="RMA_Preprocessed_Expression_Table.txt", sep="\t")




condition <- factor(samples$Condition, levels=c("Control","AD"))

brainregion <- factor(samples$BrainRegion, levels=c("EC","HC","MTG","PC","SFG","VCX"))

CS <- paste(samples$Condition, samples$BrainRegion, sep=".")
CS
CS <- factor(CS, levels=c("Control.EC","AD.EC",
                          "Control.HC","AD.HC",
                          "Control.MTG","AD.MTG",
                          "Control.PC","AD.PC",
                          "Control.SFG","AD.SFG",
                          "Control.VCX","AD.VCX"))

design <- model.matrix(~CS+0)

colnames(design) <- levels(CS)

contr.matrix <- makeContrasts(
  EC_ADvControl=AD.EC-Control.EC,
  HC_ADvControl=AD.HC-Control.HC,
  MTG_ADvControl=AD.MTG-Control.MTG,
  PC_ADvControl=AD.PC-Control.PC,
  SFG_ADvControl=AD.SFG-Control.SFG,
  VCX_ADvControl=AD.VCX-Control.VCX,
  levels=design)


contr.matrix
# Load the package for probe to transcript or gene annotation
library(annotate)


# Rename your rma normalized processed expression data as eset for differential expression analysis
eset <- PreProcessedData
# Extract the probe IDs from eset, which has all the normalized expression data
ID <- featureNames(eset)
# Apply the probe to gene symbol ID for each probe in eset
SYMBOL <- getSYMBOL(ID,"hgu133plus2.db")
NO_NA_PreProcessedData_dataframe<-PreProcessedData_dataframe[!is.na(SYMBOL),]
# Add the gene symbols back into your eset data fram
fData(eset) <- data.frame(Symbol=SYMBOL)
# fit the linear model
fit <- lmFit(eset,design)
# apply your contrasts

ADD REPLY • link updated 3.6 years ago by Kevin Blighe 87k • written 3.6 years ago by biotrekker ▴ 100

0

Entering edit mode

Thanks, I think that it relates to the annotation part. Can you confirm that this works:?

eset <- PreProcessedData
fit <- lmFit(eset, design)

?

This particular command is likely messing up your ExpressionSet object:

fData(eset) <- data.frame(Symbol=SYMBOL)

ADD REPLY • link 3.6 years ago by Kevin Blighe 87k

0

Entering edit mode

This works:

eset <- PreProcessedData

This does not work:

fit <- lmFit(eset,design)

I tried getting rid of:

fData(eset) <- data.frame(Symbol=SYMBOL)

does not seem to change anything

ADD REPLY • link 3.6 years ago by biotrekker ▴ 100

0

Entering edit mode

Are you sure? If you literally do this, it produces the same error?

eset <- oligo::rma(RawData)
limma::lmFit(eset, design)

If so, then the error is with your design object. Please double check it.

Also output sessionInfo()

ADD REPLY • link 3.6 years ago by Kevin Blighe 87k

0

Entering edit mode

I get the same exact error even with the code you just posted.

    > sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.6

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] annotate_1.66.0          XML_3.99-0.5             pd.hg.u133.plus.2_3.12.0
 [4] DBI_1.1.0                RSQLite_2.2.0            hgu133plus2.db_3.2.3    
 [7] org.Hs.eg.db_3.11.4      AnnotationDbi_1.50.3     limma_3.44.3            
[10] oligo_1.52.1             Biostrings_2.56.0        XVector_0.28.0          
[13] IRanges_2.22.2           S4Vectors_0.26.1         ff_4.0.2                
[16] bit_4.0.4                oligoClasses_1.50.4      Biobase_2.48.0          
[19] BiocGenerics_0.34.0     

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.5                  compiler_4.0.2              BiocManager_1.30.10        
 [4] GenomeInfoDb_1.24.2         bitops_1.0-6                iterators_1.0.12           
 [7] tools_4.0.2                 zlibbioc_1.34.0             digest_0.6.25              
[10] memoise_1.1.0               preprocessCore_1.50.0       lattice_0.20-41            
[13] pkgconfig_2.0.3             rlang_0.4.7                 Matrix_1.2-18              
[16] foreach_1.5.0               DelayedArray_0.14.1         rstudioapi_0.11            
[19] GenomeInfoDbData_1.2.3      affxparser_1.60.0           vctrs_0.3.2                
[22] bit64_4.0.2                 grid_4.0.2                  blob_1.2.1                 
[25] codetools_0.2-16            matrixStats_0.56.0          GenomicRanges_1.40.0       
[28] splines_4.0.2               SummarizedExperiment_1.18.2 xtable_1.8-4               
[31] RCurl_1.98-1.2              crayon_1.3.4                affyio_1.58.0

What do you think is wrong my design object, the dimensions seem to be fine and my contrast matrix correctly identifies comparisons?

ADD REPLY • link updated 3.6 years ago by Kevin Blighe 87k • written 3.6 years ago by biotrekker ▴ 100

0

Entering edit mode

Okay, how about:

eset <- oligo::rma(RawData)
exprs(eset)[1:5,1:5]
head(apply(exprs(eset), 1, mean))

ADD REPLY • link 3.6 years ago by Kevin Blighe 87k

0

Entering edit mode

    > eset <- oligo::rma(RawData)
Background correcting... OK
Normalizing... OK
Summarizing... OK

    > exprs(eset)[1:5,1:5]
opening ff /Users/ashaypatel/rma-29241e485da3.ff
          GSM119615.CEL GSM119616.CEL GSM119617.CEL GSM119618.CEL GSM119619.CEL
1007_s_at      8.243370      8.311776      8.841415      8.878884      8.481599
1053_at        3.246700      3.127588      3.070634      3.008858      3.196024
117_at         3.471962      3.412241      3.495739      3.988356      3.222250
121_at         5.869241      6.357483      6.971408      6.888627      6.015305
1255_g_at      3.088038      3.837203      3.167099      2.796989      5.112739

    > head(apply(exprs(eset), 1, mean))
Error in h(simpleError(msg, call)) : 
  error in evaluating the argument 'x' in selecting a method for function 'head': invalid first argument, must be an array

ADD REPLY • link updated 3.6 years ago by Kevin Blighe 87k • written 3.6 years ago by biotrekker ▴ 100

0

Entering edit mode

The head command should really have worked. I am getting the feeling that this is an issue with Mac OS.

Does this not even work:

fit <- lmFit(t(exprs(eset)),design)

ADD REPLY • link 3.6 years ago by Kevin Blighe 87k

0

Entering edit mode

Sorry, because my account is new, it woudn't let me post more. No this does not work. I did recently update my mac os to catalina 10.15.6

ADD REPLY • link 3.6 years ago by biotrekker ▴ 100

0

Entering edit mode

At this point, you will have to provide some test data so that I can test it here. This can simply mean pasting some data here and then showing how the error is reproduced upon using this test data

ADD REPLY • link 3.6 years ago by Kevin Blighe 87k

0

Entering edit mode

So I fixed that error. It seems to be one of data type which is strange. If I use PreProcessedDataframe instead of PreProcessedData is works. However now I need help with annotation. It won't let me use fdata to annotate

here is my code: ```

 #Get Raw Data from cell files
setwd("/Users/ashaypatel/Desktop/Research/GSE5281") #Set Working directory
celpath<-"~/Desktop/Research/GSE5281" #Set path to cel files
phenodata<-read.table("~/Desktop/Research/GSE5281/phenodata.txt", header=TRUE) #returns phenodata as dataframe
CELFiles <- list.files(celpath,full.names=TRUE, pattern = "CEL")
RawData <- read.celfiles(CELFiles)

#Pre-process data with RMA (Robust Multichip Average Algorithm) from the Oligo Package
PreProcessedData<- oligo::rma(RawData)
annotation(PreProcessedData)<-"hgu133plus2.db"
PreProcessedData_exprs <- exprs(PreProcessedData) #Extract the expression data, ff datatype
PreProcessedData_dataframe <- data.frame(as.ffdf(PreProcessedData_exprs)) #convert from ff -> dataframe

#Contrast Matrix

condition <- factor(phenodata$Condition, levels=c("Control","AD"))
brainregion <- factor(phenodata$BrainRegion, levels=c("EC","HC","MTG","PC","SFG","VCX"))
CS <- paste(phenodata$Condition, phenodata$BrainRegion, sep=".")

CS <- factor(CS, levels=c("Control.EC","AD.EC",
                          "Control.HC","AD.HC",
                          "Control.MTG","AD.MTG",
                          "Control.PC","AD.PC",
                          "Control.SFG","AD.SFG",
                          "Control.VCX","AD.VCX"))
design <- model.matrix(~0+CS)
design
colnames(design) <- levels(CS)
contr.matrix <- makeContrasts(
  EC_ADvControl=AD.EC-Control.EC,
  HC_ADvControl=AD.HC-Control.HC,
  MTG_ADvControl=AD.MTG-Control.MTG,
  PC_ADvControl=AD.PC-Control.PC,
  SFG_ADvControl=AD.SFG-Control.SFG,
  VCX_ADvControl=AD.VCX-Control.VCX,
  levels=design)
contr.matrix


#
eset <- PreProcessedData_dataframe
fit <- limma::lmFit(eset, design) #apply lmfit
cfit <- contrasts.fit(fit, contrasts=contr.matrix) #apply contrasts
efit <- limma::eBayes(cfit) #apply ebayes
summary(decideTests(efit))

tfit <- limma::treat(efit,lfc=log2(1.15))
dt <- decideTests(tfit)
summary(dt)

tfit<-data.frame(tfit)
write.csv(tfit,"/Users/ashaypatel/Desktop/Research/GSE5281/DEG_tfit.csv",row.names=TRUE)

```

Sorry I don't know how to insert block code!

Also what I mean is that I have the data in terms of affymetrix probes and not genes, how do I get genes. I know my .db package but the method I used to use using fdata does not work, it relies on PreProcessedData (an expression set) rather than PreProcessedData_dataframe.

Also the documentation says that lmfit should be able to take an expression set so its strange.

Thank you.

ADD REPLY • link 3.6 years ago by biotrekker ▴ 100

1

Entering edit mode

This is the first time that you indicate that this is public GEO data. In that case, please see the solution by my colleague;

ADD REPLY • link 3.6 years ago by Kevin Blighe 87k

1

Entering edit mode

3.6 years ago

ATpoint 81k

Not sure what is happening on your end. I downloaded the data from https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE5281 (at the bottom, the entire GSE5281_RAW.tar), unpacked it and then ran this code (conditions are just random because I have no clue what this experiment is about). I am not sure what some of your code lines were doing so I suggest you simply stick with this below, which is also almost 100% what the limma guide suggests:

library(oligo)
library(limma)

CELFiles <- list.files("~/Downloads/GSE5281_RAW/",full.names=TRUE, pattern = "CEL.gz")

RawData <- oligo::read.celfiles(CELFiles)

PreProcessedData<- oligo::rma(RawData)

# just some dummy condition:
condition <- data.frame(Group=c(rep("control", 100), rep("treatment", 61)))

# model.matrix:
design <- model.matrix(~Group, data = condition)

# fit the linear model
fit <- lmFit(PreProcessedData,design)

# empirical Bayes
eB <- eBayes(fit)

# results
topTable(eB, coef = 1)

ADD COMMENT • link 3.6 years ago by ATpoint 81k

0

Entering edit mode

Do you use Mac OS, because even when i run this lmfit is not working. Could this be a Mac issue. Im going to try to download the data again.

EDIT: I still get the same error

setwd("/Users/ashaypatel/Desktop/Research/GSE5281") #Set Working directory
celpath<-"~/Desktop/Research/GSE5281" #Set path to cel files
CELFiles <- list.files(celpath,full.names=TRUE, pattern = "CEL")
RawData <- oligo::read.celfiles(CELFiles)
PreProcessedData<- oligo::rma(RawData)


condition <- data.frame(Group=c(rep("control", 100), rep("treatment", 61)))
design <- model.matrix(~Group, data = condition)
fit <- lmFit(PreProcessedData,design)

I get the error:

> fit <- lmFit(PreProcessedData,design)

Error in rowMeans(y$exprs, na.rm = TRUE) : 'x' must be an array of at least two dimensions

ADD REPLY • link updated 3.6 years ago by ATpoint 81k • written 3.6 years ago by biotrekker ▴ 100

0

Entering edit mode

Same error!?- that is bizarre. You have tried this in a fresh R session? Also avoid RStudio if you can - I never use it for analyses.

I presume that you can obtain the data like this:

library(Biobase)
library(GEOquery)

# load series and platform data from GEO
gset <- getGEO("GSE5281", GSEMatrix =TRUE, getGPL=FALSE)
if (length(gset) > 1) idx <- grep("GPL570", attr(gset, "names")) else idx <- 1
gset <- gset[[idx]]

ADD REPLY • link 3.6 years ago by Kevin Blighe 87k

0

Entering edit mode

What do you use instead of RStudio?, I did try a fresh r ression. I'll try restarting my computer

ADD REPLY • link 3.6 years ago by biotrekker ▴ 100

0

Entering edit mode

RStudio is totally fine, so is Mac. There might be a few situations where RStudio might be undesirable, e.g. in the context of running forked processes, but this is not relevant here.

Yes, the example I showed was done on a Mac (10.14, R 4.0.2), there is something odd going on on your end. I suggest you reinstall the relevant packages, redownload the data and try again.

ADD REPLY • link 3.6 years ago by ATpoint 81k

0

Entering edit mode

I use the standard R environment, on Ubuntu 16.04.

ADD REPLY • link 3.6 years ago by Kevin Blighe 87k

score 1 · Accepted Answer · 2020-09-09

1

Entering edit mode

3.6 years ago

biotrekker ▴ 100

This is solved thank you for your help. I'm sorry I took up so much of both your time. It was as simple as reinstalling the limma package.

Thank you so much for your help, its been essential!

ADD COMMENT • link 3.6 years ago by biotrekker ▴ 100

1

Entering edit mode

Don't worry, I was already suspecting something like this. Happens to all of us sometimes :)

ADD REPLY • link 3.6 years ago by ATpoint 81k