Question: Batch Effect normalization with Combat generates error
0
gravatar for madkitty
3.9 years ago by
madkitty580
Canada
madkitty580 wrote:

I have 6 batches (10 samples in each) from 6 arrays that we want to normalize for batch effect with Combat. When I run the script below, I encounter an error, any idea how to fix this? 

I'm editing the current code with all modification and current error message that I get. 

> library(sva)

> dat = read.csv("Combat_matrix_input.csv");
> sif = read.csv("sif.csv");
> batch = sif$Batch

> head(batch)
[1] Batch1 Batch1 Batch1 Batch1 Batch1 Batch1
Levels: Batch1 Batch2 Batch3 Batch4 Batch5 Batch6

> modcombat = model.matrix(~1, data=dat)
> dat <- as.matrix(dat)
> combat_edata = ComBat(dat=dat, batch=batch, mod=modcombat, par.prior=TRUE, prior.plots=FALSE)
Found 6 batches
Error in cbind(batchmod, mod) : 
  number of rows of matrices must match (see arg 2)

# Here I'm trying with batch as.vector
> batch = as.vector(sif$Batch)
> combat_edata = ComBat(dat=dat, batch=batch, mod=modcombat, par.prior=TRUE, prior.plots=FALSE)
Found 6 batches
Error in cbind(batchmod, mod) : 
  number of rows of matrices must match (see arg 2)

 

 

 

Here I display the output of column Batch in sif

# I tried both batch as.vector as you guys recommended
> batch = as.vector(sif$Batch)
> batch = sif$Batch

# Both batch as.vector or regular variable return the error in combat before, and return the same head()
> head (sif$Batch)
[1] Batch1 Batch1 Batch1 Batch1 Batch1 Batch1
Levels: Batch1 Batch2 Batch3 Batch4 Batch5 Batch6

 

ADD COMMENTlink modified 3.9 years ago • written 3.9 years ago by madkitty580

can you show us the output of

head(sif$Batch)
ADD REPLYlink written 3.9 years ago by andrew.j.skelton735.6k

@andrew.j.skelton73 : I updated the original post with the output of head(sif$Batch). Thanks :)

ADD REPLYlink written 3.9 years ago by madkitty580

Will

as.vector(sif$Batch)

help?

ADD REPLYlink written 3.9 years ago by Sam2.3k

Just found my script. This was what I did with ComBat

I was trying to merge RNA Seq data with microarray. mydata is my microarray data and vsd is my RNA Seq data whereas data.input is the merge data frame

pheno = data.frame(row.names=row.names(pData(mydata)), timepoint= c(rep("GD9.5",5), rep("GD11.5",4),rep("GD13.5",6), "GD9.5"), platform="1", condition="Control")
phenoRNA = data.frame(row.names=colnames(assay(vsd)), condition=c(rep("Case", 5), rep("Control", 5)), platform="2", timepoint="GD9.5")
phenotype = rbind(pheno, phenoRNA)
mod = model.matrix(~as.factor(condition), data=phenotype )
mod0 = model.matrix(~1,data=phenotype )
combat_edata = ComBat(dat=data.input, batch=batch, mod=mod, numCovs=NULL, par.prior=TRUE, prior.plots=FALSE)
pValuesComBat = f.pvalue(combat_edata,mod,mod0)
ADD REPLYlink written 3.9 years ago by Sam2.3k

Thanks Sam, though in our case we only have genes in rows and samples in columns. The sample info file contains only array, sample_name and batch. Without any condition, how should I write mod? I don't understand what should go under ~as.factor() 

ADD REPLYlink written 3.9 years ago by madkitty580

@Sam :  I tried that

batch = as.vector(sif$Batch)

Unfortunately the result is the same.. 

 

ADD REPLYlink written 3.9 years ago by madkitty580
1

According to page 6 of the user manual

batch = pheno$batch

modcombat = model.matrix(~1, data=pheno) 

combat_edata = ComBat(dat=edata, batch=batch, mod=modcombat, par.prior=TRUE, prior.plots=FALSE) 

The deviation of this code and yours is that instead of data=pheno, you use data=dat. If you look at my example script, you should note that the pheno is the phenotype matrix. What I suspect is the data in the mod should contain the phenotype and the batch instead of the count table. try and see if 

modcombat = model.matrix(~1, data=sif)

solves the problem? That's just my guess though

 

ADD REPLYlink written 3.9 years ago by Sam2.3k

Hi Sam, Thanks again for your answer. When replacing for data=sif in modcombat, I get another type of error (see below). The manual isn't useful at all .. I'm really desperate to find a solution :( 

 

Found 6 batches
Adjusting for 0 covariate(s) or covariate level(s)
Standardizing Data across genes
Error in solve(t(design) %*% design) %*% t(design) %*% t(as.matrix(dat)) : 
  requires numeric/complex matrix/vector arguments

 

ADD REPLYlink written 3.9 years ago by madkitty580

Can you show us how your sif looks like?

ADD REPLYlink written 3.9 years ago by Sam2.3k

Actually, I gave it another try, and indeed your suggestions worked well. Just for those wondering, the manual isn't clear, MOD should contain SAMPLE INFO FILE and COVARIATE information as a data.frame. Thanks Sam!

ADD REPLYlink written 3.9 years ago by madkitty580

Good to know that it works

ADD REPLYlink written 3.9 years ago by Sam2.3k
0
gravatar for andrew.j.skelton73
3.9 years ago by
London
andrew.j.skelton735.6k wrote:

I'm pretty sure that your variable should be numeric to indicate batches. So change 

sif$Batch

To numeric

ADD COMMENTlink written 3.9 years ago by andrew.j.skelton735.6k

Thanks, I've tried to define batch = as.numeric(sif$batch) It really didn't work .. 

ADD REPLYlink written 3.9 years ago by madkitty580
1

ok, change the 'mod' parameter in your combat call to 'NULL' - So, 

combat_edata = ComBat(dat=dat, batch=batch, mod=modcombat, par.prior=TRUE, prior.plots=FALSE)

is changed to....

combat_edata = ComBat(dat=dat, batch=batch, mod=NULL, par.prior=TRUE, prior.plots=FALSE)
ADD REPLYlink written 3.9 years ago by andrew.j.skelton735.6k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1147 users visited in the last hour