Question: Help with multiple batch effects
15 months ago by
fp8920
fp8920 wrote:

Hello, I have an expression matrix of 1208 samples (1095 tumor and 113 normal) downloaded from TCGA. I know there are 3 batch effects: type, plateId and TSS. I've tried to correct for them with ComBat, but I need a little help with the model.matrix.

batch<-as.data.frame(cbind(samples,plateId,group,TSS),as.is=T)[,-1]

#correct for group
mod.1<- model.matrix(~plateId+TSS, data=batch)
bat.1<- ComBat(dat=dati, batch$group, mod.1, mean.only = TRUE, par.prior=TRUE, prior.plots=FALSE)

## correct for plateId
mod.2<- model.matrix(~group+TSS, data=batch)
bat.2<- ComBat(dat=bat.1, batch$plateId, mod.2, mean.only = TRUE,par.prior=TRUE, prior.plots=FALSE)

## correct for TSS
mod.3<- model.matrix(~group+plateId, data=batch)
bat.3<- ComBat(dat=bat.2, batch$TSS, mod.3, mean.only = TRUE,par.prior=TRUE, prior.plots=FALSE)

There is something wrong. The error message says:

Error in ((dat - t(design %*% B.hat))^2) %*% rep(1/n.array, n.array) : 
  requires numeric/complex matrix/vector arguments

Is there anyone who can help me? I'm a student. Thanks in advance.
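
A note on the error message above, offered as a hedged sketch rather than a confirmed diagnosis: in base R, `%*%` raises exactly this message when one operand is a data.frame (or a character matrix) instead of a numeric matrix. If `dati` was read in as a data.frame, converting it with `as.matrix()` before calling ComBat may be enough. The toy object `df` below is hypothetical and only stands in for `dati`.

```r
# Hypothetical stand-in for the expression table (genes x samples)
df <- data.frame(s1 = c(1, 2), s2 = c(3, 4))

# Matrix multiplication on a data.frame fails with the message seen above
msg <- tryCatch(
  { df %*% c(0.5, 0.5); "no error" },
  error = function(e) conditionMessage(e)
)
print(msg)

# Converting to a numeric matrix first makes the same operation work
m <- as.matrix(df)
stopifnot(is.numeric(m))
row_means <- m %*% c(0.5, 0.5)
print(row_means)
```

In the original script this would correspond to passing `dat = as.matrix(dati)` in the first ComBat call, assuming every column of `dati` is numeric.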

combat batch effects sva • 1.2k views
ADD COMMENTlink modified 15 months ago by genomax73k • written 15 months ago by fp8920
15 months ago by
Kevin Blighe49k
Kevin Blighe49k wrote:

Going by the numbers, looks like the breast cancer TCGA data. I have analysed this data many times and never noticed an effect of type, plateId, or TSS on the expression values. What evidence do you have that suggests they are biasing the counts?

To adjust for batch effects, please avoid the use of ComBat at all costs. You have a couple of options:

Kevin

ADD COMMENTlink modified 15 months ago • written 15 months ago by Kevin Blighe49k

Hi Kevin, thank you. This page mdanderson suggests different batch types.

ADD REPLYlink modified 15 months ago • written 15 months ago by fp8920

Hey, fair enough. It's just not something that I have seen anyone else doing. If you want to adjust for a batch effect, though, you should first check that the effect exists. It may very well not exist, or it may exist in complex ways that can only be remedied by improving the study design. Batch effects that affect samples unequally are obviously more difficult to model and adjust for.
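
One way to check whether an effect exists, sketched here on simulated data: run a PCA on the samples and ask how much of each leading component is explained by the suspected batch variable. The names `expr` and `plate` are hypothetical, and the shift added to plate "B" is an artificial batch effect put in purely for illustration.

```r
set.seed(1)
n_genes <- 200
n_samples <- 40

# Simulated expression matrix (genes x samples) and a two-level batch factor
expr <- matrix(rnorm(n_genes * n_samples), nrow = n_genes)
plate <- factor(rep(c("A", "B"), each = n_samples / 2))

# Artificial batch effect: shift the first 50 genes in plate B
expr[1:50, plate == "B"] <- expr[1:50, plate == "B"] + 2

# PCA on samples (prcomp expects samples in rows)
pca <- prcomp(t(expr), center = TRUE, scale. = FALSE)

# R-squared of each of the first five PCs regressed on the batch factor
r2 <- sapply(1:5, function(i) summary(lm(pca$x[, i] ~ plate))$r.squared)
names(r2) <- paste0("PC", 1:5)
print(round(r2, 3))
```

On real data, an R-squared near zero on the leading components for plateId or TSS would argue against adjusting for them. A high value for group (tumor versus normal) is expected biology, not a batch effect.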

ADD REPLYlink written 15 months ago by Kevin Blighe49k

I had this issue with RNA-seq data too, so I used svaseq. There was nearly no change in the data even after removing the batch effect, so my understanding is that in RNA-seq the effect is not large, I guess.

ADD REPLYlink written 15 months ago by krushnach80580

Hi, I'm a bit confused. How can I detect the presence of batch effects? With PCA, OK, but how do I interpret the graph? This is my PCA. Red are tumor samples and blue are normal samples.

ADD REPLYlink modified 15 months ago • written 15 months ago by fp8920

I would suggest going for unsupervised clustering; this figure looks very confusing.

ADD REPLYlink written 15 months ago by krushnach80580

When I saw your figure, I said 'Ouch...!' - it does look a bit messy, but it's just due to the labels.

When I look closer, I do not see anything unusual: the 11 (blue) samples are normal tissue, whilst the 01 (red) samples are tumours (assuming you are using 11 and 01 to refer to the TCGA barcodes). So, nothing looks unusual. I see this same distribution for each and every TCGA dataset that I analyse.

A batch effect could be inferred from PCA if a large proportion of the variation is explained by PC1. In such cases, the proportion of variance explained by PC1 could be upward of 90%.
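
The proportion of variance explained by each component can be read directly from a `prcomp` result. This is a minimal sketch on a simulated matrix; `expr` is a hypothetical placeholder for the real genes x samples expression matrix.

```r
set.seed(42)

# Simulated expression matrix: 100 genes x 30 samples
expr <- matrix(rnorm(100 * 30), nrow = 100)

# PCA with samples in rows
pca <- prcomp(t(expr), center = TRUE, scale. = FALSE)

# Proportion of total variance carried by each principal component
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
names(var_explained) <- paste0("PC", seq_along(var_explained))

# Percentage for the first five components
print(round(100 * var_explained[1:5], 1))
```

A PC1 percentage that dwarfs the rest is only a hint. It still has to line up with a technical variable rather than with tumor versus normal status before it is treated as a batch effect.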

ADD REPLYlink written 15 months ago by Kevin Blighe49k

Thank you. These are my clusterings for group, plateId and TSS.

ADD REPLYlink written 15 months ago by fp8920

Thanks for sharing and well done! Those are pretty cool dendrograms. Also, apologies if my comment (the 'Ouch...!' part) was interpreted in a negative light. I still don't see any major reason for doing adjustments based on any of these (group, plateId, TSS). The group is different because those are normal tissue samples, so they are expected to be different. What do you think?

ADD REPLYlink written 15 months ago by Kevin Blighe49k