Question: remove batch effect for multiple tumor data
2.5 years ago, hellocita wrote:

Dear all, I am working with a dataset containing samples from 7 tissues (normal control vs. tumor) processed in 2 experimental batches. The control and tumor samples of a given tissue are not evenly distributed across the two batches. Each tissue has 3-4 control samples and 3-4 tumor samples. The main aim of the analysis is to find the genes differentially expressed in each tissue, and to see whether these gene lists overlap between tissues.

So far I have normalized the data (by total read count) and removed the batch effect by setting the mean within each batch to zero. I have also tried ComBat to remove the batch effect, using the model batch + labels + batch*labels. There are 7*2 = 14 label types; for example, tissue A has 2 labels, A_normal and A_tumor. But I got an "At least one covariate is confounded with batch" error, so I gave up.
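To make the steps above concrete, here is a minimal Python/NumPy sketch of total-count normalization, per-batch mean centering, and a rank check on a batch + label design matrix. It is not the ComBat implementation; the function names and the toy sample layout are hypothetical, and it assumes a genes x samples matrix. As far as I understand, the "confounded with batch" error is raised when the design matrix is not full rank, which happens for instance when a label occurs in only one batch.

```python
import numpy as np

def total_count_normalize(counts, scale=1e6):
    """Scale every sample (column) to the same library size (counts per million)."""
    counts = np.asarray(counts, dtype=float)
    lib_sizes = counts.sum(axis=0, keepdims=True)
    return counts / lib_sizes * scale

def center_within_batch(expr, batch):
    """Subtract each gene's per-batch mean, so every batch has mean zero per gene."""
    expr = np.asarray(expr, dtype=float)
    batch = np.asarray(batch)
    out = expr.copy()
    for b in np.unique(batch):
        cols = batch == b
        out[:, cols] -= expr[:, cols].mean(axis=1, keepdims=True)
    return out

def design_rank_deficiency(batch, label):
    """Dummy-code intercept + batch + label and return (number of columns - rank).

    A result above zero means batch and label effects cannot be separated.
    """
    batch = np.asarray(batch)
    label = np.asarray(label)
    cols = [np.ones(len(batch))]
    for b in np.unique(batch)[1:]:      # first level is the reference
        cols.append((batch == b).astype(float))
    for l in np.unique(label)[1:]:      # first level is the reference
        cols.append((label == l).astype(float))
    X = np.column_stack(cols)
    return X.shape[1] - np.linalg.matrix_rank(X)

# Hypothetical layout 1: every A_normal sample in batch b1, every A_tumor in b2.
confounded = design_rank_deficiency(
    ["b1", "b1", "b1", "b2", "b2", "b2"],
    ["A_normal"] * 3 + ["A_tumor"] * 3,
)
# Hypothetical layout 2: both labels present in both batches.
balanced = design_rank_deficiency(
    ["b1", "b2", "b1", "b2"],
    ["A_normal", "A_normal", "A_tumor", "A_tumor"],
)
print(confounded, balanced)
```

In the first layout the batch column and the label column are identical, so the design loses one rank; in the second they are separable. Cross-tabulating labels against batches on the real sample sheet would show which tissues, if any, are in the first situation.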

After this, I plan to take one tissue (control-tumor pair) at a time and do a differential analysis to get a gene list. The problem is that when I make a PCA plot for that tissue before the differential analysis, there still seems to be a batch effect, and the biological signal is still confounded with batch. So should I remove the batch effect again within that one specific tissue?
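For reference, a per-tissue PCA check like the one described can be sketched in Python/NumPy as follows. The function name and the simulated data are hypothetical (200 genes, 8 samples, an artificial shift added to batch B); the point is only to show how sample scores on PC1 and PC2 are obtained so they can be coloured once by batch and once by tumor/normal status.

```python
import numpy as np

def pca_scores(expr, n_components=2):
    """PCA of samples. expr is genes x samples.

    Returns (scores, explained_variance_ratio) for the first components.
    """
    X = np.asarray(expr, dtype=float).T            # samples x genes
    X = X - X.mean(axis=0, keepdims=True)          # center each gene
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]
    var_ratio = (S ** 2) / np.sum(S ** 2)
    return scores, var_ratio[:n_components]

# Simulated single-tissue subset with an artificial batch shift (assumption).
rng = np.random.default_rng(0)
expr = rng.normal(size=(200, 8))
batch = np.array(["A"] * 4 + ["B"] * 4)
expr[:50, batch == "B"] += 5.0                     # batch B shifted on 50 genes

scores, var_ratio = pca_scores(expr)
for b, (pc1, pc2) in zip(batch, scores):
    print(b, round(pc1, 2), round(pc2, 2))
```

If samples separate along PC1 by batch rather than by tumor/normal status, as they do in this simulated case, the batch term is still dominating that tissue's variation.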

Thanks in advance:) Any comments will be much appreciated!

Figure: tissue1, PC1 and PC2. First plot: red for batch A, green for batch B. Second plot: red for tumor, green for normal.


Hi LucyS,

It might help if you explain what the aim of your experiment is. Is it finding differentially expressed genes? Clustering? Machine learning? And what have you already tried? You say you normalized to remove the batch effect, but how, and with what tools? In my opinion your question is too vague and unclear to help you further. If you explain more about your design and goals, more people may be able to help you.

written 2.5 years ago by Benn

Hi b.nota. Thanks for your suggestions! I modified my post, hope it's more clear now! My main aim now is to find differentially expressed genes of one tissue, and then see if it intersects with the other tissue's result!

written 2.5 years ago by hellocita
2.5 years ago, Benn (Netherlands) wrote:

Thanks for the more detailed description, LucyS. I would suggest that you start from the raw (non-normalized) counts and use either limma (trend or voom) or edgeR, following the user's guide. You can also use DESeq2 (but I have no experience with that package myself). The user's guides explain how to handle a batch effect: you don't need to normalize it away beforehand, but you do have to include it correctly in the model.
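To illustrate what "putting batch in the model" means, here is a minimal Python/NumPy sketch that fits an ordinary least-squares model per gene with an intercept, a batch term and a condition term. This is not the limma, edgeR or DESeq2 API, and it works on made-up log-scale values rather than raw counts; the function name and toy data are hypothetical. It only shows why adjusting for batch inside the model gives a different tumor-vs-normal estimate than a plain difference of means when the design is unbalanced.

```python
import numpy as np

def fit_with_batch(log_expr, batch, condition, reference_condition="normal"):
    """Per-gene OLS of log expression on intercept + batch + condition.

    log_expr is genes x samples. Returns the condition effect
    (non-reference minus reference) for every gene, adjusted for batch.
    """
    Y = np.asarray(log_expr, dtype=float).T        # samples x genes
    batch = np.asarray(batch)
    condition = np.asarray(condition)
    cols = [np.ones(len(batch))]
    for b in np.unique(batch)[1:]:                 # first batch is the reference
        cols.append((batch == b).astype(float))
    cols.append((condition != reference_condition).astype(float))
    X = np.column_stack(cols)
    if np.linalg.matrix_rank(X) < X.shape[1]:
        raise ValueError("condition is confounded with batch")
    coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return coef[-1]

# Toy unbalanced design (assumption): batch A has 3 normal + 1 tumor,
# batch B has 1 normal + 3 tumor.
batch = np.array(["A", "A", "A", "A", "B", "B", "B", "B"])
condition = np.array(["normal", "normal", "normal", "tumor",
                      "normal", "tumor", "tumor", "tumor"])
is_b = (batch == "B").astype(float)
is_t = (condition == "tumor").astype(float)
gene0 = 5 + 2 * is_b + 1.5 * is_t                  # true tumor effect 1.5
gene1 = 5 + 2 * is_b                               # no tumor effect
log_expr = np.vstack([gene0, gene1])

adjusted = fit_with_batch(log_expr, batch, condition)
naive = (log_expr[:, condition == "tumor"].mean(axis=1)
         - log_expr[:, condition == "normal"].mean(axis=1))
print("adjusted:", adjusted)   # close to [1.5, 0.0]
print("naive:   ", naive)      # [2.5, 1.0], inflated by the batch shift
```

The packages named above do this with count-appropriate models and moderated statistics, so their user's guides remain the place to look for the actual design-matrix syntax.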

I hope this can help you further.


thanks for your answer b.nota!

written 2.5 years ago by hellocita