remove batch effect for multiple tumor data
7.1 years ago
hellocita ▴ 40

Dear all,

I am working with a dataset that contains samples from 7 tissues (normal control vs. tumor) and 2 experimental batches. Within a tissue, the control and tumor samples are not evenly distributed across the two batches. For any one tissue there are 3~4 control samples and 3~4 tumor samples. The main aim of this analysis is to find the genes differentially expressed in each tissue, and then to see whether the gene lists from different tissues intersect.

So far I have normalized the data (by total read count) and removed the batch effect by setting the mean within each batch to zero. I have also tried ComBat to remove the batch effect, using the model: batch + labels + batch*labels. There are 7*2 = 14 label types; for example, tissue A has 2 labels, A_normal and A_tumor. But I got an "At least one covariate is confounded with batch" error, so I gave up.
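A likely cause of that error is a label whose samples all sit in a single batch, so that batch and label cannot be separated for that label. The sketch below only illustrates how one might check this by cross-tabulating label against batch; the sample sheet is made up (it is not the data from this post), and the function names are hypothetical.

```python
from collections import Counter

# Hypothetical sample sheet (NOT the poster's data): one (batch, label) pair per sample.
# Tissue A is spread over both batches; tissue B is not.
samples = [
    ("batch1", "A_normal"), ("batch2", "A_normal"),
    ("batch1", "A_tumor"), ("batch2", "A_tumor"),
    ("batch1", "B_normal"), ("batch1", "B_normal"),
    ("batch2", "B_tumor"), ("batch2", "B_tumor"),
]


def crosstab(samples):
    """Count the samples in every (label, batch) cell."""
    return Counter((label, batch) for batch, label in samples)


def labels_confined_to_one_batch(samples):
    """Return the labels whose samples all come from a single batch."""
    batches_per_label = {}
    for batch, label in samples:
        batches_per_label.setdefault(label, set()).add(batch)
    return sorted(lab for lab, seen in batches_per_label.items() if len(seen) == 1)


counts = crosstab(samples)
batches = sorted({batch for batch, _ in samples})

# Print one row per label; a zero marks an empty label-by-batch cell.
for label in sorted({label for _, label in samples}):
    row = "  ".join(f"{b}={counts[(label, b)]}" for b in batches)
    print(f"{label:10s} {row}")

confined = labels_confined_to_one_batch(samples)
print("labels seen in only one batch:", confined)
```

Any label listed on the last line has an empty cell in the table, and an interaction between batch and label cannot be estimated for it.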

After this, I plan to take one tissue (control-tumor pair) at a time and run the differential analysis to get a gene list. The problem is that when I draw a PCA plot for one tissue before the differential analysis, there still seems to be a batch effect in that tissue, and the biological signal is still confounded with batch. So should I remove the batch effect again within that specific tissue?
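For reference, a per-tissue PCA check of this kind can be sketched as follows. This is a minimal illustration on simulated counts, not the data from this post: the log2 counts-per-million transform with a 0.5 pseudocount is a simple assumption rather than the exact transform of any package, and `pca_scores` is a hypothetical helper.

```python
import numpy as np


def pca_scores(counts, n_components=2):
    """PCA of samples from a genes x samples count matrix.

    Returns the sample scores and the fraction of variance per component.
    """
    counts = np.asarray(counts, dtype=float)
    lib_size = counts.sum(axis=0)
    # Simple log2 counts-per-million with a pseudocount (an assumption).
    log_cpm = np.log2((counts + 0.5) / lib_size * 1e6)
    # Center each gene so the PCA describes differences between samples.
    centered = log_cpm - log_cpm.mean(axis=1, keepdims=True)
    # SVD with samples as rows; the scaled left singular vectors are the scores.
    u, s, _ = np.linalg.svd(centered.T, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    explained = (s ** 2 / np.sum(s ** 2))[:n_components]
    return scores, explained


# Simulated tissue: 200 genes, 8 samples, a strong batch shift and a weak tumor effect.
rng = np.random.default_rng(0)
batch = np.array([0, 0, 0, 0, 1, 1, 1, 1])
tumor = np.array([0, 1, 0, 1, 0, 1, 0, 1])
mu = np.full((200, 8), 100.0)
mu[:100, batch == 1] *= 4.0   # batch shift on the first 100 genes
mu[100:120, tumor == 1] *= 2.0  # tumor effect on 20 genes
counts = rng.poisson(mu)

scores, explained = pca_scores(counts)
for b, t, (pc1, pc2) in zip(batch, tumor, scores):
    print(f"batch={b} tumor={t} PC1={pc1:8.2f} PC2={pc2:8.2f}")
print("variance explained:", np.round(explained, 3))
```

If the samples split by batch along PC1, as they do in this simulation, the batch effect dominates the expression differences in that tissue.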

Thanks in advance:) Any comments will be much appreciated!

Figure: tissue1, PC1 and PC2. First plot: red for batch A, green for batch B. Second plot: red for tumor, green for normal.

RNA-Seq batch-effect • 3.1k views

Hi LucyS,

It might help if you explain the aim of your experiment. Is it finding differentially expressed genes? Is it clustering? Machine learning? And what have you already tried? You say you normalized to remove the batch effect, but how, and with which tools? In my opinion your question is currently too vague for anyone to help you further. If you explain more about your design and goals, more people may be able to help.

Hi b.nota. Thanks for your suggestions! I have modified my post and hope it is clearer now. My main aim is to find the differentially expressed genes for one tissue, and then to see whether they intersect with the result from another tissue.

7.1 years ago
Benn 8.3k

Thanks for the more detailed description, LucyS. I would suggest that you start from the raw (unnormalized) counts and use either limma (trend or voom) or edgeR, following the user's guide. You can also use DESeq2 (but I have no experience with that package myself). The user's guides explain how to handle a batch effect: you don't need to normalize it away beforehand, but you do have to include it correctly in the model.
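To illustrate what "putting batch in the model" means, here is a minimal sketch for a single gene using ordinary least squares. It is not what limma or edgeR do internally (they fit gene-wise models with their own variance or dispersion estimation), and the numbers are made up: a noise-free log-expression with an assumed batch shift of 2 and tumor effect of 1, in an unbalanced design like the one described in the question.

```python
import numpy as np

# Hypothetical design for one tissue: normals are mostly in batch 0,
# tumors mostly in batch 1 (unbalanced, but not completely confounded).
batch = np.array([0, 0, 0, 1, 0, 1, 1, 1])
tumor = np.array([0, 0, 0, 0, 1, 1, 1, 1])

# Made-up, noise-free log-expression of one gene:
# baseline 5, batch shift 2, tumor effect 1.
y = 5.0 + 2.0 * batch + 1.0 * tumor

# Naive comparison that ignores batch: the batch shift leaks into the estimate.
naive = y[tumor == 1].mean() - y[tumor == 0].mean()

# Additive model: intercept + batch + tumor.
X = np.column_stack([np.ones(len(y)), batch, tumor])
coef, _, rank, _ = np.linalg.lstsq(X, y, rcond=None)

print("naive tumor - normal difference:", naive)
print("intercept, batch, tumor coefficients:", np.round(coef, 6))
print("design matrix rank:", rank, "of", X.shape[1], "columns")
```

With batch in the design, the tumor coefficient recovers the assumed effect of 1, while the naive difference is inflated by the batch shift. In R, the corresponding design is typically built with something like `model.matrix(~ batch + condition)`. This only works when batch and condition are not completely confounded within the tissue.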

I hope this can help you further.

thanks for your answer b.nota!
