Question: How to remove batch effect in copy number segment mean
18 months ago by
China Pharmaceutical University
sugus wrote:

Hi there,

I am wondering how to remove batch effects from segmented_scna data downloaded from the TCGA PANCANA project. An example of the data format follows:

Sample  Chromosome  Start   End Num_Probes  Segment_Mean
TCGA-KL-8323-11A-01D-2308-01    1   3218610 104558357   58272   0.0026
TCGA-KL-8323-11A-01D-2308-01    1   104561488   104573702   10  -0.6372
TCGA-KL-8323-11A-01D-2308-01    1   104579877   179610058   27754   0.0041
TCGA-KL-8323-11A-01D-2308-01    1   179621932   179622081   2   -1.6956
TCGA-KL-8323-11A-01D-2308-01    1   179623244   247813706   43114   0.0043
TCGA-KL-8323-11A-01D-2308-01    2   484222  242476062   131310  0.006
TCGA-KL-8323-11A-01D-2308-01    3   2212571 197538677   106379  0.0022
TCGA-KL-8323-11A-01D-2308-01    4   1053934 71781186    38527   0.0048
TCGA-KL-8323-11A-01D-2308-01    4   71781554    71782247    2   -2.2184

I am trying to remove batch effects across tumor types, but I am not sure whether the Segment_Mean values can be treated like gene expression values and batch-corrected with ComBat.

If not, could anyone give me some suggestions? Many thanks in advance!
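
A minimal sketch of how a table in the format shown above could be loaded and reduced to one value per sample and chromosome. It assumes the real file is tab-separated with the same six columns. The function names `read_segments` and `chromosome_means` are hypothetical, and the length-weighted mean is only one possible summary, not something this thread prescribes.

```python
import csv
import io
from collections import defaultdict

# Rows copied from the question; tabs are assumed as the delimiter.
DEMO = (
    "Sample\tChromosome\tStart\tEnd\tNum_Probes\tSegment_Mean\n"
    "TCGA-KL-8323-11A-01D-2308-01\t1\t3218610\t104558357\t58272\t0.0026\n"
    "TCGA-KL-8323-11A-01D-2308-01\t1\t104561488\t104573702\t10\t-0.6372\n"
    "TCGA-KL-8323-11A-01D-2308-01\t1\t104579877\t179610058\t27754\t0.0041\n"
    "TCGA-KL-8323-11A-01D-2308-01\t1\t179621932\t179622081\t2\t-1.6956\n"
    "TCGA-KL-8323-11A-01D-2308-01\t1\t179623244\t247813706\t43114\t0.0043\n"
    "TCGA-KL-8323-11A-01D-2308-01\t2\t484222\t242476062\t131310\t0.006\n"
    "TCGA-KL-8323-11A-01D-2308-01\t3\t2212571\t197538677\t106379\t0.0022\n"
    "TCGA-KL-8323-11A-01D-2308-01\t4\t1053934\t71781186\t38527\t0.0048\n"
    "TCGA-KL-8323-11A-01D-2308-01\t4\t71781554\t71782247\t2\t-2.2184\n"
)


def read_segments(handle):
    """Yield one dict per segment with numeric fields converted."""
    for row in csv.DictReader(handle, delimiter="\t"):
        yield {
            "sample": row["Sample"],
            "chrom": row["Chromosome"],
            "start": int(row["Start"]),
            "end": int(row["End"]),
            "mean": float(row["Segment_Mean"]),
        }


def chromosome_means(segments):
    """Length-weighted mean Segment_Mean per (sample, chromosome)."""
    weighted_sum = defaultdict(float)
    total_length = defaultdict(int)
    for seg in segments:
        length = seg["end"] - seg["start"] + 1
        key = (seg["sample"], seg["chrom"])
        weighted_sum[key] += seg["mean"] * length
        total_length[key] += length
    return {key: weighted_sum[key] / total_length[key] for key in weighted_sum}


means = chromosome_means(read_segments(io.StringIO(DEMO)))
for (sample, chrom) in sorted(means):
    print(sample, chrom, round(means[(sample, chrom)], 4))
```

Weighting by segment length keeps very short segments (for example the two-probe segments above) from dominating the summary. Whether such a summary is suitable input for a tool like ComBat is exactly the open question here.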

18 months ago by
Kevin Blighe wrote:

Why do you believe there is a batch effect?


Because it is a Pan Cancer analysis and it may have a batch effect.

Reply written 18 months ago by sugus

If you are unsure whether a batch effect exists in the first place, then you should not blindly assume that one does. Correcting for it anyway could over-adjust your data to the point that you eliminate any interesting clinical signal it contains. Indeed, copy number profiles vary among different cancers, and also with their grade and stage. How would you distinguish technical from biological variability in this context?

You could just include 'CancerType' as a covariate in whatever statistical model you are fitting, and proceed from there.

Reply written 18 months ago by Kevin Blighe
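
The covariate suggestion above can be illustrated with a small sketch. It assumes an ordinary linear model in which 'CancerType' enters as a categorical term. For that model, the slope of the variable of interest equals the slope obtained after centering both the outcome and the predictor within each cancer type. All values below are made up for illustration, and the names `slope`, `center_within` and `slope_adjusted_for_group` are hypothetical.

```python
from collections import defaultdict


def slope(xs, ys):
    """Simple least-squares slope of ys on xs."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def center_within(values, groups):
    """Subtract each group's mean from its members."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for value, group in zip(values, groups):
        sums[group] += value
        counts[group] += 1
    return [v - sums[g] / counts[g] for v, g in zip(values, groups)]


def slope_adjusted_for_group(xs, ys, groups):
    """Slope of ys on xs with the group (e.g. CancerType) as a covariate."""
    return slope(center_within(xs, groups), center_within(ys, groups))


# Toy data (hypothetical): y is a per-sample copy-number summary,
# x is some variable of interest, and the cancer type shifts y.
cancer_type = ["BRCA", "BRCA", "BRCA", "LUAD", "LUAD", "LUAD"]
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [0.0, 0.5, 1.0, -1.5, -1.0, -0.5]

naive = slope(x, y)
adjusted = slope_adjusted_for_group(x, y, cancer_type)
print("naive slope:", naive)
print("slope with CancerType as covariate:", adjusted)
```

In this toy data the unadjusted slope has the wrong sign because the cancer types differ in both x and y, while the adjusted slope recovers the within-type relationship. Unlike a batch-correction step, this leaves the data itself untouched.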