Question: How to remove batch effect in copy number segment mean
8 months ago by
China Pharmaceutical University
sugus30 wrote:

Hi there,

I am wondering how to remove batch effects from segmented_scna data downloaded from the TCGA PANCANA project. An example of the data format is shown below:

Sample                        Chromosome  Start      End        Num_Probes  Segment_Mean
TCGA-KL-8323-11A-01D-2308-01  1           3218610    104558357  58272       0.0026
TCGA-KL-8323-11A-01D-2308-01  1           104561488  104573702  10          -0.6372
TCGA-KL-8323-11A-01D-2308-01  1           104579877  179610058  27754       0.0041
TCGA-KL-8323-11A-01D-2308-01  1           179621932  179622081  2           -1.6956
TCGA-KL-8323-11A-01D-2308-01  1           179623244  247813706  43114       0.0043
TCGA-KL-8323-11A-01D-2308-01  2           484222     242476062  131310      0.006
TCGA-KL-8323-11A-01D-2308-01  3           2212571    197538677  106379      0.0022
TCGA-KL-8323-11A-01D-2308-01  4           1053934    71781186   38527       0.0048
TCGA-KL-8323-11A-01D-2308-01  4           71781554   71782247   2           -2.2184

I am trying to remove batch effects across tumor types, but I am not sure whether the Segment_Mean values can be treated like gene expression values and corrected with ComBat.

If not, could anyone give me some suggestions? Many thanks in advance!

8 months ago by
Kevin Blighe45k wrote:

Why do you believe there is a batch effect?


Because it is a Pan Cancer analysis and it may have a batch effect.

Reply written 8 months ago by sugus30

If you are unsure whether a batch effect exists in the first place, then you should not blindly assume that one does. Adjusting for a batch effect that is not there could over-correct your data to the point of eliminating any interesting clinical signal that it contains. Indeed, the copy number profile varies among different cancers, and also with their grade / stage. How could you distinguish between technical and biological variability in this context?
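As a rough starting point for that question, one can tabulate a candidate technical variable against cancer type. The sketch below assumes the standard TCGA aliquot barcode layout (project, tissue source site, participant, sample type plus vial, portion plus analyte, plate, center) and treats the plate field as a possible batch label. The cancer type labels and all barcodes other than the one from the question are hypothetical placeholders.

```python
from collections import Counter


def parse_barcode(barcode):
    """Split a 7-field TCGA aliquot barcode into named parts.

    Raises ValueError if the barcode does not have exactly 7 fields.
    """
    project, tss, participant, sample_vial, portion_analyte, plate, center = (
        barcode.split("-")
    )
    return {
        "tss": tss,                      # tissue source site
        "participant": participant,
        "sample_type": sample_vial[:2],  # e.g. "01" primary tumor, "11" solid tissue normal
        "plate": plate,                  # candidate batch variable
        "center": center,
    }


def plate_by_cancer_type(labels):
    """Count samples for each (plate, cancer type) pair.

    `labels` maps barcode -> cancer type and must be supplied by the user.
    """
    counts = Counter()
    for barcode, ctype in labels.items():
        counts[(parse_barcode(barcode)["plate"], ctype)] += 1
    return counts


# Hypothetical labels: only the first barcode appears in the question,
# and "TypeA"/"TypeB" are placeholder cancer types.
example_labels = {
    "TCGA-KL-8323-11A-01D-2308-01": "TypeA",
    "TCGA-XX-0001-01A-01D-2308-01": "TypeA",
    "TCGA-YY-0002-01A-01D-9999-01": "TypeB",
}
example_counts = plate_by_cancer_type(example_labels)
```

If every plate holds samples from a single cancer type, plate and cancer type are fully confounded, and no adjustment method can separate technical from biological variation along that axis.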

You could just include 'CancerType' as a covariate in whatever statistical model you are fitting, and proceed from there.
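A minimal sketch of what that could look like, assuming an ordinary least-squares model fitted with NumPy: the response `y` stands for some per-sample copy number summary, `x` for the variable of interest, and `cancer_type` enters as indicator columns. The function name, variable names, and toy numbers are all illustrative assumptions, not part of the original data.

```python
import numpy as np


def fit_with_cancer_type(y, x, cancer_type):
    """Least-squares fit of y ~ intercept + x + CancerType indicators."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    levels = sorted(set(cancer_type))
    # One indicator column per cancer type, skipping the first level,
    # which acts as the reference category.
    dummies = [
        [1.0 if c == level else 0.0 for c in cancer_type]
        for level in levels[1:]
    ]
    design = np.column_stack([np.ones_like(y), x] + dummies)
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    names = ["intercept", "x"] + [f"CancerType[{level}]" for level in levels[1:]]
    return dict(zip(names, coef))


# Toy data constructed as y = 0.1 + 0.5 * x + 0.3 * (cancer type is "B").
toy_type = ["A", "A", "A", "B", "B", "B"]
toy_x = [0, 1, 0, 1, 0, 1]
toy_y = [0.1, 0.6, 0.1, 0.9, 0.4, 0.9]
toy_fit = fit_with_cancer_type(toy_y, toy_x, toy_type)
```

The coefficient on `x` is then estimated with the average difference between cancer types held fixed, rather than after subtracting a presumed batch effect from the data beforehand.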

Reply written 8 months ago by Kevin Blighe45k