Batch effect in a single microarray dataset
4 months ago
seta ★ 1.6k

Dear all,

I downloaded the series matrix file of a single breast cancer microarray dataset. The data were already normalized and log-transformed; here is a box plot of the data. I collapsed multiple probes mapping to the same gene into a single value per gene using limma::avereps. The box plot changed slightly after collapsing, as you can see. Does this change matter, in your professional view? I then used the collapsed data to generate a PCA plot colored by cancer subtype, as you can see. Could you please let me know whether you see any signs of a batch effect in the PCA plot, especially for the samples located in the right corner of the plot (basal subtype)? If yes, how can I define a batch variable from this information and correct for batch during the analysis?

Many thanks!

PCA effect expression gene batch • 379 views
4 months ago
shiyang_bio ▴ 150

Hi, there is no problem with collapsing probes using limma. I cannot get any information about a batch effect from your PCA plot, though. To look for one, you should color the dots by another variable, such as batch number, rather than by BC subtype. The plot you pasted here looks good, since samples of the same subtype cluster together.
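To illustrate why collapsing can shift the box plot slightly, here is a minimal Python sketch of the general idea behind averaging replicate probes: rows that share a gene ID are averaged sample by sample. This is not limma's code; the function name `average_replicate_rows` and the expression values are made up for illustration, and the sketch assumes there are no missing values.

```python
from collections import defaultdict


def average_replicate_rows(gene_ids, rows):
    """Average rows that share a gene ID, column by column.

    gene_ids: one gene ID per probe (row).
    rows: one list of per-sample expression values per probe.
    Returns a dict mapping gene ID -> list of per-sample means.
    """
    grouped = defaultdict(list)
    for gene, row in zip(gene_ids, rows):
        grouped[gene].append(row)

    collapsed = {}
    for gene, members in grouped.items():
        n = len(members)
        # zip(*members) walks the samples (columns) of this gene's probes
        collapsed[gene] = [sum(column) / n for column in zip(*members)]
    return collapsed


# Hypothetical log2 values: 3 probes x 2 samples, two probes for the same gene
gene_ids = ["ESR1", "ESR1", "GATA3"]
rows = [[8.0, 6.0], [10.0, 7.0], [5.0, 5.5]]

collapsed = average_replicate_rows(gene_ids, rows)
print(collapsed)
```

After collapsing, each sample's box plot is drawn over fewer rows, and some of those rows are averages, so a small change in the per-sample distribution is to be expected.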

Best


Thank you for your response. To be honest, there was no information about batches, so I tried to get some idea by coloring the PCA plot by cancer subtype. Could you please let me know what you mean by "other variable", other than batch number?


I just mean any information about batch. If you have no such information, I think there is very little we can do to estimate or remove a batch effect. You can probably go ahead with the downstream analysis.
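If some candidate variable does turn up in the sample annotation later (a processing date, for example), one rough way to compare it with subtype is to ask how much of the spread along a principal component each labeling accounts for. Below is a minimal Python sketch of that check, assuming you already have one PC score per sample. The function name `variance_explained_by_label`, the `run_date` labels, and all numbers are hypothetical and do not come from the dataset discussed here.

```python
from collections import defaultdict


def variance_explained_by_label(scores, labels):
    """Fraction of the variance in `scores` explained by the grouping `labels`.

    Computed as between-group sum of squares / total sum of squares
    (eta-squared). Values near 1 mean the label separates the scores well.
    """
    grand_mean = sum(scores) / len(scores)
    total_ss = sum((s - grand_mean) ** 2 for s in scores)
    if total_ss == 0:
        return 0.0

    groups = defaultdict(list)
    for score, label in zip(scores, labels):
        groups[label].append(score)

    between_ss = sum(
        len(members) * (sum(members) / len(members) - grand_mean) ** 2
        for members in groups.values()
    )
    return between_ss / total_ss


# Hypothetical PC1 scores for four samples and two candidate labelings
pc1 = [-2.0, -1.0, 1.0, 2.0]
subtype = ["basal", "luminal", "basal", "luminal"]
run_date = ["day1", "day1", "day2", "day2"]  # assumed batch-like variable

by_subtype = variance_explained_by_label(pc1, subtype)
by_run_date = variance_explained_by_label(pc1, run_date)
print(by_subtype, by_run_date)
```

In this toy example the made-up run date tracks PC1 much more closely than subtype does, which is the kind of pattern that would make a variable worth plotting as the color in the PCA. Without any such variable in the annotation, there is nothing to feed into a check like this.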