DE analysis of genes with different numbers of samples in tumor and control groups
5.0 years ago
hAjmal ▴ 50

Hi, I am using TCGAbiolinks to do DE analysis of genes in the TCGA breast cancer data. The numbers of samples in the normal and tumor groups are different. Is it possible to do DE analysis with a different number of samples in each group? Please guide.

TCGA Differential Expression TCGA Biolinks limma • 1.2k views

Hi hAjmal, how many is "different"? Please give some numbers.


112 normal and 1100 tumor samples


I would guess (not being a statistician) that the dispersion estimates at these high sample numbers should be sufficiently stable regardless of the uneven group sizes. You can of course run several analyses with subsets of the tumor group and see whether the results are stable when subsampling.
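
A rough sketch of that subsampling check, in plain Python rather than the R packages discussed in this thread. Everything here is an assumption made for illustration: the data are synthetic, the function names are made up, and a Welch t-statistic stands in for a proper limma or edgeR fit. The idea is only to show the loop: draw tumor subsets of the same size as the normal group, rerun the test, and compare the top gene lists with the full-data result.

```python
import random
import statistics

def welch_t(a, b):
    """Welch's t-statistic for two samples of unequal size and variance."""
    va, vb = statistics.variance(a), statistics.variance(b)
    return (statistics.mean(a) - statistics.mean(b)) / ((va / len(a) + vb / len(b)) ** 0.5)

def top_genes(tumor, normal, n_top):
    """Rank genes by |t| and return the names of the n_top strongest."""
    scores = {g: abs(welch_t(tumor[g], normal[g])) for g in tumor}
    return set(sorted(scores, key=scores.get, reverse=True)[:n_top])

def subsample_stability(tumor, normal, n_sub, n_top, n_rounds, seed=0):
    """Jaccard overlap between the full-data top list and each subsampled run."""
    rng = random.Random(seed)
    full = top_genes(tumor, normal, n_top)
    n_tumor = len(next(iter(tumor.values())))
    overlaps = []
    for _ in range(n_rounds):
        # Draw one set of sample indices and apply it to every gene.
        idx = rng.sample(range(n_tumor), n_sub)
        sub = {g: [vals[i] for i in idx] for g, vals in tumor.items()}
        picked = top_genes(sub, normal, n_top)
        overlaps.append(len(full & picked) / len(full | picked))
    return overlaps

# Synthetic toy data (not TCGA): 50 genes, the first 5 shifted upward in tumor.
rng = random.Random(1)
genes = [f"gene{i}" for i in range(50)]
normal = {g: [rng.gauss(0, 1) for _ in range(112)] for g in genes}
tumor = {g: [rng.gauss(3 if i < 5 else 0, 1) for _ in range(1100)]
         for i, g in enumerate(genes)}

overlaps = subsample_stability(tumor, normal, n_sub=112, n_top=5, n_rounds=10)
print(min(overlaps), max(overlaps))
```

If the overlaps stay close to 1 across rounds, the gene list does not depend much on which tumor samples are included; low or erratic overlaps would suggest the result is driven by a subset of samples.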


If these are a randomly selected set of normal and tumour samples, then it's fine. The issue I suspect you may have here is that the normal samples may actually be non-tumour tissue from a tumour-proximal site in a subset of the cancer patients; if so, you will need to match the patient-derived samples.
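
One way to find the matched samples is from the TCGA barcodes themselves: the first 12 characters identify the patient, and the two-digit sample type code that follows is `01` for primary tumour and `11` for solid tissue normal. The sketch below is a minimal Python illustration of that pairing; the function name and the example barcodes are made up, and it keeps only the first tumour and first normal sample per patient, which is a simplifying assumption.

```python
from collections import defaultdict

def match_pairs(barcodes):
    """Return {patient_id: (tumour_barcode, normal_barcode)} for patients with both."""
    by_patient = defaultdict(dict)
    for bc in barcodes:
        patient = bc[:12]        # e.g. TCGA-XX-0001
        sample_type = bc[13:15]  # "01" primary tumour, "11" solid tissue normal
        if sample_type == "01":
            by_patient[patient].setdefault("tumour", bc)
        elif sample_type == "11":
            by_patient[patient].setdefault("normal", bc)
    return {p: (s["tumour"], s["normal"])
            for p, s in by_patient.items() if len(s) == 2}

# Made-up barcodes for illustration only.
barcodes = [
    "TCGA-XX-0001-01A-11R-0000-07",
    "TCGA-XX-0001-11A-11R-0000-07",
    "TCGA-XX-0002-01A-11R-0000-07",  # tumour only, no matched normal
    "TCGA-XX-0003-11A-11R-0000-07",  # normal only, no matched tumour
]
pairs = match_pairs(barcodes)
print(pairs)
```

With the pairs in hand, the patient ID can then go into the model as a blocking factor alongside the tumour/normal group, so that each tumour is compared with its own patient's normal tissue.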


Because you have far fewer normal samples than cases, you may get biased results. Either increase the number of normal samples or reduce the number of cases. Equal group sizes are always preferred.


The number of samples is not a concern here; the batch effect is. Your best option is to match tumor and healthy samples. Another option is to add the batch as a covariate; let's hope the tumor and healthy samples are not in entirely separate batches.
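
Before adding batch as a covariate, it is worth checking whether batch and group are confounded, since a batch that contains only tumor or only healthy samples cannot be used to separate the batch effect from the biology. A minimal Python sketch of that check follows; the batch labels, sample list, and function names are hypothetical, and how you define "batch" for your data (plate, center, shipment date) is up to you.

```python
from collections import Counter

def batch_by_group(samples):
    """Count samples per (batch, group) combination."""
    return Counter(samples)

def single_group_batches(samples):
    """Batches containing only one group; batch and group are confounded there."""
    groups_per_batch = {}
    for batch, group in samples:
        groups_per_batch.setdefault(batch, set()).add(group)
    return sorted(b for b, groups in groups_per_batch.items() if len(groups) < 2)

# Hypothetical (batch, group) labels, one tuple per sample.
samples = [
    ("B1", "tumor"), ("B1", "tumor"), ("B1", "healthy"),
    ("B2", "tumor"), ("B2", "healthy"),
    ("B3", "tumor"), ("B3", "tumor"),  # no healthy samples in this batch
]

for (batch, group), n in sorted(batch_by_group(samples).items()):
    print(batch, group, n)
print("single-group batches:", single_group_batches(samples))
```

If most batches contain both groups, a batch term in the design is estimable; if tumor and healthy samples sit in entirely different batches, no covariate can rescue the comparison.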
