collapseReplicates in DESeq2 when the number of technical replicates per biosample is different
2.7 years ago
poecile.pal ▴ 50

Good afternoon,

I have a question about using collapseReplicates in DESeq2. As far as I understand, this function adds up the counts belonging to one biosample. That makes sense to me when the number of technical replicates per biosample is the same for all samples. What should I do if the number of technical replicates per biosample differs?

For example, SAMNXXXXX corresponds to SRRXXXXX1 (count = 5) and SRRXXXXX2 (count = 6), while SAMNYYYYY corresponds only to SRRYYYYY1 (count = 10). If I add up the counts for SAMNXXXXX (5 + 6 = 11) and then compare the sum with the count for SAMNYYYYY (10), I will wrongly conclude that the expression is higher in SAMNXXXXX.

Maybe I need to take the arithmetic mean or something else? It seems to me that the arithmetic mean is not very reasonable. For example, I have counts of 181 and 2 for different replicates of the same biosample.

Note: most samples are not affected. For example, in one particular dataset there are 89 biosamples without technical replicates and 5 biosamples with 2 technical replicates each.

Thanks!

Best regards, Poecile

collapseReplicates DESeq2 RNASeq expression DEG • 1.7k views
2.7 years ago

Collapsing technical replicates takes place before DESeq2's internal normalization, so you don't need to worry about the arithmetic. Differences in sequencing depth are handled exactly as with non-collapsed samples: the counts are divided by a size factor calculated with the median-of-ratios method. Therefore, it is fine for biosamples to have different numbers of technical replicates.
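To make the arithmetic concrete, here is a minimal base-R sketch of the idea, not DESeq2's actual code path. The toy counts, gene names, and run names are made up for illustration (gene1 reuses the 5 + 6 vs. 10 example from the question), and the size-factor calculation is a simplified median-of-ratios that assumes no zero counts.

```r
# Toy counts (hypothetical): 4 genes x 3 runs.
# Runs X1 and X2 are technical replicates of biosample X; Y1 is the only run of Y.
counts <- matrix(
  c( 5,  6,  10,
    50, 60, 100,
    20, 24,  40,
    30, 36,  30),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("gene", 1:4), c("X1", "X2", "Y1"))
)
biosample <- c("X", "X", "Y")

# Collapse: sum the runs that belong to the same biosample.
collapsed <- t(rowsum(t(counts), group = biosample))

# Simplified median-of-ratios size factors (assumes every count is > 0):
# per-gene log geometric mean across samples, then the per-sample median log ratio.
log_geo_mean <- rowMeans(log(collapsed))
size_factors <- apply(collapsed, 2, function(col) exp(median(log(col) - log_geo_mean)))

# Normalized counts: each column divided by its size factor.
normalized <- sweep(collapsed, 2, size_factors, "/")

print(collapsed)
print(size_factors)
print(normalized)
```

In this toy example biosample X ends up with a larger size factor than Y because it has roughly 10% more total counts after summing. After dividing by the size factors, gene1 (11 vs. 10 raw) is equal in both biosamples, while gene4, which really is higher in X, keeps its 2.2-fold difference.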


Thank you very much! Could you please confirm that I am doing the steps in the correct order?

chopped.txi_ex <- tximport(files_ex, type = "salmon",
                           ignoreAfterBar = TRUE, ignoreTxVersion = TRUE,
                           tx2gene = chopped.tx2gene[, c("tx_id", "ensgene")],
                           countsFromAbundance = "lengthScaledTPM")
ddsg_ex <- DESeqDataSetFromTximport(chopped.txi_ex, colData = metag_ex, design = ~ diag)
ddsg_ex <- collapseReplicates(ddsg_ex,
                              groupby = ddsg_ex$experimentg, run = ddsg_ex$rung)
ddsg_ex <- DESeq(ddsg_ex)
contrast_oeg_ex <- c("diag", "cancer", "normal")
res_tableOEg_ex <- DESeq2::results(ddsg_ex, contrast = contrast_oeg_ex, alpha = 0.05)


etc.


Normalization happens after collapseReplicates, during this step:

ddsg_ex <- DESeq(ddsg_ex)


So you are all good!
