CPM or TPM in GSEA analysis
3
0
3.0 years ago
francesca3 ▴ 80

Dear all, I have a question. Is it better to perform GSEA using CPM or TPM values? I have previously run this analysis with TPM. This time I have replicate-level values only for CPM, while for TPM I have only the mean for each condition. Which is the better approach?

Thanks, Francesca

RNA-Seq GSEA • 4.5k views
0

Which tool are you using? When I use FGSEA, I pass in a differential-expression summary statistic, rather than values from individual samples. For a given gene, the statistic I use is (-log10 of unadjusted p-value) * (sign of differential expression coefficient).
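A minimal sketch of the ranking statistic described above, in plain Python. The gene names, p-values, and coefficients are made up for illustration, and the `1e-300` floor on p-values is an assumption added to avoid taking the log of zero; it is not part of fgsea itself.

```python
import math

def signed_log_p(p_value, coefficient, floor=1e-300):
    """(-log10 of unadjusted p-value) * (sign of the DE coefficient)."""
    # Clamp tiny p-values so log10 never sees 0 (the floor is an assumption).
    p = max(p_value, floor)
    sign = (coefficient > 0) - (coefficient < 0)
    return -math.log10(p) * sign

# Hypothetical differential-expression results:
# gene -> (unadjusted p-value, coefficient such as log fold change)
de_results = {
    "GENE_A": (1e-6, 2.1),
    "GENE_B": (0.04, -0.8),
    "GENE_C": (0.5, 0.1),
    "GENE_D": (1e-4, -1.5),
}

# Rank genes from most significantly up to most significantly down.
ranked = sorted(
    ((gene, signed_log_p(p, coef)) for gene, (p, coef) in de_results.items()),
    key=lambda item: item[1],
    reverse=True,
)

for gene, score in ranked:
    print(f"{gene}\t{score:.3f}")
```

A ranked list like this is what a preranked enrichment tool would take as input, so per-sample CPM or TPM values are not needed at that stage.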

0

As others have mentioned, TPM is generally better. However, the choice may be tool-specific.

For example, you cannot use TPM values as input for differential gene expression analysis with edgeR, which expects raw counts.

7
9 weeks ago
dare_devil ★ 1.7k

As input to the Broad Institute's GSEA program, one should use any type of expression data that is properly normalised such that cross-sample differences can be faithfully gauged.

That means using any of these:

1. normalised RNA-seq counts via DESeq2's 'geometric' (median-of-ratios) normalisation, edgeR's TMM method, etc.
2. normalised + transformed RNA-seq expression levels, such as variance-stabilised (vst) or regularised log (rlog) expression levels from DESeq2, or log2 CPMs from edgeR
3. normalised microarray data via RMA, GC-RMA, MAS5, neqc, etc

And one should not use raw counts or any of these types of expression levels: FPKM, RPKM, TPM etc.
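To make "normalised such that cross-sample differences can be gauged" more concrete, here is a rough sketch of the median-of-ratios idea behind DESeq2-style size factors. It is a simplified illustration in plain Python, not DESeq2's actual implementation; the function name `size_factors` and the toy counts are hypothetical.

```python
import math
from statistics import median

def size_factors(counts):
    """counts: dict of gene -> list of raw counts, one entry per sample."""
    n_samples = len(next(iter(counts.values())))
    # Log geometric mean per gene, using only genes with no zero counts.
    log_geo_means = {
        gene: sum(math.log(c) for c in row) / n_samples
        for gene, row in counts.items()
        if all(c > 0 for c in row)
    }
    factors = []
    for j in range(n_samples):
        # Ratio of each gene's count to its geometric mean, on the log scale.
        log_ratios = [
            math.log(counts[gene][j]) - lgm
            for gene, lgm in log_geo_means.items()
        ]
        factors.append(math.exp(median(log_ratios)))
    return factors

# Toy data: sample 2 was sequenced twice as deeply as sample 1.
raw = {
    "g1": [10, 20],
    "g2": [100, 200],
    "g3": [50, 100],
    "g4": [0, 5],  # has a zero, so it is ignored when estimating factors
}

factors = size_factors(raw)
normalised = {
    gene: [c / f for c, f in zip(row, factors)] for gene, row in raw.items()
}
print(factors)
print(normalised["g1"])
```

After dividing by the size factors, the depth difference between the two samples is removed, which is the property a cross-sample comparison relies on.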

0

Why would FPKM/RPKM/TPM not be acceptable? I would find those superior to using CPM.

0

Try to understand the difference between across-sample (between-sample) comparisons and within-sample comparisons.

0

This is incorrect: GSEA uses gene expression ranks, so you should use normalized counts (e.g. TPM).

0

Simply calling a comment incorrect does not make your statement valid. Provide evidence to back up your claim. How does TPM deal with cross-sample differences?

2
3.0 years ago

For most intents and purposes, TPM is superior to CPM (at least for short-read RNA-seq, where the length of a transcript affects the number of reads it produces). However, CPM should give results comparable to TPM. In your case, I would try both and compare the results.

1
3.0 years ago

Use TPM. CPM values are not normalized for gene length, meaning longer genes will appear more highly expressed when using CPM (and vice versa).
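The length effect can be seen in a small worked example. This is only a sketch with two made-up genes in a single sample; the gene names, counts, and lengths are hypothetical, and the functions follow the standard CPM and TPM definitions.

```python
def cpm(counts):
    """Counts per million: scale by library size only."""
    total = sum(counts.values())
    return {gene: c / total * 1e6 for gene, c in counts.items()}

def tpm(counts, lengths_bp):
    """Transcripts per million: length-normalise first, then scale to 1e6."""
    # Reads per kilobase of gene length.
    rpk = {gene: counts[gene] / (lengths_bp[gene] / 1000) for gene in counts}
    scale = sum(rpk.values())
    return {gene: value / scale * 1e6 for gene, value in rpk.items()}

# Two genes with the same number of transcripts; LONG is 10x longer,
# so it yields 10x as many reads.
counts = {"SHORT": 100, "LONG": 1000}
lengths = {"SHORT": 1000, "LONG": 10000}

cpm_values = cpm(counts)
tpm_values = tpm(counts, lengths)

print(cpm_values)  # LONG looks ten times higher than SHORT
print(tpm_values)  # both genes come out equal
```

Note that this compares genes within one sample. Because a gene's length is the same in every sample, the length term largely cancels when the same gene is compared across samples, which is why the two measures can give similar results in that setting.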