Question: RPKM, FPKM, TPM, TMM... using the right value for different tasks
I would like to be sure I am using the right metrics for 3 different tasks I am performing on mRNA-seq data from 50 different samples (i.e. 50 different cell lines). I have read a lot about this but I am still in doubt.

Here is the scenario: 50 different cell lines, for each of which I have mRNA-seq data (a single measurement due to high costs), and I have to complete the following 3 tasks:


I want to evaluate the abundance distribution of the genes within each sample, meaning that I want to see whether in certain cell lines the genes are all expressed at more or less the same level, or whether in some samples there is a set of genes that is, for example, 10-20 times more abundant than the others. To do this I can use RPKM or TPM, with the latter probably more appropriate.


I want to compare the expression levels of all the genes across the different samples, basically doing a sort of PCA or hierarchical clustering on the data. As I will have to compare the 50 different data sets, I would need to use RPKM or TMM (TMM-normalized FPKM). Is the latter the most appropriate one?


I would also like to compare the mRNA levels of selected genes against the protein abundance of the corresponding gene products. For this task I will have to compare the protein iBAQ (some sort of absolute protein abundance) against an mRNA value. As a corresponding 'absolute' mRNA value does not exist (as far as I understand), I guess that here TPM, RPKM, or TMM are all valid.
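For reference, here is a minimal sketch of how RPKM and TPM are usually defined, computed from raw read counts and gene lengths for a single sample. The function names and the toy numbers are made up for illustration; real pipelines typically use effective lengths, and TMM is not shown because it needs a dedicated tool such as edgeR.

```python
def rpkm(counts, lengths_bp):
    """Reads per kilobase of gene per million mapped reads.

    Scales by library size first, then by gene length.
    """
    total_reads = sum(counts)
    return [c * 1e9 / (total_reads * length)
            for c, length in zip(counts, lengths_bp)]


def tpm(counts, lengths_bp):
    """Transcripts per million.

    Scales by gene length first, then rescales so the sample sums to 1e6.
    """
    rates = [c / length for c, length in zip(counts, lengths_bp)]
    total_rate = sum(rates)
    return [r * 1e6 / total_rate for r in rates]


# Toy data (hypothetical): three genes in one sample.
counts = [500, 1000, 500]        # raw read counts
lengths_bp = [1000, 4000, 500]   # gene lengths in base pairs

rpkm_values = rpkm(counts, lengths_bp)
tpm_values = tpm(counts, lengths_bp)

print("RPKM:", [round(v, 2) for v in rpkm_values])
print("TPM: ", [round(v, 2) for v in tpm_values])
print("sum of TPM: ", round(sum(tpm_values), 2))
print("sum of RPKM:", round(sum(rpkm_values), 2))
```

TPM values always sum to one million within a sample, whereas the sum of RPKM values depends on the sample's length and count distribution, which is the usual argument for preferring TPM when looking at relative abundances within a sample.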

Did I get it right?

rna-seq mrna • 1.3k views
Why don't you use HTSeq? It gives you raw read counts. Then move on to the DESeq or edgeR pipeline; please refer to this

