How to calculate TPM?
7.5 years ago
moransharo ▴ 30

Hello, I'm new to RNA-seq and normalization. I would like to normalize raw RNA-seq data to TPM. I'm familiar with the logic behind it (thanks to this blog post: http://www.rna-seqblog.com/rpkm-fpkm-and-tpm-clearly-explained/). However, I can't find a way to calculate it with R packages. I read that RSEM may be helpful, but I'm really not sure how to install and use it. It would be highly appreciated if someone could help. Thank you!

PKM RNA-Seq R • 18k views
7.5 years ago
lh3 33k

Warning: I don't do RNA-seq often; my comments below may be inaccurate.

I was looking at an RNA-seq data set where only FPKM was provided. As I needed raw read counts for edgeR-like analyses, I did a little research on how FPKM and the related TPM are calculated, and I also consulted Rob Patro for help. In the end, it seems to me that there are multiple subtly different ways to compute FPKM and TPM, so values computed by different tools are often not comparable.

I think the most precise description of FPKM and related units is here. Importantly, to derive them from raw read counts, we need to compute the effective transcript length (the \tilde{l} in the link above). The exact approach to computing this value is tool-dependent. Rob commented that:

A different approach [to computing effective length] (which is used in Salmon and kallisto) is to define the effective length of a transcript as L - \mu_{L}, where \mu_{L} is the mean of the fragment length distribution for all fragments of length <= L.

and mentioned that "the effective length can also be modified to account for sampling biases". There is not a single way to compute effective length and thus not a single way to compute FPKM/TPM.

As a side note, GTEx provided both raw counts and FPKM. I was trying to convert from counts to FPKM. However, it seems that GTEx is using an effective length longer than the transcript length, which would be impossible with Rob's formula or the formula in the link above...

In all, your question is not only about how to compute FPKM/TPM, but is also related to which flavor of FPKM/TPM to compute. If I were given such a task, I would take Rob's formula to compute effective length and the TPM formula in the linked webpage. Note that you need to know the insert size/fragment length distribution of your library in order to compute TPM accurately.
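The approach described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not what any particular tool does: the function names, the toy fragment length distribution, and the choice to give transcripts with a non-positive effective length a TPM of 0 are all hypothetical. The effective length follows the L - \mu_{L} formula quoted from Rob, and TPM is the per-base read rate rescaled to sum to one million.

```python
def effective_length(length, frag_len_dist):
    """Return L - mu_L, with mu_L the mean fragment length over fragments <= L.

    frag_len_dist maps fragment length -> weight (probability or count).
    """
    usable = {f: w for f, w in frag_len_dist.items() if f <= length}
    mass = sum(usable.values())
    if mass == 0:
        # Assumption: no fragment fits in this transcript, so it cannot be quantified.
        return 0.0
    mu = sum(f * w for f, w in usable.items()) / mass
    return length - mu


def tpm(counts, eff_lengths):
    """Convert per-transcript read counts to TPM using effective lengths."""
    # Reads per effective base; non-positive effective lengths contribute 0.
    rates = [c / l if l > 0 else 0.0 for c, l in zip(counts, eff_lengths)]
    total = sum(rates)
    if total == 0:
        return [0.0] * len(rates)
    # Rescale so that the values sum to one million.
    return [r / total * 1e6 for r in rates]


# Toy data (hypothetical): fragment length distribution with mean 200.
frag_dist = {150: 0.25, 200: 0.5, 250: 0.25}
lengths = [1200, 700, 220]
counts = [100, 50, 0]

eff = [effective_length(L, frag_dist) for L in lengths]
tpm_values = tpm(counts, eff)
print(eff)
print(tpm_values)
```

Note that the short 220 bp transcript gets a different \mu_{L} than the long ones, because only fragments that fit inside it enter the mean. That is why the fragment length distribution matters for the result.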

7.5 years ago

This post compares the different normalization units and includes a demo R script that converts read counts to TPM and FPKM without installing RSEM or edgeR: https://haroldpimentel.wordpress.com/2014/05/08/what-the-fpkm-a-review-rna-seq-expression-units/ By the way, some programs like eXpress calculate TPM and FPKM while counting the mapped reads, so you won't need any further conversion.
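For readers who prefer Python, the count-to-FPKM conversion and its relation to TPM can be sketched as follows. This is not the script from the linked post, just a minimal illustration: the function names are hypothetical, the effective lengths are assumed to be already computed, and the total number of mapped fragments is assumed to equal the sum of the counts unless given explicitly.

```python
def fpkm(counts, eff_lengths, total_fragments=None):
    """Fragments per kilobase of effective length per million mapped fragments."""
    if total_fragments is None:
        # Assumption: every mapped fragment is represented in `counts`.
        total_fragments = sum(counts)
    if total_fragments == 0:
        return [0.0] * len(counts)
    return [
        c * 1e9 / (l * total_fragments) if l > 0 else 0.0
        for c, l in zip(counts, eff_lengths)
    ]


def fpkm_to_tpm(fpkm_values):
    """Rescale FPKM values so that they sum to one million."""
    total = sum(fpkm_values)
    if total == 0:
        return [0.0] * len(fpkm_values)
    return [v / total * 1e6 for v in fpkm_values]


# Toy data (hypothetical): counts proportional to effective length,
# so all three transcripts end up with the same abundance.
counts = [100, 50, 25]
eff_lengths = [1000.0, 500.0, 250.0]

fpkm_values = fpkm(counts, eff_lengths)
tpm_values = fpkm_to_tpm(fpkm_values)
print(fpkm_values)
print(tpm_values)
```

Because TPM is just FPKM renormalized within a sample, TPM values always sum to one million, while the sum of FPKM values varies from sample to sample.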

There are other answers in this old thread: Using transcripts per million (TPM)

7.5 years ago

Depending on why you need these TPM values, a solution would be to use commonly accepted tools like DESeq2 which do a more sophisticated normalization, and use these normalized counts instead of TPM. I see Israel Barrantes already suggested something similar.

7.5 years ago
igor 13k

I am not sure what you mean by "raw". The other two answers are excellent, but assume you already have the counts, which means the data is partially processed.

If by "raw" you mean you have FASTQs, the easiest way to get TPMs is probably to use kallisto, which needs only an index build followed by a single quantification command: https://pachterlab.github.io/kallisto/manual . However, it's not much easier than RSEM, so if you had trouble using that, I am not sure what to recommend.

I don't think there is currently a method to get from FASTQs to TPMs entirely in R.

7.5 years ago
Farbod ★ 3.4k

Dear moransharo, Hi

In this Trinity explanation of transcript quantification, you can find the TPM in the RSEM output and the related script.

Hope that helps

~ Best
