Convert FPKM to TPM in R
8 weeks ago
JACKY ▴ 60

I'm conducting a meta-analysis over several datasets. I want to combine those datasets and run some machine learning algorithms to predict a target response. Some of those datasets are raw counts, which I can easily convert to TPM with the following code:

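rpkm <- apply(X = subset(counts_data),
              MARGIN = 2,
              FUN = function(x) {
                # reads per kilobase of gene length per million mapped reads
                10^9 * x / genelength / sum(as.numeric(x))
              })

# rescale each sample (column) so that its values sum to one million
TPM <- apply(rpkm, 2, function(x) x / sum(as.numeric(x)) * 10^6)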
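
As a sanity check on the two-step route above, here is a minimal sketch on made-up toy data. The matrix values and gene lengths are invented for illustration; `counts_data` and `genelength` just reuse the names from the snippet. It suggests that going through RPKM gives the same TPM as length-normalising the counts directly and rescaling each sample to one million.

```r
# Toy data, invented for illustration: 3 genes x 2 samples
counts_data <- matrix(c(10, 20, 30,
                        5, 50, 45),
                      ncol = 2,
                      dimnames = list(c("g1", "g2", "g3"), c("s1", "s2")))
genelength <- c(1000, 2000, 3000)  # gene lengths in bp, same order as the rows

# Two-step route: counts -> RPKM -> TPM
rpkm <- apply(counts_data, 2, function(x) 10^9 * x / genelength / sum(x))
tpm_two_step <- apply(rpkm, 2, function(x) x / sum(x) * 10^6)

# Direct route: divide by gene length, then rescale each column to one million
rate <- counts_data / genelength  # genelength is recycled down each column
tpm_direct <- apply(rate, 2, function(x) x / sum(x) * 10^6)

all.equal(tpm_two_step, tpm_direct)  # compares the two routes
colSums(tpm_direct)                  # per-sample totals
```

The library-size term cancels in the second step, which is why only the per-sample sum matters when rescaling RPKM to TPM.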

And some datasets provide RPKM data, which I can also convert to TPM like this:

TPM <- apply(RPKM, 2, function(x) x / sum(as.numeric(x)) * 10^6)

Some datasets, however, only provide FPKM data. This is problematic: I need all datasets to be TPM-normalized, and I'm not familiar with converting FPKM to TPM.

Is it possible to convert FPKM values to TPM? I found this approach: TPM = FPKM * X, where X = 1e6 / [sum of all FPKM values in a sample].

I'm not sure whether I'm allowed to do this, and I don't want to use it and get misleading results. What do you guys think? If I can use it, what is the code in R?

Note: the datasets that provide RPKM or FPKM have no raw data or counts data.

meta-analysis TPM r normalization • 270 views
8 weeks ago

I think your TPM-from-FPKM calculation is correct. See the section "Relationship between TPM and FPKM" in this helpful blog post by Harold Pimentel, which recaps a manuscript by his PhD advisor, Lior Pachter.
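
For anyone who wants the algebra behind that relationship, here is a short sketch. The notation is an assumption made here, not taken from the thread: $X_i$ is the number of fragments assigned to gene $i$, $\ell_i$ is its length in bases, and $N = \sum_j X_j$ is the total number of assigned fragments in the sample.

```latex
\begin{align*}
\mathrm{FPKM}_i &= \frac{10^{9}\, X_i}{\ell_i\, N},
\qquad
\mathrm{TPM}_i = 10^{6} \cdot \frac{X_i / \ell_i}{\sum_j X_j / \ell_j} \\[4pt]
\sum_j \mathrm{FPKM}_j &= \frac{10^{9}}{N} \sum_j \frac{X_j}{\ell_j} \\[4pt]
\frac{\mathrm{FPKM}_i}{\sum_j \mathrm{FPKM}_j}
  &= \frac{X_i / \ell_i}{\sum_j X_j / \ell_j}
\quad\Longrightarrow\quad
\mathrm{TPM}_i = 10^{6} \cdot \frac{\mathrm{FPKM}_i}{\sum_j \mathrm{FPKM}_j}
\end{align*}
```

The factor $10^{9}/N$ cancels, so the conversion needs neither raw counts nor gene lengths. One caveat: the sum runs over every gene in the sample, so if a published FPKM table was filtered beforehand, the result is rescaled to the remaining genes only.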


Great! Thank you! If anyone in the future needs a solution to this question, here is the code to do it:

library(tidyverse)
fpkm_data %>% mutate(across(everything(), ~ (. / sum(.)) * 10^6))
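
The one-liner above assumes that every column of `fpkm_data` is a numeric sample column, with gene IDs kept as row names rather than in a column of their own. For reference, a base-R version might look like the sketch below. `fpkm_to_tpm` is a hypothetical helper name, not an existing package function, and the toy values are invented.

```r
# Hypothetical helper (not from any package): rescale each sample (column)
# of an FPKM table so that its values sum to one million.
fpkm_to_tpm <- function(fpkm) {
  fpkm <- as.matrix(fpkm)
  stopifnot(is.numeric(fpkm))  # fails early if a non-numeric column slipped in
  sweep(fpkm, 2, colSums(fpkm), "/") * 1e6
}

# Toy FPKM table, invented for illustration: genes in rows, samples in columns
fpkm_data <- data.frame(s1 = c(1.5, 0, 8.5),
                        s2 = c(20, 30, 50),
                        row.names = c("g1", "g2", "g3"))

tpm <- fpkm_to_tpm(fpkm_data)
colSums(tpm)  # per-sample totals after rescaling
```

Checking `colSums()` on the result is a cheap way to confirm that each sample was rescaled independently.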
