Convert FPKM to TPM in R
8 weeks ago
JACKY ▴ 60

I'm conducting a meta-analysis over several datasets. I want to combine those datasets and run some machine learning algorithms to predict a target response. Some of those datasets are raw counts, which I can easily convert to TPM with the following code:

# genelength: numeric vector of gene lengths (bp), in the same row order as counts_data
rpkm <- apply(X = counts_data,
              MARGIN = 2,
              FUN = function(x) {
                10^9 * x / genelength / sum(as.numeric(x))
              })

TPM <- apply(rpkm, 2, function(x) x / sum(as.numeric(x)) * 10^6) %>% as.data.frame()


And some datasets provide RPKM data, which I can also convert to TPM like this:

TPM <- apply(RPKM, 2, function(x) x / sum(as.numeric(x)) * 10^6) %>% as.data.frame()


Some datasets, however, only provide FPKM data. This is problematic because I need all datasets to be TPM normalized, and I'm not familiar with converting FPKM to TPM.

Is it possible to convert FPKM reads to TPM? I found this approach: TPM = FPKM*X where X = 1e6/[sum of all FPKM of a sample].

I'm not sure if I'm allowed to do this, and I don't want to use it and get misleading results. What do you think? If I can use it, what is the R code?

Note: the datasets that provide RPKM or FPKM have no raw data or counts data.

meta-analysis TPM r normalization • 270 views
8 weeks ago

I think your calculation of TPM from FPKM is correct. See the section "Relationship between TPM and FPKM" in this helpful blog post by Harold Pimentel, which draws on a manuscript by his PhD advisor, Lior Pachter.
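
For anyone who wants to see why the formula holds, here is a short sketch of the algebra. The notation is my own and is not taken from the blog post: $X_i$ is the number of fragments assigned to gene $i$, $\ell_i$ is its length in bases, and $N = \sum_j X_j$ is the total number of fragments in the sample.

```latex
\begin{align*}
\mathrm{FPKM}_i &= \frac{10^{9}\, X_i}{\ell_i\, N},
\qquad
\mathrm{TPM}_i = 10^{6}\,\frac{X_i/\ell_i}{\sum_j X_j/\ell_j} \\[4pt]
\sum_j \mathrm{FPKM}_j &= \frac{10^{9}}{N}\sum_j \frac{X_j}{\ell_j} \\[4pt]
\frac{\mathrm{FPKM}_i}{\sum_j \mathrm{FPKM}_j}
  &= \frac{X_i/\ell_i}{\sum_j X_j/\ell_j}
\quad\Longrightarrow\quad
\mathrm{TPM}_i = 10^{6}\,\frac{\mathrm{FPKM}_i}{\sum_j \mathrm{FPKM}_j}
\end{align*}
```

The factor $10^{9}/N$ cancels, which is why no counts or gene lengths are needed. Note that the sum runs over every gene in the sample, so if a dataset was filtered before the FPKM values were published, the result is TPM relative to the remaining genes only.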


Great! Thank you! If anyone in the future needs a solution for this question, here is the code to do this:

library(tidyverse)
tpm_data <- fpkm_data %>% mutate(across(everything(), ~ .x / sum(.x) * 1e6))  # assumes every column is a sample
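
If you would rather avoid the tidyverse dependency, the same rescaling can probably be done in base R. This is only a minimal sketch: `fpkm_to_tpm` and `fpkm_toy` are hypothetical names I made up, the toy values are invented, and it assumes genes are in rows, samples are in columns, and every column is numeric.

```r
# Hypothetical helper: rescale each sample (column) of an FPKM table to TPM.
fpkm_to_tpm <- function(fpkm) {
  fpkm <- as.matrix(fpkm)
  stopifnot(is.numeric(fpkm))
  # Divide every column by its own total, then scale to one million.
  sweep(fpkm, 2, colSums(fpkm, na.rm = TRUE), "/") * 1e6
}

# Toy FPKM values, invented for illustration only.
fpkm_toy <- matrix(
  c(10, 20, 70,   # sample1
     5,  5, 40),  # sample2
  nrow = 3,
  dimnames = list(c("geneA", "geneB", "geneC"), c("sample1", "sample2"))
)

tpm_toy <- fpkm_to_tpm(fpkm_toy)

# Sanity check: each sample should now sum to one million.
stopifnot(isTRUE(all.equal(unname(colSums(tpm_toy)), c(1e6, 1e6))))
```

Checking that the column sums equal one million is a cheap way to confirm the conversion before merging datasets.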