Question

Salmon tximport get TPM

2

4.7 years ago

bharata1803 ▴ 560

Hello,

The Salmon output has a TPM column, but it is at the transcript level. I want to use the tximport library to get gene-level TPM values.

per_gene <- tximport(salmon_file_list, type = "salmon", tx2gene = trs2gene)

I ran that code and got this:

names(per_gene)
[1] "abundance"           "counts"              "length"              "countsFromAbundance"

There is no TPM entry, but I read that there is a countsFromAbundance parameter with the options scaledTPM and lengthScaledTPM.

Can this parameter be used to get TPM? Which values (scaledTPM or lengthScaledTPM) are better if I want to compare TPM across different experiments (i.e. independent RNA-seq experiments on the same cell type)? How can I get them from the result? Does per_gene["countsFromAbundance"] store the values?

Thank you

RNA-Seq Salmon • 6.9k views

1

I see. Thank you for your feedback. I will use rlog or VST because that is what I usually use. I had totally forgotten about them and just thought I could maybe use TPM directly.

4.7 years ago by bharata1803 ▴ 560

score 3 · Answer 1 · 2019-08-29

None of them. TPM is not a robust measure for inter-sample comparison and was never developed to be one. You are better off comparing normalized counts from e.g. DESeq2 or edgeR, or using data transformations such as vst or rlog. Please use the search function, as this has been discussed many times before. If you insist on TPM, aggregate counts to the gene level with tximport and then calculate TPM from the gene length information produced by tximport, as described at https://support.bioconductor.org/p/91218/
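
For anyone who does want to go the TPM route, here is a minimal sketch of the usual TPM arithmetic in base R. It is not taken from the linked thread. The two toy matrices are made-up stand-ins for per_gene$counts and per_gene$length, and the names counts, lengths, rate and tpm are only illustrative.

```r
# Toy gene-level matrices standing in for per_gene$counts and per_gene$length
# (made-up numbers; rows are genes, columns are samples).
counts <- matrix(c(100, 200, 300,
                    50, 400, 150),
                 nrow = 3,
                 dimnames = list(c("geneA", "geneB", "geneC"), c("s1", "s2")))
lengths <- matrix(c(1000, 2000, 1500,
                    1000, 2000, 1500),
                  nrow = 3,
                  dimnames = dimnames(counts))

# Step 1: counts per kilobase of gene length.
rate <- counts / (lengths / 1000)

# Step 2: rescale every sample (column) so that it sums to one million.
tpm <- sweep(rate, 2, colSums(rate), "/") * 1e6

colSums(tpm)  # each sample sums to 1e6
```

With a real tximport result you would plug in per_gene$counts and per_gene$length instead of the toy matrices. The caveat above still applies: the per-sample rescaling to one million is exactly why these values are not comparable across libraries without further normalisation.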

score 1 · Answer 2 · 2019-08-29

Apart from agreeing with ATpoint about the need for inter-library normalisation (read more about that here), I just want to point out that the TPM values are stored in the "abundance" entry of the "per_gene" list. With regard to countsFromAbundance: using it is recommended, but the exact setting depends a bit on what you want to use the counts for. I'd recommend "scaledTPM", as those counts are the most universally usable, but you can read figure 1 of this article.
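
To make the layout of the result concrete, here is a small sketch. The list below is a hand-built stand-in with made-up numbers that only mimics the four entries shown in the question, so it runs without Salmon files; the commented-out line shows how the real call might look with countsFromAbundance set, assuming the same salmon_file_list and trs2gene objects as in the question.

```r
# Real call (not run here, needs the Salmon output files):
# per_gene <- tximport(salmon_file_list, type = "salmon", tx2gene = trs2gene,
#                      countsFromAbundance = "scaledTPM")

# Hand-built stand-in that mimics the structure of the tximport result.
genes_by_samples <- list(c("geneA", "geneB"), c("s1", "s2"))
per_gene <- list(
  abundance = matrix(c(250000, 750000, 400000, 600000), nrow = 2,
                     dimnames = genes_by_samples),
  counts    = matrix(c(120, 480, 300, 310), nrow = 2,
                     dimnames = genes_by_samples),
  length    = matrix(c(1000, 2000, 1000, 2000), nrow = 2,
                     dimnames = genes_by_samples),
  countsFromAbundance = "scaledTPM"
)

# Gene-level TPM is the "abundance" matrix (genes x samples).
gene_tpm <- per_gene$abundance

# This entry only records which setting was used; it is a string, not a matrix.
per_gene$countsFromAbundance
```

So per_gene["countsFromAbundance"] does not hold any values. The setting changes how the "counts" matrix is computed, while the TPM values stay in "abundance".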