Raw counts to TPM in R
6.2 years ago
firestar ★ 1.6k

Can someone verify if this R code for converting raw counts to TPM is correct?

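#' @title Compute TPM for a read count matrix
#' @param dfr A numeric data.frame of read counts with genes as rows and samples as columns.
#' @param len A numeric vector of gene CDS lengths, one per row of dfr.
#' @return An object with the same dimensions as dfr in which every column sums to 1e6.
r_tpm <- function(dfr,len)
{
  # Divide each gene (row) by its length; the 10^4 constant cancels in the next step.
  dfr1 <- sweep(dfr,MARGIN=1,(len/10^4),`/`)
  # Per-sample scaling factor so that each column sums to one million.
  scf <- colSums(dfr1)/(10^6)
  return(sweep(dfr1,2,scf,`/`))
}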
RNA-Seq R • 42k views

Do you have a reason for suspecting it's not? Have you tested it? What do your tests reveal?

6.2 years ago
ATpoint 85k

Use this code snippet from Michael Love (DESeq2 developer):

x <- counts.mat / gene.length
tpm.mat <- t( t(x) * 1e6 / colSums(x) )
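To see why the two lines work, here is a small worked sketch. The toy counts and gene lengths are made up for illustration, and it assumes `counts.mat` is a numeric matrix with genes in rows and `gene.length` has one entry per row. Both steps rely on R recycling a vector down the columns of a matrix, which is why the double transpose is needed for the per-sample scaling.

```r
# Toy data: 3 genes x 2 samples (made-up numbers for illustration only).
counts.mat <- matrix(c(10, 20, 30,
                       5,  5, 90),
                     nrow = 3,
                     dimnames = list(c("g1", "g2", "g3"), c("s1", "s2")))
gene.length <- c(1000, 2000, 3000)  # one length per row (gene), in bp

# Step 1: length-normalise. R recycles gene.length down each column,
# so every row is divided by its own gene length.
x <- counts.mat / gene.length

# Step 2: scale each sample so it sums to one million.
# t(x) puts samples in rows, so colSums(x) is recycled once per sample;
# the outer t() restores the genes-by-samples layout.
tpm.mat <- t(t(x) * 1e6 / colSums(x))

print(tpm.mat)
print(colSums(tpm.mat))  # each sample should total 1e6
```

The units of `gene.length` (bp or kb) do not matter here, because any constant factor cancels when each column is rescaled to sum to one million.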

Cool! I get the same result.

# michael's version
# https://support.bioconductor.org/p/91218/

tpm3 <- function(counts,len) {
  x <- counts/len
  return(t(t(x)*1e6/colSums(x)))
}

Michael's version is much faster despite all the transposes.
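For anyone who wants to reproduce the comparison, a rough sketch follows. The simulated matrix size, the Poisson counts and the repetition count are arbitrary assumptions, and a plain matrix is used rather than a data.frame, so absolute timings will differ from machine to machine and from the original test.

```r
# sweep-based version from the question
r_tpm <- function(dfr, len) {
  dfr1 <- sweep(dfr, MARGIN = 1, (len / 10^4), `/`)
  scf <- colSums(dfr1) / (10^6)
  sweep(dfr1, 2, scf, `/`)
}

# michael's version
tpm3 <- function(counts, len) {
  x <- counts / len
  t(t(x) * 1e6 / colSums(x))
}

# Simulated input (hypothetical sizes, chosen only for this sketch).
set.seed(1)
n_genes <- 20000
n_samples <- 50
counts <- matrix(rpois(n_genes * n_samples, lambda = 50), nrow = n_genes)
len <- sample(500:5000, n_genes, replace = TRUE)

# Both functions should agree up to floating-point tolerance.
same <- isTRUE(all.equal(r_tpm(counts, len), tpm3(counts, len)))
print(same)

# Crude timing; a dedicated benchmarking package would give tighter estimates.
print(system.time(for (i in 1:20) r_tpm(counts, len)))
print(system.time(for (i in 1:20) tpm3(counts, len)))
```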



If an answer was helpful, you should upvote it; if it resolved your question, you should mark it as accepted.