Raw counts to TPM in R
2.7 years ago
rmf ★ 1.1k

Can someone verify if this R code for converting raw counts to TPM is correct?

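#' @title Compute TPM for a read count matrix
#' @param dfr A numeric data.frame of read counts with samples (columns) and genes (rows).
#' @param len A numeric vector of gene CDS lengths, one per row of dfr.
#' @return TPM values with the same dimensions as dfr.
#'
r_tpm <- function(dfr, len)
{
  # Length-normalise each gene (row); the 10^4 constant cancels in the next step
  dfr1 <- sweep(dfr, MARGIN=1, (len/10^4), `/`)
  # Per-sample scaling factor: column total divided by one million
  scf <- colSums(dfr1)/(10^6)
  return(sweep(dfr1, 2, scf, `/`))
}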
RNA-Seq R • 17k views

Do you have a reason for suspecting it's not? Have you tested it? What do your tests reveal?
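One quick way to test it is a toy matrix small enough to check by hand. The sketch below copies r_tpm from the question; the counts and lengths are made-up values for illustration only. In sample A every gene has the same reads-per-base rate, so each should get a third of a million; in sample B the first gene has no reads and the other two share the million equally.

```r
# r_tpm as posted in the question
r_tpm <- function(dfr, len) {
  dfr1 <- sweep(dfr, MARGIN = 1, (len / 10^4), `/`)
  scf <- colSums(dfr1) / (10^6)
  sweep(dfr1, 2, scf, `/`)
}

# Hypothetical toy data: 3 genes (rows) x 2 samples (columns)
counts <- matrix(c(10, 20, 5,    # sample A
                    0, 40, 10),  # sample B
                 nrow = 3,
                 dimnames = list(c("g1", "g2", "g3"), c("A", "B")))
len <- c(1000, 2000, 500)

tpm <- r_tpm(counts, len)
print(tpm)

# Every sample (column) of a TPM matrix should sum to one million
print(colSums(tpm))
```

If the column sums are 1e6 and the hand-computed values match, the function is doing what TPM is supposed to do on this input.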

2.7 years ago
ATpoint 49k

Use this code snippet from Michael Love.


Cool! I get the same result.

# michael's version
# https://support.bioconductor.org/p/91218/

tpm3 <- function(counts,len) {
  x <- counts/len
  return(t(t(x)*1e6/colSums(x)))
}

Michael's version is much faster despite all the transposes.
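For anyone who wants to reproduce the comparison, here is a minimal sketch. Both functions are copied from this thread; the simulated counts, gene lengths, matrix size, and repeat count are assumptions chosen only for illustration. It uses a plain matrix, so timings on a data.frame or on real data may look different.

```r
# Both functions are copied from the thread
r_tpm <- function(dfr, len) {
  dfr1 <- sweep(dfr, MARGIN = 1, (len / 10^4), `/`)
  scf <- colSums(dfr1) / (10^6)
  sweep(dfr1, 2, scf, `/`)
}

tpm3 <- function(counts, len) {
  x <- counts / len
  t(t(x) * 1e6 / colSums(x))
}

# Simulated input (hypothetical sizes and distributions)
set.seed(1)
n_genes <- 2000
n_samples <- 20
counts <- matrix(rpois(n_genes * n_samples, lambda = 50),
                 nrow = n_genes,
                 dimnames = list(paste0("gene", seq_len(n_genes)),
                                 paste0("s", seq_len(n_samples))))
len <- sample(500:5000, n_genes, replace = TRUE)

# Check that the two versions agree numerically
a <- r_tpm(counts, len)
b <- tpm3(counts, len)
same <- isTRUE(all.equal(a, b, check.attributes = FALSE))
print(same)

# Rough timing over repeated calls; results depend on machine and input
t_sweep <- system.time(for (i in 1:20) r_tpm(counts, len))
t_love  <- system.time(for (i in 1:20) tpm3(counts, len))
print(c(r_tpm = t_sweep[["elapsed"]], tpm3 = t_love[["elapsed"]]))
```

The two versions agree because the 10^4 in r_tpm is a constant factor that cancels when each column is rescaled to sum to one million.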



If an answer was helpful you should upvote it, if the answer resolved your question you should mark it as accepted.
