Error in apply(counts, 2, function(x) rpkm(x, lengths)) : dim(X) must have a positive length
3.1 years ago
shail.nair05 ▴ 20

I am trying to convert featureCounts output (raw read counts per transcript) to TPM with the tpm_rpkm.R script (https://github.com/andysaurin/tpm_rpkm), but I get the following error: Error in apply(counts, 2, function(x) rpkm(x, lengths)) : dim(X) must have a positive length

Here is the script:

#! /usr/bin/env Rscript

# Author: Andy Saurin (--------------------)
#
# Simple RScript to calculate RPKMs and TPMs
# based on method for RPKM/TPM calculations shown in http://www.rna-seqblog.com/rpkm-fpkm-and-tpm-clearly-explained/
#
# The input file is the output of featureCounts
#

rpkm <- function(counts, lengths) {
  pm <- sum(counts) /1e6
  rpm <- counts/pm
  rpm/(lengths/1000)
}

tpm <- function(counts, lengths) {
  rpk <- counts/(lengths/1000)
  coef <- sum(rpk) / 1e6
  rpk/coef
}


## read table from featureCounts output
args <- commandArgs(T)

tag <- tools::file_path_sans_ext(args[1])


cat('Reading in featureCounts data...')
ftr.cnt <- read.table(args[1], sep="\t", header=T, quote="") #Important to disable default quote behaviour or else genes with apostrophes will be taken as strings
cat(' Done\n')

if ( ncol(ftr.cnt) < 7 ) { 
    cat(' The input file is not the raw output of featureCounts (number of columns > 6) \n')
    quit('no')
}

lengths = ftr.cnt[,6]

counts <- ftr.cnt[,7:ncol(ftr.cnt)]

cat('Performing RPKM calculations...')

rpkms <- apply(counts, 2, function(x) rpkm(x, lengths) )
ftr.rpkm <- cbind(ftr.cnt[,1:6], rpkms)

write.table(ftr.rpkm, file=paste0(tag, "_rpkm.txt"), sep="\t", row.names=FALSE, quote=FALSE)
cat(' Done.\n\tSaved as ')
cat ( paste0(tag, "_rpkm.txt", '\n') )

cat('Performing TPM calculations...')

tpms <- apply(counts, 2, function(x) tpm(x, lengths) )

ftr.tpm <- cbind(ftr.cnt[,1:6], tpms)

write.table(ftr.tpm, file=paste0(tag, "_tpm.txt"), sep="\t", row.names=FALSE, quote=FALSE)
cat(' Done.\n\tSaved as ')
cat ( paste0(tag, "_tpm.txt", '\n') )


quit('no')




**command output**

Rscript tpm_rpkm.R 450-3-hard_filtered.featureCounts
Reading in featureCounts data... Done
Performing RPKM calculations...Error in apply(counts, 2, function(x) rpkm(x, lengths)) :
dim(X) must have a positive length
halt execution

My featureCounts table looks like this:

Geneid | Chr | Start | End | Strand | Length |
1_1 | NODE_1_length_59711_cov_84.026979_g0_i0 | 116 | 904 | + | 789 | 198
1_2 | NODE_1_length_59711_cov_84.026979_g0_i0 | 1178 | 3514 | - | 2337 | 2294
1_3 | NODE_1_length_59711_cov_84.026979_g0_i0 | 3618 | 4319 | + | 702 | 502
1_4 | NODE_1_length_59711_cov_84.026979_g0_i0 | 4337 | 4921 | + | 585 | 320
1_5 | NODE_1_length_59711_cov_84.026979_g0_i0 | 4953 | 5906 | + | 954 | 799
1_6 | NODE_1_length_59711_cov_84.026979_g0_i0 | 5920 | 7056 | + | 1137 | 532
1_7 | NODE_1_length_59711_cov_84.026979_g0_i0 | 7061 | 8071 | + | 1011 | 761
1_8 | NODE_1_length_59711_cov_84.026979_g0_i0 | 8068 | 8766 | + | 699 | 188
1_9 | NODE_1_length_59711_cov_84.026979_g0_i0 | 8766 | 9656 | + | 891 | 217
1_10 | NODE_1_length_59711_cov_84.026979_g0_i0 | 9640 | 10710 | + | 1071 | 408
1_11 | NODE_1_length_59711_cov_84.026979_g0_i0 | 10692 | 11348 | + | 657 | 162
1_12 | NODE_1_length_59711_cov_84.026979_g0_i0 | 11359 | 12282 | + | 924 | 342

Does anyone know how to deal with this?

Rscript R • 1.7k views
ftr.cnt <- read.table(args[1], sep="\t", header=T, quote="")

should be

ftr.cnt <- read.table(args[1], sep="|", header=T, quote="", strip.white=TRUE)

That did not work. It threw this error:

The input file is not the raw output of featureCounts (number of columns > 6)

But I found the solution at https://stackoverflow.com/questions/66762378/error-in-applycounts-2-functionx-rpkmx-lengths-dimx-must-have-a-pos

Changing the line to

counts <- ftr.cnt[,7:ncol(ftr.cnt), drop=FALSE]

worked smoothly.
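
For anyone landing here with the same error, here is a minimal sketch of why `drop=FALSE` probably helps. It assumes the input has only one sample column, as in the table above. In that case `ftr.cnt[,7:ncol(ftr.cnt)]` returns a plain vector with no `dim`, which `apply` rejects. The toy data frame reuses three rows from the post; the column name `sample1` is made up for illustration.

```r
# Toy featureCounts-style table with a single sample column.
# Rows are taken from the post; "sample1" is a hypothetical column name.
ftr.cnt <- data.frame(
  Geneid  = c("1_1", "1_2", "1_3"),
  Chr     = "NODE_1_length_59711_cov_84.026979_g0_i0",
  Start   = c(116, 1178, 3618),
  End     = c(904, 3514, 4319),
  Strand  = c("+", "-", "+"),
  Length  = c(789, 2337, 702),
  sample1 = c(198, 2294, 502)
)

# Same TPM formula as in the script above
tpm <- function(counts, lengths) {
  rpk <- counts / (lengths / 1000)
  rpk / (sum(rpk) / 1e6)
}

lengths <- ftr.cnt[, 6]

# Default subsetting: one selected column is dropped to a plain vector,
# so it has no dim attribute and apply() cannot iterate over its columns
counts_vec <- ftr.cnt[, 7:ncol(ftr.cnt)]
print(is.null(dim(counts_vec)))

# drop = FALSE keeps a one-column data frame, which apply() accepts
counts_df <- ftr.cnt[, 7:ncol(ftr.cnt), drop = FALSE]
print(dim(counts_df))

tpms <- apply(counts_df, 2, function(x) tpm(x, lengths))
print(colSums(tpms))  # TPM values of each sample should sum to about 1e6
```

With two or more sample columns the original line already returns a data frame, so the script would only be expected to fail on single-sample input.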

Thanks


Posting the same question on multiple forums is bad etiquette. Even if you do cross-posts, it is common decency to mention your cross-posts so people don't invest effort when you've already been helped elsewhere.


Sorry about that. I will keep it in mind next time. Thanks.
