Question

tximport error (medianLengthOverIsoform)

4 days ago

LGG • 0

Hi,

I'm trying to run a differential transcript usage (DTU) analysis in R on my RNA-seq data. I ran Salmon in mapping-based mode, using Homo_sapiens.GRCh37.cdna.all.fa to build the index.

Now I want to run tximport, but I keep getting this error:

Error in medianLengthOverIsoform(length4CFA, tx2gene, ignoreTxVersion,  :
  all(txId %in% tx2gene$tx) is not TRUE
Calls: tximport -> medianLengthOverIsoform -> stopifnot
Execution halted

Here's my code:

library(GenomicFeatures)
library(tximport)

gtf <- "../Homo_sapiens.GRCh37.87.gtf.gz"
txdb <- makeTxDbFromGFF(gtf)
k <- keys(txdb, keytype="TXNAME")
tx2gene <- AnnotationDbi::select(txdb, keys=k, columns=c("TXNAME", "GENEID"), keytype="TXNAME")

colnames(tx2gene) <- c("tx","gene")

salmon_files <- list.files("/path/to/files")

samples <- data.frame(
  sample = salmon_files,
  path = sprintf("/path/to/files/%s", salmon_files)
)

# Import transcript-level Salmon quantifications
# Build files vector: for tximport, point to quant.sf file paths
files <- file.path(samples$path, "quant.sf")
names(files) <- samples$sample

txi <- tximport(
  files,
  type = "salmon",
  txOut = TRUE,
  tx2gene = tx2gene,
  countsFromAbundance = "dtuScaledTPM",
  ignoreTxVersion = TRUE
)

Here are the first rows of tx2gene and of one of my quant.sf files:

> head(tx2gene)

              tx            gene
1 ENST00000456328 ENSG00000223972
2 ENST00000515242 ENSG00000223972
3 ENST00000518655 ENSG00000223972
4 ENST00000450305 ENSG00000223972
5 ENST00000473358 ENSG00000243485
6 ENST00000469289 ENSG00000243485

> head(quant.sf)

               Name Length EffectiveLength TPM NumReads
1 ENST00000415118.1      8               8   0        0
2 ENST00000434970.2      9               9   0        0
3 ENST00000448914.1     13              13   0        0
4 ENST00000604642.1     23              23   0        0
5 ENST00000603326.1     19              19   0        0
6 ENST00000604950.1     31              31   0        0

If anyone could help me figure this out I would greatly appreciate it :)

DTU RNA-seq tximport • 303 views

ADD COMMENT • link 2 hours ago by LGG • 0

Is it the version number in your data?

For example, a versioned ID such as ENST00000415118.1 in quant.sf will not match an unversioned ID such as ENST00000415118 in tx2gene, which could be causing the mismatch.
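If it helps, here is a minimal sketch for checking this directly. It strips the version suffix from the quant.sf names (a simplified stand-in for what ignoreTxVersion is documented to do, not tximport's own code) and lists any IDs that still have no match in tx2gene. The data are toy values: the transcript IDs come from the post, but the third tx2gene row and its PLACEHOLDER_GENE label are made up so that one ID matches.

```r
# Transcript names as they appear in the Name column of quant.sf (from the post).
quant_ids <- c("ENST00000415118.1", "ENST00000434970.2", "ENST00000448914.1")

# Toy tx2gene: the first two rows are from the post; the third row is a
# made-up mapping, added only so that one quant.sf ID has a match.
tx2gene <- data.frame(
  tx   = c("ENST00000456328", "ENST00000515242", "ENST00000415118"),
  gene = c("ENSG00000223972", "ENSG00000223972", "PLACEHOLDER_GENE"),
  stringsAsFactors = FALSE
)

# Drop everything from the first dot onward to remove the version suffix.
stripped <- sub("\\..*", "", quant_ids)

# IDs that are still absent from tx2gene after the version is removed.
missing <- setdiff(stripped, tx2gene$tx)
print(missing)
```

On real data, quant_ids could be taken from read.delim(files[1])$Name. If missing is not empty, the version suffix is not the whole story: some quantified transcripts are simply absent from tx2gene, which can happen when the FASTA used for the index and the GTF used for the TxDb do not cover the same set of transcripts.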

ADD REPLY • link 3 days ago by rfran010 ★ 1.7k

I believe ignoreTxVersion = TRUE is supposed to take care of that, and it is already set in my tximport call.

ADD REPLY • link 2 hours ago by LGG • 0