Question

Error when running tximport on salmon files

2

6.5 years ago

Lila M ★ 1.2k

Hi guys, I'm trying to analyze some RNA-seq data using salmon as follows:

#create the index:
salmon index -t gencode.v27.transcripts.fa -i human_index

#create the quant.sf files:
salmon quant -i human_index/ -l OSR -1 R1.fastq -2 R2.fastq -o salmon_quant

After that, my idea is to process all the files (1Q_S1_quant.sf, 2Q_S2_quant.sf ... 16Q_S16_quant.sf) in R for downstream analysis with DESeq2. To do that I've tried:

library(GenomicFeatures)
library(tximport)
library(readr)
library(rjson)

## Create a transcript-to-gene matching table (tx2gene) that will be used to aggregate
## Salmon transcript quantifications to the gene level

txdb <-makeTxDbFromGFF("gencode.v27.annotation.gtf")
k <- keys(txdb, keytype = "GENEID")
df <- select(txdb, keys = k,  columns = "TXNAME", keytype = "GENEID")
tx2gene <- df[, 2:1]
head(tx2gene)

## load salmon files
files <- list.files( pattern = "quant.sf",full.names = TRUE)
names(files) <- paste0("sample", 1:16)
all(file.exists(files))
#TRUE

txi_salmon <- tximport(files = files, type = "salmon", txOut = FALSE, tx2gene = tx2gene)
reading in files with read_tsv
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 
Error in summarizeToGene(txi, tx2gene, ignoreTxVersion, countsFromAbundance) : 
    None of the transcripts in the quantification files are present
  in the first column of tx2gene. Check to see that you are using
  the same annotation for both.

But that does not seem to be true, because I looked in both files (quant.sf and tx2gene) and the same transcript for the same gene is present in both, e.g.:

#tx2gene
TXNAME                    GENEID
ENST00000373031.4   ENSG00000000005.5
ENST00000485971.1   ENSG00000000005.5

#1Q_S1.quant.sf
ENST00000373031.4|ENSG00000000005.5|OTTHUMG00000022001.1|OTTHUMT00000057481.1|TNMD-201|TNMD|1339|protein_coding|    1339    1156.86 0   0
ENST00000485971.1|ENSG00000000005.5|OTTHUMG00000022001.1|OTTHUMT00000057482.1|TNMD-202|TNMD|542|processed_transcript|   542 360.895 0   0

Any suggestions about what's going on with this funny error?

Thanks!

RNA-Seq salmon quantification R error txtimport • 6.8k views

ADD COMMENT • link 6.5 years ago by Lila M ★ 1.2k

1

Hint: compare the first columns of the two files you posted. You'll note that they're not exactly the same. That's causing the error.

ADD REPLY • link 6.5 years ago by Devon Ryan 104k
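To make the hint concrete, here is a minimal sketch of that comparison. The two temp files are stand-ins built from shortened versions of the rows posted in the question (they are not the real tx2gene or quant.sf), and the file names are made up for illustration. tximport needs the IDs to match exactly, so a whole-line comparison is the relevant one.

```shell
#!/bin/sh
tmp=$(mktemp -d)

# Stand-in for the first column of tx2gene: plain transcript IDs.
printf 'ENST00000373031.4\nENST00000485971.1\n' > "$tmp/tx_ids.txt"

# Stand-in for the first column of quant.sf: the full GENCODE FASTA header.
printf 'ENST00000373031.4|ENSG00000000005.5|TNMD-201|TNMD|1339|protein_coding|\n' > "$tmp/quant_names.txt"
printf 'ENST00000485971.1|ENSG00000000005.5|TNMD-202|TNMD|542|processed_transcript|\n' >> "$tmp/quant_names.txt"

# Count whole-line, fixed-string matches between the two lists as they stand.
# grep -c exits non-zero when the count is 0, hence the "|| true".
raw_matches=$(grep -Fxc -f "$tmp/tx_ids.txt" "$tmp/quant_names.txt" || true)

# Same comparison after keeping only the text before the first "|".
trimmed_matches=$(cut -d'|' -f1 "$tmp/quant_names.txt" | grep -Fxc -f "$tmp/tx_ids.txt" || true)

echo "exact matches as-is: $raw_matches"            # 0
echo "exact matches after trimming: $trimmed_matches"  # 2
```

So the transcript ID is in there, but only as a prefix of a longer name, which is why nothing matches.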

0

Hi Devon, can you explain how I can solve it? Thanks!

ADD REPLY • link 6.5 years ago by Lila M ★ 1.2k

1

You can probably do something like sed -e 's/|[^[:space:]]*//' 1Q_S1.quant.sf.

ADD REPLY • link 6.5 years ago by Devon Ryan 104k
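A small sketch of the sed approach on a throwaway file, assuming a tab-separated quant.sf with the usual salmon header. The sample row is a shortened version of the one in the question, and the temp directory and the quant.clean.sf name are made up for illustration. The pattern removes everything from the first | up to the next whitespace. The pipe is left unescaped because GNU sed reads an escaped \| as alternation.

```shell
#!/bin/sh
# Build a tiny stand-in for a quant.sf file (header + one GENCODE-style row).
tmp=$(mktemp -d)
printf 'Name\tLength\tEffectiveLength\tTPM\tNumReads\n' > "$tmp/quant.sf"
printf 'ENST00000373031.4|ENSG00000000005.5|TNMD-201|TNMD|1339|protein_coding|\t1339\t1156.86\t0\t0\n' >> "$tmp/quant.sf"

# Drop everything from the first "|" up to the next whitespace, so only the
# transcript ID remains in column 1. Lines without "|" (the header) pass
# through unchanged.
sed -e 's/|[^[:space:]]*//' "$tmp/quant.sf" > "$tmp/quant.clean.sf"

cat "$tmp/quant.clean.sf"
```

Writing to a new file rather than editing in place keeps the original salmon output around in case something goes wrong.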

score 2 · Answer 1 · 2017-10-12

2

6.5 years ago

e.rempel ★ 1.1k

ADD COMMENT • link 6.5 years ago by e.rempel ★ 1.1k

0

I'm a bit stuck here, can you please let me know how to do that? or at which point? Thanks!

ADD REPLY • link 6.5 years ago by Lila M ★ 1.2k

1

After you have checked that all files are here, you could do something like

rownames(`1Q_S1.quant.sf`) <- limma::strsplit2(rownames(`1Q_S1.quant.sf`), split = "|", fixed = TRUE)[, 1]

meaning that you split your rownames taking | as separator and then take only the first entry

ADD REPLY • link 6.5 years ago by e.rempel ★ 1.1k

0

I have a follow-up question...

If I'm using file.path to import all my quant.sf files into R, is there a way of correcting this space issue for all files? I'm getting the same error message and I know it is because of the lack of a space between my transcript_id and the "|".

    dir <- "/mnt/data/BM/Total_RNAseq/salmon/protein_coding"

    files <- file.path(dir, samplefile$sampleID, "quant.sf")

    annotation_transcript <- elementMetadata(import(gtf_file, feature.type = "transcript"))

    tx2gene <- annotation_transcript[,c("transcript_id", "gene_id")]

    txi.salmon <- tximport(files, type = "salmon", tx2gene = tx2gene)

Thanks in advance!

ADD REPLY • link 3.9 years ago by 2405592M ▴ 140
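One way to handle every file at once is to clean them on the command line before importing. This is only a sketch: it assumes one sub-directory per sample with a quant.sf inside (as in the file.path call above), and it builds a throwaway copy of that layout so it can be run as-is. The sampleA/sampleB names and the quant.clean.sf output name are made up for illustration.

```shell
#!/bin/sh
# Hypothetical layout: one sub-directory per sample, each holding a quant.sf.
# A throwaway copy of that layout is created here so the loop can be tested.
salmon_dir=$(mktemp -d)
for s in sampleA sampleB; do
    mkdir -p "$salmon_dir/$s"
    printf 'Name\tLength\tEffectiveLength\tTPM\tNumReads\n' > "$salmon_dir/$s/quant.sf"
    printf 'ENST00000485971.1|ENSG00000000005.5|TNMD-202|TNMD|542|processed_transcript|\t542\t360.895\t0\t0\n' >> "$salmon_dir/$s/quant.sf"
done

# Write a cleaned copy next to every quant.sf, leaving the originals untouched.
# The sed pattern removes everything from the first "|" to the next whitespace.
for f in "$salmon_dir"/*/quant.sf; do
    sed -e 's/|[^[:space:]]*//' "$f" > "${f%quant.sf}quant.clean.sf"
done

ls "$salmon_dir"/*/quant.clean.sf
```

In R, the files vector would then point at quant.clean.sf instead of quant.sf. Two alternatives that avoid touching the files: recent tximport versions have an ignoreAfterBar = TRUE argument that trims the names at import time, and salmon index has a --gencode flag that trims them when the index is built.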

score 1 · Answer 2 · 2017-10-12

1

6.5 years ago

Lila M ★ 1.2k

problem solved! Thanks!

ADD COMMENT • link 6.5 years ago by Lila M ★ 1.2k