transcripts missing from tx2gene
0
0
Entering edit mode
14 months ago
Erick • 0

Hi All, I am learning to analyze bacterial RNAseq data and I am having difficulties using tximport. Any help will be greatly appreciated. I generated a transcriptome file from a gff file available and used it to align transcripts. The same transcriptome was used to perform Salmon analysis to generate quant files. I then generated a TxDb file using the same gff file that was used to make the transcriptome file above. When I however try to import the transcripts into tx2gene, I have 5015 transcripts missing from tx2gene. Here are the codes used.

#Generating a transcriptome fasta:

gffread -w transcripts.fasta -g NC_003198.fasta NC_003198.gff

#salmon analysis
conda activate salmon 
  salmon quant \
  -t ~/transcripts.fasta \
  -l A \
  -a ${file} \
  -o ${output} \
  --threads 8 
  conda deactivate  
done

#importing transcripts
library(readr)
quants <- read_tsv(quant_files[1])

library(GenomicFeatures)
txdb <- makeTxDbFromGFF("genome_data/NC_003198.gff.gz", format="gff3", dataSource = "NCBI")

k <- keys(txdb, keytype="TXNAME")
tx_map <- select(txdb, keys = k, columns="GENEID", keytype = "TXNAME")

library(tximport)
tx2gene <- tx_map
write.csv(tx2gene,file="tx2gene.csv",row.names = FALSE,quote=FALSE)
txi <- tximport(quant_files,type="salmon",tx2gene = tx2gene,ignoreTxVersion = TRUE)

#This gives me the result:

reading in files with read_tsv
1 2 3 
removing duplicated transcript rows from tx2gene
transcripts missing from tx2gene: 5015
summarizing abundance
summarizing counts
summarizing length

Many thanks in advance. Erick

tximport • 343 views
ADD COMMENT

Login before adding your answer.

Traffic: 1961 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6