transcripts missing from tx2gene
Entering edit mode
14 months ago
Erick • 0

Hi All, I am learning to analyze bacterial RNAseq data and I am having difficulties using tximport. Any help will be greatly appreciated. I generated a transcriptome file from a gff file available and used it to align transcripts. The same transcriptome was used to perform Salmon analysis to generate quant files. I then generated a TxDb file using the same gff file that was used to make the transcriptome file above. When I however try to import the transcripts into tx2gene, I have 5015 transcripts missing from tx2gene. Here are the codes used.

#Generating a transcriptome fasta:

gffread -w transcripts.fasta -g NC_003198.fasta NC_003198.gff

#salmon analysis
conda activate salmon 
  salmon quant \
  -t ~/transcripts.fasta \
  -l A \
  -a ${file} \
  -o ${output} \
  --threads 8 
  conda deactivate  

#importing transcripts
quants <- read_tsv(quant_files[1])

txdb <- makeTxDbFromGFF("genome_data/NC_003198.gff.gz", format="gff3", dataSource = "NCBI")

k <- keys(txdb, keytype="TXNAME")
tx_map <- select(txdb, keys = k, columns="GENEID", keytype = "TXNAME")

tx2gene <- tx_map
write.csv(tx2gene,file="tx2gene.csv",row.names = FALSE,quote=FALSE)
txi <- tximport(quant_files,type="salmon",tx2gene = tx2gene,ignoreTxVersion = TRUE)

#This gives me the result:

reading in files with read_tsv
1 2 3 
removing duplicated transcript rows from tx2gene
transcripts missing from tx2gene: 5015
summarizing abundance
summarizing counts
summarizing length

Many thanks in advance. Erick

tximport • 343 views

Login before adding your answer.

Traffic: 1961 users visited in the last hour
Help About
Access RSS

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6