Question: RNAseq Tximport transcript counts to gene counts
s.kyungyong64 (Berkeley, USA) wrote 2.5 years ago:


I have quant.sf files generated by Salmon by mapping 100 bp single-end Illumina libraries against primary transcripts. The genome for this species is incomplete and is missing some genes of interest, so I am using both transcriptome and genome data for this RNA-seq analysis.

Before passing this quantification data downstream, I need to run tximport, so I generated a table of transcript IDs and gene IDs from a GFF3 file based on the genome annotation. I then realized that many of the primary transcripts are missing from the genome.
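A transcript-to-gene table of the kind described above could be built roughly as follows. This is a minimal sketch, not the poster's actual script: it assumes the GFF3 uses `mRNA` features whose `ID` attribute is the transcript ID and whose `Parent` attribute is the gene ID, which varies between annotations. The function names and the example records are hypothetical.

```python
def parse_attributes(field):
    """Split a GFF3 attribute column (key=value;key=value) into a dict."""
    attrs = {}
    for item in field.strip().split(";"):
        if "=" in item:
            key, value = item.split("=", 1)
            attrs[key.strip()] = value.strip()
    return attrs


def tx2gene_from_gff3(lines, feature_type="mRNA"):
    """Return {transcript_id: gene_id} from GFF3 lines.

    Assumption: transcript features carry ID=<transcript> and Parent=<gene>.
    """
    mapping = {}
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, directives and comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != feature_type:
            continue  # not a well-formed record of the wanted type
        attrs = parse_attributes(cols[8])
        if "ID" in attrs and "Parent" in attrs:
            mapping[attrs["ID"]] = attrs["Parent"]
    return mapping


# Made-up records for illustration only.
example_gff3 = [
    "##gff-version 3",
    "chr1\tsrc\tgene\t100\t900\t.\t+\t.\tID=gene1",
    "chr1\tsrc\tmRNA\t100\t900\t.\t+\t.\tID=tx1;Parent=gene1",
    "chr1\tsrc\tmRNA\t100\t700\t.\t+\t.\tID=tx2;Parent=gene1",
    "chr1\tsrc\texon\t100\t300\t.\t+\t.\tID=exon1;Parent=tx1",
]
tx2gene = tx2gene_from_gff3(example_gff3)
```

Any transcript in the Salmon index that does not appear as a key in `tx2gene` is one the annotation does not cover.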

I was going to add each missing transcript ID to the table with an arbitrary gene ID. Is this okay? What would be the standard protocol for dealing with this?
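To make the idea in the question concrete, here is a small sketch of giving each unannotated transcript its own placeholder gene ID and then summing counts per gene. It is an illustration only, not an endorsed protocol. It assumes a tab-separated quant.sf with `Name` and `NumReads` columns; the `ORPHAN_` prefix, the function names and the numbers are hypothetical. Plain summing of `NumReads` is a simplification of what tximport does, since tximport also handles abundances and effective lengths.

```python
import csv
import io
from collections import defaultdict


def read_quant(quant_sf_text):
    """Return {transcript_id: NumReads} from the text of a quant.sf file."""
    reader = csv.DictReader(io.StringIO(quant_sf_text), delimiter="\t")
    return {row["Name"]: float(row["NumReads"]) for row in reader}


def complete_tx2gene(tx2gene, transcript_ids, prefix="ORPHAN_"):
    """Copy tx2gene and map every unannotated transcript to a placeholder gene."""
    completed = dict(tx2gene)
    for tx in transcript_ids:
        if tx not in completed:
            completed[tx] = prefix + tx  # one placeholder gene per transcript
    return completed


def gene_counts(tx_counts, tx2gene):
    """Sum transcript-level counts into gene-level counts."""
    totals = defaultdict(float)
    for tx, count in tx_counts.items():
        totals[tx2gene[tx]] += count
    return dict(totals)


# Made-up data for illustration only.
quant_sf = (
    "Name\tLength\tEffectiveLength\tTPM\tNumReads\n"
    "tx1\t1000\t900.0\t10.0\t50.0\n"
    "tx2\t800\t700.0\t5.0\t25.0\n"
    "tx3\t1200\t1100.0\t2.0\t10.0\n"
)
annotated = {"tx1": "gene1", "tx2": "gene1"}  # tx3 is absent from the annotation

tx_counts = read_quant(quant_sf)
full_map = complete_tx2gene(annotated, tx_counts)
counts = gene_counts(tx_counts, full_map)
```

With a one-to-one placeholder, each orphan transcript is simply treated as its own gene, so two orphan isoforms of the same unannotated gene would be counted as two separate genes.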


rna-seq salmon tximport • 1.1k views

Comment by genomax, written 2.5 years ago:

"So I am using both transcriptome and genome data for this RNA-seq."

How are you avoiding double counting entities shared between those two?

h.mon wrote 2.5 years ago:

I don't see a simple solution to your problem. Why is the genome missing these genes of interest? Are the genes absent from the reference, or are they present (you can find them with, e.g., BLAST) but unannotated? Are these genes present in your transcriptome assembly?

The simplest approach is to use only the transcriptome. You may use Corset to build a transcript-to-gene map. You can also map the transcripts to the genome and use bedtools intersect, subtract and overlap to annotate the transcripts and to find which annotated genes are found or not found in your transcriptome, and vice versa.
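The overlap step can be pictured with a small pure-Python sketch. It does not call bedtools; it only mimics the kind of answer an interval intersection gives, namely which aligned transcripts overlap an annotated gene and which do not. The coordinates and names are made up, intervals are assumed to be half-open BED-style (0-based start, exclusive end), and strand is ignored for simplicity.

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True if two half-open intervals share at least one base."""
    return a_start < b_end and b_start < a_end


def annotate_transcripts(transcripts, genes):
    """Map each transcript name to the list of gene names it overlaps.

    Both inputs are lists of (chrom, start, end, name) tuples.
    """
    result = {}
    for t_chrom, t_start, t_end, t_name in transcripts:
        result[t_name] = [
            g_name
            for g_chrom, g_start, g_end, g_name in genes
            if g_chrom == t_chrom and overlaps(t_start, t_end, g_start, g_end)
        ]
    return result


# Made-up intervals for illustration only.
genes = [("chr1", 100, 900, "gene1"), ("chr1", 2000, 3000, "gene2")]
transcripts = [("chr1", 150, 850, "tx1"), ("chr1", 5000, 5600, "tx3")]

annotation = annotate_transcripts(transcripts, genes)

# Transcripts with no overlapping gene (candidate unannotated loci).
unannotated = [name for name, hits in annotation.items() if not hits]

# Annotated genes with no transcript aligned over them.
found = {g for hits in annotation.values() for g in hits}
genes_not_found = [g[3] for g in genes if g[3] not in found]
```

This all-against-all loop is fine for a toy example; on real data a dedicated interval tool is the practical choice.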
