Question: NCBI GTF contains the same transcript ID for some transcripts, causing trouble with htseq-count
7 weeks ago by patelbhaumikn (University of Missouri) wrote:

Hi everyone,

I am running a differential expression analysis between two groups. I used the NCBI RefSeq GTF and ran HISAT2 and StringTie without the -e option. When I looked at the StringTie output GTFs, both the individual files and the merged one, some transcripts have the transcript ID "unknown_transcript_1". I then checked the original NCBI GTF, and some records there have a blank gene_id and a transcript_id of "unknown_transcript_1". They are mostly from the mitochondrial and scaffold sequences of the genome. As a result, when I ran htseq-count, the first row of the output had an empty gene_id with read counts. Should I exclude that row from the htseq-count output before doing the differential expression analysis? I did the same analysis with the Ensembl GTF and there was no such issue.

I would really appreciate any suggestions or recommendations.

Thanks in advance.



@patelbhaumikn please contact RefSeq to report this issue with GTF files.

written 7 weeks ago by vkkodali
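
For anyone wanting to see how many records are affected before reporting it, the check described in the question (a blank gene_id, or a transcript_id like "unknown_transcript_1") could be sketched roughly as below. This is only a minimal sketch using the Python standard library. The function names and the "unknown_transcript" prefix test are assumptions made for illustration, not part of HTSeq, StringTie, or any NCBI tool, and the attribute parsing assumes the usual `key "value";` GTF layout.

```python
import re
from collections import Counter

# Matches GTF attribute pairs of the form: key "value";
ATTR_RE = re.compile(r'(\S+)\s+"([^"]*)"')


def parse_attributes(field):
    """Return the quoted key/value pairs of a GTF attribute column as a dict."""
    return dict(ATTR_RE.findall(field))


def find_problem_records(lines):
    """Count GTF records with an empty gene_id or an 'unknown_transcript*' ID.

    `lines` is any iterable of GTF lines (for example an open file handle).
    Returns a Counter keyed by (seqname, gene_id, transcript_id).
    """
    problems = Counter()
    for line in lines:
        # Skip header/comment lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue  # not a complete 9-column GTF record
        attrs = parse_attributes(cols[8])
        gene_id = attrs.get("gene_id", "")
        transcript_id = attrs.get("transcript_id", "")
        # Assumption: the placeholder IDs all start with "unknown_transcript".
        if gene_id == "" or transcript_id.startswith("unknown_transcript"):
            problems[(cols[0], gene_id, transcript_id)] += 1
    return problems
```

Passing an open file handle for the annotation (for example `find_problem_records(open("annotation.gtf"))`, where the file name is a placeholder) would give a per-sequence tally of the affected records, which may also be useful to include when reporting the issue to RefSeq.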