Warning: gene "NR_111945" (on chr1) has reference transcripts on both strands?
5.1 years ago

Hello, I am using StringTie for quantification. To skip the 'MSTRG.xx' entries, this time I used the -e option in the following command:

stringtie '/path/to/input_file/inputfile.bam' -G '/path/to/ref/hg38ucsc.gtf' -o output.gtf -p 8 -e -A output_abundance.out

This produces the following warning in the terminal:

Warning: gene "NR_111945" (on chr1) has reference transcripts on both strands?

In my output file, I find two transcript_ids for this gene, NR_111945 and NR_111945_dup1:

chr1 hg38_refGene transcript 13094515 13099717 . - . gene_id "NR_111945"; transcript_id "NR_111945"; cov "0.0"; FPKM "0.0"; TPM "0.0";

chr1 hg38_refGene exon 13094515 13095762 . - . gene_id "NR_111945"; transcript_id "NR_111945"; exon_number "1"; cov "0.0";

and

chr1 hg38_refGene transcript 13127069 13132274 . + . gene_id "NR_111945"; transcript_id "NR_111945_dup1"; cov "0.0"; FPKM "0.0"; TPM "0.0";

chr1 hg38_refGene exon 13127069 13127146 . + . gene_id "NR_111945"; transcript_id "NR_111945_dup1"; exon_number "1"; cov "0.0";

What could be the reason for these duplicates? Is it safe to ignore this warning? And if it does not reflect real biology, what can be done to avoid it?

Thank you in advance.

RNA-Seq Stringtie • 2.3k views
5.1 years ago

Whenever possible, use annotations and reference genomes from Ensembl or Gencode rather than UCSC, NCBI, or any other source. Sources such as UCSC commonly assign the same identifier to paralogs, which often makes no biological sense (as in your case). Ensembl/Gencode do not do this, so their annotations will not trigger this warning and are generally easier to work with.
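
If you want to see how many identifiers in your current GTF are affected before switching annotations, a minimal Python sketch along these lines may help. It assumes a standard tab-separated GTF with a `gene_id "..."` attribute in column 9. The function name `genes_on_both_strands` is made up for illustration and is not part of StringTie or any library.

```python
import re
import sys
from collections import defaultdict

# Matches the gene_id attribute in column 9 of a GTF line.
GENE_ID_RE = re.compile(r'gene_id "([^"]+)"')


def genes_on_both_strands(gtf_lines):
    """Return {(chrom, gene_id): ['+', '-']} for gene_ids annotated on both strands.

    Hypothetical helper: expects an iterable of tab-separated GTF lines.
    """
    strands = defaultdict(set)
    for line in gtf_lines:
        if line.startswith("#"):
            continue  # skip header/comment lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # skip blank or malformed lines
        chrom, strand, attributes = fields[0], fields[6], fields[8]
        match = GENE_ID_RE.search(attributes)
        if match and strand in ("+", "-"):
            strands[(chrom, match.group(1))].add(strand)
    # Keep only the gene_ids seen on both '+' and '-' of the same chromosome.
    return {key: sorted(seen) for key, seen in strands.items() if len(seen) == 2}


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_strands.py /path/to/ref/hg38ucsc.gtf
    with open(sys.argv[1]) as handle:
        for (chrom, gene_id), seen in sorted(genes_on_both_strands(handle).items()):
            print(f"{gene_id}\t{chrom}\t{','.join(seen)}")
```

Every gene_id it prints is one that shares a name across strands on the same chromosome, which is the situation the warning describes.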
