Question: I get different ENSG IDs for the same MSTRG ID
Hi everyone.

I am getting different ENSG IDs for the same MSTRG ID. I used StringTie as the assembler, and in the StringTie_merged.gtf file a single MSTRG ID is assigned to genes that have different names and different ENSG IDs. Does StringTie give the same MSTRG ID to genes that overlap? How can I solve this problem?

Thank you in advance.

Hey, did you find an answer to your question? I have the same problem.

You seem to have cases of merged genes. Basically, StringTie/Cufflinks joins some genes together under one gene ID because their associated transcripts overlap on the genome.
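If you want to see which MSTRG IDs are affected, something like the following Python sketch could list them. It assumes the transcript lines in the merged GTF carry a `ref_gene_id` attribute alongside `gene_id` (check your own file, since this depends on the reference annotation passed to the merge step); the function name `merged_gene_ids` and the example IDs are made up for illustration.

```python
import re
from collections import defaultdict

# Matches GTF attributes of the form: key "value";
ATTR_RE = re.compile(r'(\w+) "([^"]*)"')


def merged_gene_ids(gtf_lines):
    """Return {gene_id: [ref_gene_id, ...]} for gene IDs tied to more than one reference gene.

    Assumption: transcript lines carry both gene_id and ref_gene_id attributes.
    """
    refs = defaultdict(set)
    for line in gtf_lines:
        # Skip header/comment lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        # Only transcript records are inspected; exon lines are ignored.
        if len(fields) < 9 or fields[2] != "transcript":
            continue
        attrs = dict(ATTR_RE.findall(fields[8]))
        if "gene_id" in attrs and "ref_gene_id" in attrs:
            refs[attrs["gene_id"]].add(attrs["ref_gene_id"])
    # Keep only gene IDs that collect several distinct reference gene IDs.
    return {gene: sorted(ids) for gene, ids in refs.items() if len(ids) > 1}


# Tiny made-up example: MSTRG.1 spans two reference genes, MSTRG.2 only one.
example_lines = [
    'chr1\tStringTie\ttranscript\t100\t900\t1000\t+\t.\t'
    'gene_id "MSTRG.1"; transcript_id "ENST_A"; gene_name "GENEA"; ref_gene_id "ENSG_A";',
    'chr1\tStringTie\ttranscript\t800\t1500\t1000\t+\t.\t'
    'gene_id "MSTRG.1"; transcript_id "ENST_B"; gene_name "GENEB"; ref_gene_id "ENSG_B";',
    'chr1\tStringTie\ttranscript\t5000\t6000\t1000\t-\t.\t'
    'gene_id "MSTRG.2"; transcript_id "ENST_C"; gene_name "GENEC"; ref_gene_id "ENSG_C";',
]
merged = merged_gene_ids(example_lines)
print(merged)  # {'MSTRG.1': ['ENSG_A', 'ENSG_B']}
```

On real data you would pass an open file handle for StringTie_merged.gtf instead of `example_lines`. This only reports the merged genes; it does not repair the annotation.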

To solve this I have just released an update to the R package IsoformSwitchAnalyzeR (available in versions >1.11.6) which can fix problems 1 and 2 for most genes. You simply use the importRdata() function, which fixes the isoform annotation where it is fixable and cleans up the rest of the annotation. From the resulting switchAnalyzeRList object you can analyse isoform switches with predicted functional consequences using IsoformSwitchAnalyzeR, or use extractGeneExpression() to get a gene count matrix for DE analysis with other tools.

Hope this helps.



