Question: Can I ignore these MSTRG genes in downstream analysis (pantherdb.org)?
5 months ago, Fawzi Yassine wrote:

Hi,

I am using RNA-seq analysis to find genes differentially expressed between two conditions. I am using StringTie for transcript assembly and quantification, and prepDE.py to use the StringTie output with DESeq2, as instructed on http://ccb.jhu.edu/software/stringtie/index.shtml?t=manual#deseq. This outputs gene_count_matrix.csv, which has gene IDs. Some of the rows have IDs like NM_000144, which was convenient for downstream analysis. But other rows have IDs with an MSTRG tag. Can I ignore these MSTRG genes in downstream analysis (enrichment analysis at pantherdb.org)? If not, how can I get the corresponding gene symbols? Regards,

rna-seq deseq2 stringtie • 327 views
modified 4 months ago by Biostar • written 5 months ago by Fawzi Yassine

Check this out: "How to deal with MSTRG tag without relevant gene name?"

written 5 months ago by lakhujanivijay

I did not understand this reply from the link you provided: "If you are interested only in standard transcripts/genes (i.e. Ensembl, all or targeted), it is okay to exclude MSTRG transcripts/genes for downstream analysis. But do not throw away those genes/transcripts."

written 5 months ago by Fawzi Yassine
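
One way to read the quoted advice is: set the MSTRG rows aside for the enrichment step, but keep them in a separate table rather than deleting them. A minimal Python sketch of that split follows. The column name `gene_id`, the helper `split_by_mstrg`, and the toy counts are assumptions made for illustration, and the `MSTRG.x|NAME` form is an assumption about how some prepDE.py versions label genes that carry a reference name; check your own gene_count_matrix.csv before relying on it.

```python
import csv
import io


def split_by_mstrg(rows, id_column="gene_id", prefix="MSTRG"):
    """Split count-matrix rows into (annotated, novel) without discarding either.

    A row counts as "novel" when its ID starts with the MSTRG prefix and carries
    no "|name" suffix (assumed format: "MSTRG.12|GENE" when a name is known).
    """
    annotated, novel = [], []
    for row in rows:
        gene_id = row[id_column]
        has_name = "|" in gene_id
        if gene_id.startswith(prefix) and not has_name:
            novel.append(row)
        else:
            annotated.append(row)
    return annotated, novel


# Toy data, made up for illustration only.
example_csv = """gene_id,sample1,sample2
NM_000144,10,25
MSTRG.101,3,7
MSTRG.202|GENE_A,120,98
"""

rows = list(csv.DictReader(io.StringIO(example_csv)))
annotated, novel = split_by_mstrg(rows)

print("annotated:", [r["gene_id"] for r in annotated])
print("novel:", [r["gene_id"] for r in novel])
```

The annotated list is what would go to an enrichment tool, while the novel list stays available for later inspection.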

If you work with human or mouse (probably the best-annotated organisms when it comes to genomics), why do you use StringTie at all? There are comprehensive annotations from GENCODE/Ensembl or RefSeq that you can quantify against. Transcript assembly is probably only beneficial if you are looking for new transcripts, not in a standard analysis. Also keep in mind that transcript assembly probably requires quite some sequencing depth and read length, so why make the effort for a standard DE analysis? I would simply quantify with salmon against the GENCODE transcriptome and then proceed with tximport and DESeq2. You would probably need to verify new transcripts from StringTie anyway to show that they are reliable and not artifacts, so save yourself the trouble.

written 5 months ago by ATpoint

ATpoint, I have always liked your replies, but not this one. I have already done the assembly using StringTie (on AWS). Moreover, I promised my would-be employer to use StringTie. I am only getting 167 proper gene IDs out of the 4077 significantly different genes. The rest have MSTRG tags in their IDs.

written 5 months ago by Fawzi Yassine
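
For the MSTRG IDs that overlap a known gene, the assembled GTF may already carry a name on the transcript lines. The sketch below, in Python, collects such names per MSTRG gene ID. It assumes the GTF uses `gene_name` or `ref_gene_name` attributes for transcripts matched to the reference annotation (check your own file, since this depends on how StringTie was run); the function `mstrg_to_names` and the toy lines are hypothetical. IDs with no name in the GTF are left unmapped rather than guessed.

```python
import re

# Matches GTF attributes of the form: key "value"
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')


def mstrg_to_names(gtf_lines):
    """Map each MSTRG gene_id to the set of reference gene names seen on its transcripts."""
    mapping = {}
    for line in gtf_lines:
        if line.startswith("#"):
            continue  # skip header/comment lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "transcript":
            continue  # only transcript records are inspected here
        attrs = dict(ATTR_RE.findall(fields[8]))
        gene_id = attrs.get("gene_id", "")
        # Attribute names are assumptions about the GTF contents.
        name = attrs.get("ref_gene_name") or attrs.get("gene_name")
        if gene_id.startswith("MSTRG") and name:
            mapping.setdefault(gene_id, set()).add(name)
    return mapping


def gtf_line(feature, attributes):
    """Build a toy tab-separated GTF record (coordinates are made up)."""
    return "\t".join(["chr1", "StringTie", feature, "100", "900", "1000", "+", ".", attributes])


example_gtf = [
    "# toy GTF for illustration",
    gtf_line("transcript", 'gene_id "MSTRG.1"; transcript_id "MSTRG.1.1"; gene_name "GENE_A";'),
    gtf_line("exon", 'gene_id "MSTRG.1"; transcript_id "MSTRG.1.1"; exon_number "1";'),
    gtf_line("transcript", 'gene_id "MSTRG.2"; transcript_id "MSTRG.2.1";'),
]

mapping = mstrg_to_names(example_gtf)
print(mapping)
```

A set is used per gene ID because one MSTRG locus can span more than one reference gene, in which case the mapping is ambiguous and worth checking by hand.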

Well, you don't have to like a reply, of course, but then why do you ask for help? :)

written 5 months ago by WouterDeCoster

ATpoint is a professional, so he will rightly understand that I am complimenting him in that reply, especially since I asked him another question.

written 5 months ago by Fawzi Yassine