Identifying novel mRNA transcripts
1
0
6 weeks ago

After running HISAT2, StringTie, and DESeq2, how can I tell whether any of my significant genes contain novel mRNA transcripts, and if they do, which tool should I use to identify them? Can someone help me out here? Thanks in advance!

RNA-Seq • 120 views
2
6 weeks ago

Hi, I believe the novel transcripts will be assigned an automatic ID from the HISAT2-StringTie stage, depending on how you have run these programs. Please check the merged GTF that you produce with StringTie to see if any novel transcripts have been identified.

According to a colleague, the automatic ID for novel transcripts may begin with something like 'MSTRG'.
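
One way to do that check from the command line is sketched below. It assumes the merged GTF was written by `stringtie --merge` with the default `MSTRG` label, and that novel transcripts are the ones whose `transcript_id` starts with that label (transcripts matching the reference normally keep their reference ID). The function name `list_novel_transcripts` is made up for illustration and is not part of StringTie.

```shell
#!/bin/bash
# Hypothetical helper (not a StringTie command): print the transcript_id of
# every "transcript" record whose ID starts with the given prefix.
# Assumption: novel transcripts use the default MSTRG prefix.
list_novel_transcripts() {
    local gtf=$1 prefix=${2:-MSTRG}
    awk -F'\t' -v p="$prefix" '
        # Column 3 is the feature type, column 9 holds the attributes
        $3 == "transcript" && match($9, /transcript_id "[^"]+"/) {
            # Strip the leading "transcript_id " (15 chars) and the closing quote
            id = substr($9, RSTART + 15, RLENGTH - 16)
            if (index(id, p ".") == 1) print id
        }' "$gtf"
}

# Example call (file name taken from the thread):
#   list_novel_transcripts stringtie_merged.gtf | sort -u | wc -l
```

If the count is zero, it may be that no novel transcripts were assembled, or that a different label was set with `-l` at the merge step.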

It would help if you showed the commands that you used, and also sample outputs from different stages.

Kevin

0

Oh yes I can check that.

#! /bin/bash
strp=/home/xxx/softwares/stringtie-2.1.4
gtf=/home/xxx/Documents/sss/genome.gtf
thr=40

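# Assemble transcripts for each BAM file, guided by the reference annotation
find ./ -iname "*.bam" >li1
cat li1|while read e
do
base_name=$(basename $e .bam)
${strp}/stringtie -p ${thr} -G ${gtf} -o ${base_name}.gtf -l ${base_name} $e
done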

find ./ -iname "*.gtf" >li2.list

# Merge the per-sample assemblies into one non-redundant annotation
${strp}/stringtie -p ${thr} --merge -G ${gtf} -o stringtie_merged.gtf li2.list

# Re-estimate abundances against the merged annotation (-e)
cat li1|while read a
do
base_nam=$(basename $a .bam)
${strp}/stringtie -e -p ${thr} -G stringtie_merged.gtf -o ${base_nam}.re.gtf $a
done

find ./ -iname "*.re.gtf"|while read i
do
aa=$(basename $i .re.gtf)
printf "$aa\t$i\n"
done >reli2.list
#for each path, add SRR accession and space at start of line

# Generate the count matrices for DESeq2
python ${strp}/prepDE.py -i reli2.list


This is the full script I used.

1

Thanks - I believe they should be found in stringtie_merged.gtf, and then also in your counts data that is produced for DESeq2. It may additionally depend on how you ran HISAT2, though. Nice coding, by the way!
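
To tie this back to the significant genes, a rough sketch follows. It assumes you have exported the significant gene IDs from DESeq2 to a plain text file, one ID per line (the file name `sig_genes.txt` is hypothetical). It also assumes the IDs in the count matrix may carry a `|gene_name` suffix after the `MSTRG` gene ID, which is stripped here before matching; please check that against your own matrix. The function name `novel_in_genes` is made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: for each gene ID listed in the first file, print the
# novel (MSTRG-prefixed) transcripts that the merged GTF assigns to it.
# Output: gene_id <TAB> transcript_id
novel_in_genes() {
    local ids=$1 gtf=$2
    awk -F'\t' '
        # First file: collect wanted gene IDs, dropping any "|name" suffix
        FILENAME == ARGV[1] {
            sub(/\|.*/, "", $1)
            if ($1 != "") want[$1] = 1
            next
        }
        # Second file: inspect transcript records only
        $3 == "transcript" {
            g = t = ""
            if (match($9, /gene_id "[^"]+"/))       g = substr($9, RSTART + 9,  RLENGTH - 10)
            if (match($9, /transcript_id "[^"]+"/)) t = substr($9, RSTART + 15, RLENGTH - 16)
            if ((g in want) && index(t, "MSTRG.") == 1) print g "\t" t
        }' "$ids" "$gtf"
}

# Example call (sig_genes.txt is an assumed file name):
#   novel_in_genes sig_genes.txt stringtie_merged.gtf
```

Genes that print nothing either are not in the merged GTF under that ID or have only reference-matching transcripts.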

1

Yes, thank you so much!