Identify jumbled transcripts (scrambled exons) using HISAT2, stringtie, and GFFcompare?
5.0 years ago

I have some PacBio RNA-seq data that should contain a jumbled gene, i.e. one whose exons are not in the canonical order but instead run something like 1, 2, 3, 5, 6, 4, 5, 7, 8 (scrambled exons). I thought that by mapping my FASTQ with HISAT2 and then assembling the resulting BAM with StringTie, guided by the reference GTF, I would see this jumbling event in the GTF produced for my BAM. But nothing: the class codes are all "=" for this gene when I run GffCompare. If I open the BAM in IGV I can see the jumbling event, but what I'm looking for is a way to find other jumbling events that I don't already know about. Any suggestions?

RNA-SEQ hisat2 stringtie gffcompare • 1.2k views

If you have PacBio data, why are you using HISAT2? A proper long-read aligner such as minimap2, one that can accommodate both the error profile and the read length, would be a much better choice.
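
For reference, a minimal sketch of what the minimap2 call could look like when driven from Python. The function name `build_minimap2_command` and the file names are hypothetical; `-a` (SAM output) and `-x splice` (spliced long-read preset) are standard minimap2 options, but whether `splice` or `splice:hq` suits your reads depends on how they were generated, so treat the default here as an assumption.

```python
from typing import List


def build_minimap2_command(reference: str, reads: str, preset: str = "splice") -> List[str]:
    """Build an argument list for a spliced long-read alignment with minimap2.

    The preset is an assumption: "splice" is the general spliced long-read
    preset, while "splice:hq" is intended for high-accuracy reads.
    """
    # -a  : write SAM to stdout instead of PAF
    # -x  : select the alignment preset
    return ["minimap2", "-a", "-x", preset, reference, reads]


# Hypothetical file names, for illustration only.
command = build_minimap2_command("reference.fa", "reads.fastq")
print(" ".join(command))
```

The list can be passed to `subprocess.run` with stdout redirected to a SAM file, then sorted and indexed as usual before opening it in IGV.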


Okay, that makes sense. But what about the downstream pipeline after minimap2? Was I right in assuming that stringtie and GFFcompare should show me jumbled/scrambled exons?


Are your individual reads long enough to span these shuffled exons, and do you have enough read depth (number of reads aligned) to be confident in the alignments? You will have to carefully examine the alignments to see how minimap2 aligns the reads.

Just to clarify: if the HISAT2 pipeline has produced results that make sense to you, then great. I am just saying that it would be useful to examine what minimap2 does in addition.
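
One way to examine the alignments systematically, rather than read by read in IGV, might be to check whether the aligned pieces of each read are collinear with the reference. The sketch below is only an illustration under several assumptions: that the aligner reports a scrambled read as several pieces (for example a primary plus supplementary alignments), that all pieces lie on the same chromosome and strand, and that you have already parsed each piece into a tuple. `Segment` and `is_scrambled` are hypothetical names, and the coordinates are made up.

```python
from typing import List, NamedTuple


class Segment(NamedTuple):
    """One aligned piece of a read (hypothetical structure for this sketch)."""
    query_start: int  # offset of the piece within the read, as written in the SAM record
    ref_start: int    # reference coordinate where the piece begins
    ref_end: int      # reference coordinate where the piece ends (exclusive)


def is_scrambled(segments: List[Segment]) -> bool:
    """Return True if, walking along the read, a piece starts before the previous one ended.

    Assumes all pieces are on the same chromosome and strand. Real split
    alignments can overlap by a few bases, so a tolerance may be needed.
    """
    ordered = sorted(segments, key=lambda s: s.query_start)
    for prev, curr in zip(ordered, ordered[1:]):
        if curr.ref_start < prev.ref_end:
            return True  # the read steps backwards on the reference
    return False


# Made-up coordinates: the first piece covers exons 1-3, 5, 6; the second
# restarts at exon 4, upstream of where the first piece ended.
scrambled_read = [Segment(0, 1000, 6000), Segment(600, 4000, 9000)]
collinear_read = [Segment(0, 1000, 3000), Segment(500, 5000, 7000)]

print(is_scrambled(scrambled_read))
print(is_scrambled(collinear_read))
```

Reads flagged this way are only candidates. Whether minimap2 actually splits a scrambled read into such pieces is exactly what needs checking against the known event first.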
