We have paired-end Illumina RNA-seq reads from a non-model organism with no reference genome. We do have a working composite sequence for one protein that includes every exon we have found via cDNA. We have six muscle types, some in triplicate, and want to see how often four specific exons that appear to be alternatively spliced are present in each muscle type.
For example, muscle type A expresses this exon 46% of the time, while muscle type B expresses it only 12% of the time.
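To make the percentage concrete, here is a rough sketch of the number I'm after. It assumes that for each muscle type we can count reads that support including the exon and reads that support skipping it; the function name and the counts are hypothetical, chosen only to reproduce the 46% and 12% in the example.

```python
def percent_inclusion(inclusion_reads: int, exclusion_reads: int) -> float:
    """Percentage of informative reads that support including the exon."""
    total = inclusion_reads + exclusion_reads
    if total == 0:
        return float("nan")  # no informative reads, so the ratio is undefined
    return 100.0 * inclusion_reads / total


# Hypothetical (inclusion, exclusion) counts, not real data.
counts = {"muscle_a": (46, 54), "muscle_b": (12, 88)}
for muscle, (inc, exc) in counts.items():
    print(f"{muscle}: {percent_inclusion(inc, exc):.0f}% inclusion")
```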
I figured I could extract the exon sequences individually, align the reads to them with HISAT2, and feed the alignments into StringTie for abundance counts. That way we are only including the transcripts that align to those specific exons. I'm not looking for differential expression, only a count of how many times each exon is found within each muscle type's transcript file.
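The extraction step could look something like the sketch below. It assumes the composite is available as a nucleotide (cDNA) string and that the exon boundaries are known as 1-based, inclusive coordinates; the toy sequence, exon names, and coordinates are made up for illustration.

```python
def extract_exons(composite: str, exons: dict[str, tuple[int, int]]) -> dict[str, str]:
    """Slice each exon out of the composite (1-based, inclusive coordinates)."""
    return {name: composite[start - 1:end] for name, (start, end) in exons.items()}


def to_fasta(records: dict[str, str], width: int = 60) -> str:
    """Format name -> sequence pairs as FASTA text, wrapping lines at `width`."""
    lines = []
    for name, seq in records.items():
        lines.append(f">{name}")
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"


# Toy composite and hypothetical exon coordinates, not real data.
composite = "ATGGCCAAGTTTGGGCCCAAATAG"
exons = {"exon_A": (4, 9), "exon_B": (13, 18)}
fasta = to_fasta(extract_exons(composite, exons))
print(fasta, end="")
```

The resulting FASTA text would then be saved to a file and indexed before aligning.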
After StringTie, I was thinking of passing the HISAT2 BAM file and the GTF file (from StringTie) to something like htseq-count or featureCounts. Is something like this doable? It's really only for a figure we're trying to construct, showing how many times transcripts within a muscle type map to a certain exon within one protein sequence.
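For the counting step, featureCounts also accepts a simple tab-delimited SAF annotation (columns GeneID, Chr, Start, End, Strand) in place of a GTF. Below is a minimal sketch of building one, assuming the reads were aligned to the individually extracted exons so that each exon is its own reference sequence spanning position 1 to its length. The exon names and lengths are hypothetical.

```python
def saf_from_lengths(exon_lengths: dict[str, int], strand: str = "+") -> str:
    """Build SAF text with one feature per exon covering its whole reference."""
    rows = ["GeneID\tChr\tStart\tEnd\tStrand"]
    for name, length in exon_lengths.items():
        # Chr must match the reference name in the BAM header; here that is
        # assumed to be the exon name used in the FASTA file.
        rows.append(f"{name}\t{name}\t1\t{length}\t{strand}")
    return "\n".join(rows) + "\n"


# Hypothetical exon lengths in bases, not real data.
exon_lengths = {"exon_A": 120, "exon_B": 87}
saf = saf_from_lengths(exon_lengths)
print(saf, end="")
```

Saved as, say, exons.saf, this could be passed to featureCounts with `-F SAF -a exons.saf`, plus `-p` for paired-end data.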