Hi, I have a question regarding intronic regions in transcript data. After the alignment step using HISAT2 and the transcriptome assembly and reconstruction step using StringTie, I get 5 Ballgown coverage files (running with the -e parameter): e_data.ctab, i_data.ctab, t_data.ctab, e2t.ctab and i2t.ctab. I want to know why we get intron files (i_data.ctab and i2t.ctab) from transcriptome data, which consists of processed transcript sequences.
In ideal, theoretical circumstances you would indeed not expect to have introns in your transcript assembly. However, biology is not ideal in real-life situations. It is therefore possible that some of the transcripts you have sampled are not yet fully processed (that is, their introns have not all been spliced out) and will thus still contain some introns.
This should be a rather low fraction of the whole sample; if not, then something is likely wrong with your samples (or you sampled a very specific situation).
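If you want to look at how much read support the introns in your own i_data.ctab have, something like the following could be a starting point. It is only a minimal sketch: it assumes the file is tab-separated with a header row and an rcount column (the layout I understand the Ballgown documentation to describe), the function names are hypothetical, and the sample rows are made up purely to show the assumed layout.

```python
import csv
import io


def read_ctab(handle):
    """Parse a tab-separated table with a header row into a list of dicts."""
    return list(csv.DictReader(handle, delimiter="\t"))


def total_reads(rows, column="rcount"):
    """Sum a numeric column over all rows (column name 'rcount' is assumed)."""
    return sum(float(row[column]) for row in rows)


def count_unsupported(rows, column="rcount"):
    """Count rows whose read count is zero."""
    return sum(1 for row in rows if float(row[column]) == 0)


# Made-up rows, NOT real data; they only illustrate the assumed column layout.
SAMPLE_I_DATA = (
    "i_id\tchr\tstrand\tstart\tend\trcount\tucount\tmrcount\n"
    "1\tchr1\t+\t1201\t1499\t12\t12\t12.0\n"
    "2\tchr1\t+\t1701\t2099\t0\t0\t0.0\n"
)

rows = read_ctab(io.StringIO(SAMPLE_I_DATA))
print("introns listed:", len(rows))
print("total rcount:", total_reads(rows))
print("introns with no reads:", count_unsupported(rows))
```

For a real run you would pass an open file handle for your sample's i_data.ctab to read_ctab instead of the in-memory sample. Treat the numbers as a rough indicator only, and check the column definitions in the Ballgown documentation before reading too much into them.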