How to fetch intron coordinates from a transcriptome.gtf file? Is there a direct tool?
Hello

Is there a direct tool to fetch intron coordinates from a transcriptome.gtf file?

Thank you

It's simple to parse. Take the coordinates annotated as gene (or transcript) and those annotated as exon, then take the complement: gene/transcript minus exon gives the introns.

If you want the exact intron ranges, you would need to modify the above and work per transcript. Doing it per gene would give you the regions that are intronic in all transcripts of the gene, but not proper introns per se.
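The per-transcript approach can be sketched in a few lines of Python. This is only a minimal sketch, not a tested tool: it assumes a standard 9-column, tab-separated GTF in which exon lines carry a `transcript_id "..."` attribute, and the function name `introns_from_gtf` is made up for illustration.

```python
import re
from collections import defaultdict


def introns_from_gtf(lines):
    """Yield (chrom, start, end, strand, transcript_id) for each intron.

    Coordinates are 1-based and inclusive, as in GTF.
    Assumes exon records carry a transcript_id attribute.
    """
    # (chrom, strand, transcript_id) -> list of (start, end) exon spans
    exons = defaultdict(list)
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip comments and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue  # only exon features are needed
        match = re.search(r'transcript_id "([^"]+)"', fields[8])
        if match is None:
            continue  # exon without a transcript_id: ignore
        key = (fields[0], fields[6], match.group(1))
        exons[key].append((int(fields[3]), int(fields[4])))

    for (chrom, strand, tid), spans in exons.items():
        spans.sort()
        # An intron is the gap between two consecutive exons of one transcript.
        for (_, prev_end), (next_start, _) in zip(spans, spans[1:]):
            if next_start - prev_end > 1:  # skip abutting or overlapping exons
                yield chrom, prev_end + 1, next_start - 1, strand, tid
```

To use it, pass an open file handle, e.g. `for intron in introns_from_gtf(open("transcriptome.gtf")): print(*intron, sep="\t")`. The output is in genomic order per transcript; for minus-strand transcripts you may want to reverse it if intron numbering matters.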

I have computed intergenic minus exon to get intron coordinates. Is that wrong?

In a sense, yes, it is wrong, because it doesn't capture the biological complexity of alternative splicing. If you have simple gene models where each gene has a single annotated transcript, then it would suffice, but only because the gene models do not capture the real biology well.

Imagine these hypothetical transcripts for the same gene:

T1: oo     t1e1]--------[        t1e2]----------[  t1e3  ooo>
T2: ooo    t2e1]--------[    t2e2]-----[ t2e3]--[ t2e4 oo>

[  ]: internal exon boundaries
--: intronic sequence


If you look at genes in a genome browser you will find even more complex cases. Therefore, it becomes very clear that introns depend on the actual transcript.
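To make the difference concrete, here is a small Python sketch with two transcripts of the same gene. The coordinates are invented for illustration only (they are not taken from any annotation), and `gaps` is a hypothetical helper name.

```python
def gaps(spans):
    """Return the 1-based inclusive gaps between spans after merging overlaps."""
    merged = []
    for start, end in sorted(spans):
        if merged and start <= merged[-1][1] + 1:
            # overlapping or abutting span: extend the previous one
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [(a[1] + 1, b[0] - 1) for a, b in zip(merged, merged[1:])]


# Hypothetical exon coordinates for two transcripts of one gene
t1 = [(100, 200), (300, 450), (600, 700)]
t2 = [(100, 200), (300, 400), (480, 520), (560, 700)]

# Per transcript: the real introns of each isoform
per_transcript = {"T1": gaps(t1), "T2": gaps(t2)}

# Per gene: gene span minus the union of all exons
per_gene = gaps(t1 + t2)

print(per_transcript)
print(per_gene)
```

With these made-up coordinates, T1 has the introns 201-299 and 451-599, and T2 has 201-299, 401-479 and 521-559. The per-gene complement gives 201-299, 451-479 and 521-559: the region 451-479 is not an intron of either transcript, and T1's intron 451-599 is lost entirely.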

If GFF/GTF output format is OK, you can use agat_sp_add_introns.pl from AGAT.

