I have a GTF file with the following layout, containing 'transcript' features (i.e. the full length of the transcript) and the exons within each transcript. For example:
C7123483 cam transcript 1 8268 . + . gene_id "00001"; transcript_id "00001";
C7123483 cam exon 1 206 . + . gene_id "00001"; transcript_id "00001";
C7123483 cam exon 263 749 . + . gene_id "00001"; transcript_id "00001";
Since this file only contains coordinates for the transcripts and their exons, I would also like it to include the intron coordinates. Presumably each intron would start one base after the end of the previous exon and end one base before the start of the next exon. Does anyone have experience doing this? Are there any tools that do this automatically? I am struggling to write a script myself.
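To make the calculation concrete, here is a rough Python sketch of the idea, not a tested tool. It assumes GTF coordinates are 1-based and inclusive, that exons of one transcript share the same transcript_id, and that fields are separated by whitespace as in the example above. The function names `parse_attributes` and `derive_introns` are made up for illustration.

```python
import re
from collections import defaultdict


def parse_attributes(field):
    """Turn 'gene_id "00001"; transcript_id "00001";' into a dict."""
    return dict(re.findall(r'(\S+) "([^"]*)"', field))


def derive_introns(gtf_lines):
    """Return intron records (as tab-separated GTF lines) per transcript.

    Assumption: coordinates are 1-based and inclusive, so an intron runs
    from (previous exon end + 1) to (next exon start - 1).
    """
    exons = defaultdict(list)
    for line in gtf_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split the first 8 fields on any whitespace; the 9th field
        # (attributes) keeps its internal spaces.
        cols = line.split(None, 8)
        if len(cols) < 9 or cols[2] != "exon":
            continue
        attrs = parse_attributes(cols[8])
        key = (cols[0], cols[6], attrs.get("transcript_id"))
        exons[key].append((int(cols[3]), int(cols[4]), cols[1], cols[8]))

    introns = []
    for (seqname, strand, _tx_id), records in exons.items():
        records.sort()  # order exons by start coordinate
        for (_, prev_end, source, attr), (next_start, _, _, _) in zip(
            records, records[1:]
        ):
            start, end = prev_end + 1, next_start - 1
            if start <= end:  # skip adjacent or overlapping exons
                introns.append(
                    "\t".join(
                        [seqname, source, "intron", str(start), str(end),
                         ".", strand, ".", attr]
                    )
                )
    return introns


example = [
    'C7123483 cam transcript 1 8268 . + . gene_id "00001"; transcript_id "00001";',
    'C7123483 cam exon 1 206 . + . gene_id "00001"; transcript_id "00001";',
    'C7123483 cam exon 263 749 . + . gene_id "00001"; transcript_id "00001";',
]
for record in derive_introns(example):
    print(record)
```

For the two example exons (1-206 and 263-749) this yields a single intron from 207 to 262, carrying the same gene_id and transcript_id attributes as the exons.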
I need the exon/intron coordinates because I have another BED file whose coordinates I need to match with the exon/intron/transcript_id/gene_id information from the GTF file.
I hope this makes sense - I am very new to bioinformatics, and any help would be very much appreciated.