Hello! I am new to Python and I have a project: I have to extract the introns of all genes from a big GFF3 file. The input looks like this (I cut the last column, with the annotations, to make it clearer):
Chr1 phytozome8_0 gene 11218 12435 . + .
Chr1 phytozome8_0 mRNA 11218 12435 . + .
Chr1 phytozome8_0 five_prime_UTR 11218 11797 . + .
Chr1 phytozome8_0 CDS 11798 12060 . + 0
Chr1 phytozome8_0 CDS 12152 12317 . + 1
Chr1 phytozome8_0 three_prime_UTR 12318 12435 . + .
and I want to get output like this: the whole first line with the gene information, then the number of introns (which is the number of CDS rows minus 1), and the length of each intron, which is the difference between the end of one CDS and the start of the next (in this case 12152-12060). Is there any way to achieve this with a Python script?
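To make the desired output concrete, here is a minimal sketch of the logic described above, not a definitive implementation. It assumes that every CDS row belongs to the most recent gene row and that each gene has a single transcript; a gene with several mRNA rows would need its CDS rows grouped per mRNA, which needs the annotation column. The names summarize_introns and intron_lengths are made up for this illustration.

```python
def intron_lengths(cds):
    """Gaps between consecutive CDS rows, using the definition from the
    post: start of the next CDS minus end of the previous one."""
    cds = sorted(cds)  # order by start so minus-strand genes work too
    # GFF3 coordinates are 1-based and inclusive, so the strict number of
    # intron bases would be one less (nxt[0] - prev[1] - 1).
    return [nxt[0] - prev[1] for prev, nxt in zip(cds, cds[1:])]


def summarize_introns(lines):
    """Return a list of (gene_line, intron_count, intron_lengths).

    Assumption: CDS rows follow the gene row they belong to, and each
    gene has exactly one transcript.
    """
    results = []
    gene_line, cds = None, []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and GFF3 comments/directives
        # Split on whitespace at most 8 times so that a 9th (annotation)
        # column containing spaces stays in one piece.
        fields = line.split(None, 8)
        if len(fields) < 5:
            continue  # not a feature row
        feature = fields[2]
        if feature == "gene":
            if gene_line is not None:  # close the previous gene
                lengths = intron_lengths(cds)
                results.append((gene_line, len(lengths), lengths))
            gene_line, cds = line, []
        elif feature == "CDS" and gene_line is not None:
            cds.append((int(fields[3]), int(fields[4])))
    if gene_line is not None:  # close the last gene in the file
        lengths = intron_lengths(cds)
        results.append((gene_line, len(lengths), lengths))
    return results


sample = """\
Chr1 phytozome8_0 gene 11218 12435 . + .
Chr1 phytozome8_0 mRNA 11218 12435 . + .
Chr1 phytozome8_0 five_prime_UTR 11218 11797 . + .
Chr1 phytozome8_0 CDS 11798 12060 . + 0
Chr1 phytozome8_0 CDS 12152 12317 . + 1
Chr1 phytozome8_0 three_prime_UTR 12318 12435 . + .
""".splitlines()

for gene, count, lengths in summarize_introns(sample):
    print(gene, count, lengths, sep="\t")
```

For a real file, an open file handle can be passed in place of the sample list, since the function only iterates over lines.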