I need to compare old 3' tag RNA-seq data, in which only the last couple of exons are sequenced, with modern RNA-seq data covering the whole gene.
To make the two slightly more comparable, I want to count only reads mapping to the last 4 or so exons of each transcript.
Does anyone know a tool that can filter a GTF or GFF3 file to keep only the last exons of each transcript, or will I have to write something myself? I know quite a few tools, but none have this (admittedly weird) functionality.
Edit - we wrote a Python script here to cover this, thanks Fabian. None of the simple approaches suggested in this thread worked.
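
For anyone landing on this later, here is a minimal sketch of the general idea. It is not the script linked above, just an illustration. It assumes a tab-separated GTF whose exon lines carry a `transcript_id "..."` attribute, and the function name `last_exons` and the parameter `n` are made up for this example. GFF3 stores the exon-to-transcript link differently (a `Parent=` attribute), so it would need a different attribute pattern.

```python
import re
from collections import defaultdict

# Assumption: GTF-style attribute column, e.g. transcript_id "ENST0001";
TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')


def last_exons(gtf_lines, n=4):
    """Return the exon lines for the n most 3' exons of each transcript.

    gtf_lines is any iterable of GTF lines (an open file works).
    Only exon lines are returned; gene/transcript/CDS lines are dropped.
    """
    exons = defaultdict(list)
    for line in gtf_lines:
        line = line.rstrip("\n")
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header comments
        fields = line.split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue
        match = TRANSCRIPT_ID.search(fields[8])
        if match is None:
            continue  # exon without a transcript_id, nothing to group by
        start, strand = int(fields[3]), fields[6]
        exons[match.group(1)].append((start, strand, line))

    kept = []
    for records in exons.values():
        strand = records[0][1]
        # Put exons in transcription order (5' to 3'): ascending start on
        # the + strand, descending start on the - strand.
        records.sort(key=lambda rec: rec[0], reverse=(strand == "-"))
        # The 3' end is now at the end of the list on either strand.
        kept.extend(rec[2] for rec in records[-n:])
    return kept
```

Passing an open GTF file handle and writing the returned lines to a new file, one per line, gives a reduced annotation containing only exon features. Note that "last" is decided per transcript, so a gene with several isoforms can still end up with more than `n` distinct exons once a counter collapses them by gene.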