For example, see this gene (nad1) in ENA: http://www.ebi.ac.uk/ena/data/view/ABI60879
If you look at the XML for that gene you see the following:
join(
DQ984518.1: 324706 .. 325091 ,
complement(DQ984518.1: 24417 .. 24498),
complement(DQ984518.1: 22828 .. 23019),
DQ984518.1: 3484 .. 3542 ,
complement(DQ984518.1: 153702 .. 153960)
)
This shows 5 exons joined out of phase and out of order, with three of them on the complementary strand. Is there a valid GTF representation of this?
How should a 'non-canonically spliced' gene be dumped into GTF, i.e. what is the recommendation?
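To make the question concrete, here is a minimal sketch of the naive approach: parse the join() location into segments and write one GTF exon line per segment, all under a single transcript. The gene_id "nad1", the transcript_id "ABI60879", the source label "ENA", and the use of an exon_number attribute to preserve the join order are all assumptions for illustration, not an established recommendation. Whether downstream tools accept mixed-strand, out-of-order exons under one transcript_id is exactly what is being asked.

```python
import re

# Location string copied from the ENA record above.
LOCATION = """join(
DQ984518.1: 324706 .. 325091 ,
complement(DQ984518.1: 24417 .. 24498),
complement(DQ984518.1: 22828 .. 23019),
DQ984518.1: 3484 .. 3542 ,
complement(DQ984518.1: 153702 .. 153960)
)"""

# One segment: optional "complement(", then accession ":" start ".." end.
SEGMENT = re.compile(r"(complement\()?\s*([\w.]+)\s*:\s*(\d+)\s*\.\.\s*(\d+)")


def parse_join(location):
    """Return (seqid, start, end, strand) tuples in the order given by join()."""
    segments = []
    for match in SEGMENT.finditer(location):
        strand = "-" if match.group(1) else "+"
        segments.append(
            (match.group(2), int(match.group(3)), int(match.group(4)), strand)
        )
    return segments


def to_gtf_lines(segments, gene_id, transcript_id, source="ENA"):
    """Write one exon line per segment; exon_number records the join order.

    The identifiers and the source label are hypothetical choices.
    """
    lines = []
    for number, (seqid, start, end, strand) in enumerate(segments, start=1):
        attributes = (
            f'gene_id "{gene_id}"; transcript_id "{transcript_id}"; '
            f'exon_number "{number}";'
        )
        fields = [seqid, source, "exon", str(start), str(end), ".", strand, ".", attributes]
        lines.append("\t".join(fields))
    return lines


segments = parse_join(LOCATION)
gtf_lines = to_gtf_lines(segments, gene_id="nad1", transcript_id="ABI60879")
for line in gtf_lines:
    print(line)
```

This keeps the exons in join() order rather than sorting them by coordinate, so a consumer would have to honour exon_number instead of assuming genomic order.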
Also, how can I verify that the resulting GTF is valid? By comparing the translation?
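One way the check could work, sketched under assumptions: read the exon lines back from the GTF, rebuild the spliced sequence in exon_number order (reverse-complementing minus-strand exons), and compare it with the sequence from the original record before translating either. The toy genome, the toy GTF lines, and the function names below are hypothetical and only illustrate the round trip. A real check would load DQ984518.1 and the ENA coding sequence instead.

```python
import re

COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Reverse-complement a plain DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def exons_from_gtf(lines):
    """Return (start, end, strand) for exon lines, ordered by exon_number."""
    exons = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 9 or fields[2] != "exon":
            continue
        number = int(re.search(r'exon_number "?(\d+)"?', fields[8]).group(1))
        exons.append((number, int(fields[3]), int(fields[4]), fields[6]))
    return [(start, end, strand) for _, start, end, strand in sorted(exons)]


def splice(genome, exons):
    """Concatenate exons in the given order; GTF coordinates are 1-based, inclusive."""
    parts = []
    for start, end, strand in exons:
        piece = genome[start - 1:end]
        parts.append(reverse_complement(piece) if strand == "-" else piece)
    return "".join(parts)


# Toy data only: a 12 bp "genome" and three exons, one on the minus strand,
# listed out of genomic order as in the nad1 example.
toy_genome = "ATGCCGTAAGGC"
toy_gtf = [
    'toy\tsketch\texon\t1\t3\t.\t+\t.\tgene_id "g"; transcript_id "t"; exon_number "1";',
    'toy\tsketch\texon\t10\t12\t.\t-\t.\tgene_id "g"; transcript_id "t"; exon_number "2";',
    'toy\tsketch\texon\t7\t9\t.\t+\t.\tgene_id "g"; transcript_id "t"; exon_number "3";',
]

spliced = splice(toy_genome, exons_from_gtf(toy_gtf))
print(spliced)
```

Comparing the spliced nucleotide sequence is stricter than comparing translations, since synonymous differences would also be caught.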
Hello Dan!
It appears that your post has been cross-posted to another site: https://bioinformatics.stackexchange.com/questions/849/how-to-represent-trans-spliced-genes-in-gtf/855#855
This is typically not recommended as it runs the risk of annoying people in both communities.