A few programs let users keep only the longest isoform of each gene in a genome annotation. But the nature of isoforms suggests to me that this filtering is better done on the transcriptome, unless I misunderstand. On this thread, the questioner was directed to resources (such as the Trinity wiki) for eliminating isoforms from the transcriptome.
So my question is: When annotating a genome, does removing isoforms make more sense pre-annotation with the transcriptome, or could someone annotate a genome using an unfiltered transcriptome and later filter the final annotation?
Background: I recently asked the community for help because I found a high proportion of duplicated BUSCOs in my final genome annotation (assessing the exons in transcriptome mode with BUSCO 5.2.2), whereas the BUSCO scores for my genome assembly were great, with only a small percentage of duplicates. Helpful folks suggested tools to remove short isoforms from my annotations, but looking into it alerted me that the BUSCO results for my transcriptomes also had a high proportion of duplicates.
So I wonder whether I need to go back and fix the transcriptome and then redo the annotations, or whether removing isoforms post-annotation suffices. Any ideas are much appreciated.
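To make the post-annotation option concrete, here is a rough Python sketch of what I understand by "keep only the longest isoform per gene". It is not the method of any particular tool. It assumes a GFF3 annotation in which transcripts are `mRNA` or `transcript` features with `ID` and `Parent` attributes, and exons point to their transcript via `Parent`. The function names are made up for illustration, and "longest" is taken to mean the largest summed exon length, which is only one possible definition.

```python
from collections import defaultdict


def parse_attributes(field):
    """Turn a GFF3 attribute column ('ID=t1;Parent=g1') into a dict."""
    attrs = {}
    for item in field.strip().split(";"):
        if "=" in item:
            key, value = item.split("=", 1)
            attrs[key.strip()] = value.strip()
    return attrs


def longest_isoform_per_gene(gff3_lines):
    """Return {gene_id: transcript_id} for the longest isoform of each gene.

    Assumption: length = summed exon length; ties keep the first transcript seen.
    """
    tx_to_gene = {}               # transcript ID -> parent gene ID
    exon_len = defaultdict(int)   # transcript ID -> total exon length

    for line in gff3_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, comments and directives
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            continue  # not a standard 9-column feature line
        ftype, start, end = cols[2], int(cols[3]), int(cols[4])
        attrs = parse_attributes(cols[8])
        if ftype in ("mRNA", "transcript") and "ID" in attrs and "Parent" in attrs:
            tx_to_gene[attrs["ID"]] = attrs["Parent"]
        elif ftype == "exon" and "Parent" in attrs:
            # An exon may be shared by several transcripts (comma-separated).
            for parent in attrs["Parent"].split(","):
                exon_len[parent] += end - start + 1  # GFF3 coordinates are inclusive

    best = {}  # gene ID -> (transcript ID, length)
    for tx, gene in tx_to_gene.items():
        length = exon_len.get(tx, 0)
        if gene not in best or length > best[gene][1]:
            best[gene] = (tx, length)
    return {gene: tx for gene, (tx, _) in best.items()}
```

Calling `longest_isoform_per_gene(open("annotation.gff3"))` (hypothetical file name) would give the transcript IDs to keep; the annotation would then be subset to those transcripts and their child features before rerunning BUSCO.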