I have ~10 genomes from a mammal subfamily. Their contiguity differs by more than two orders of magnitude (N50 ranging from 30 kb to 12 Mb). Some are annotated and some are not, and not all have RNA-seq data available for annotating from scratch. I would like to study gene family evolution in the clade: what would be the best pipeline? Perhaps a good start would be a Cactus-style whole-genome alignment, followed by propagating the annotation from the best-annotated genome to the others? I understand that OrthoFinder requires protein sequences for all species, but many of my genomes are not annotated, so no protein sequences are available for them.
Suggestions will be very welcome!