Question: GeneMark-ES gene finding with a reference and annotation
I have generated a new fungal assembly for a species that has been assembled before, and I am trying to transfer the annotations from the old assembly to the new assembly using GeneMark-ES within QUAST. The QUAST manual says "If a gene file is provided with -G as well, both # genes in the file covered by the assembly, and # predicted genes are reported." So it sounds like QUAST can link the genes in the publicly available annotation to predicted genes in my assembly, yet I cannot find any output saying which predicted gene corresponds to which annotated gene. Is there anything I can do within QUAST to get the pre-existing annotations to show up in the predicted genes? I am wondering if something is formatted incorrectly in my input GFF3 file. The 3rd column says "gene", and the attributes are "Name", "locus_tag" and "gene".

I should add that I expect the genomes to be very similar. I previously mapped my Illumina reads to the old assembly with a ~90% mapping rate.

I would greatly appreciate the help!

I don't have much experience with using GeneMark-ES and QUAST to 'transfer' annotations from one genome version to the other. However, I think there are other (more suitable) approaches to accomplish this.

Have a look at RATT and/or liftOver (for liftOver you will need a chain file that links your old assembly to the new one). They usually do a pretty good job of transferring annotations.

Do be aware of the 'risks/pitfalls' of such an approach ;-)
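
On the GFF3 formatting worry: before blaming the tools, it can help to tally what is actually in the file. Below is a minimal sketch, assuming a standard tab-separated, 9-column GFF3; the function name `tally_gff3` is made up for illustration and is not part of QUAST or any other tool. It only counts feature types (column 3) and attribute keys (column 9), so it will not tell you what QUAST expects, only what you are feeding it.

```python
from collections import Counter

def tally_gff3(lines):
    """Count feature types (column 3) and attribute keys (column 9) in GFF3 lines."""
    types = Counter()
    attr_keys = Counter()
    for line in lines:
        if line.startswith("##FASTA"):
            break  # an embedded sequence section follows; no more features
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, comments and directives
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            # e.g. spaces instead of tabs, or a truncated line
            types["<malformed>"] += 1
            continue
        types[cols[2]] += 1
        for pair in cols[8].split(";"):
            if "=" in pair:
                attr_keys[pair.split("=", 1)[0].strip()] += 1
    return types, attr_keys
```

Calling it on the lines of your annotation file (for example via `open(...)` on whatever your file is called) shows at a glance whether every record really is a 9-column "gene" feature with the attributes you expect, or whether some lines are malformed.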

Thanks for your reply. I think I was misinterpreting the QUAST manual. I think that after I lift over an annotation so it corresponds to the new assembly, I could feed both the new assembly and the lifted-over annotation into QUAST and have it report those genes plus newly predicted genes that aren't in the lifted-over annotation file.

I'm looking into liftOver and CrossMap right now. Unfortunately, the vast majority of reference annotations are going unmapped to the new assembly. Do you know how I can get more information about the risks/pitfalls? How can I assess how good the alignment between the assemblies in my chain file is?

My assemblies may have some major differences because the organism is a unicellular fungus, but I can see from the synteny plot that there are long contiguous sequences preserved in the new assembly. I would expect more than 2% of genes to lift over!

Well, you just described pitfall #1: it might not be straightforward to link your old assembly to the new one. (I think RATT works on a gene basis and is less influenced by this issue.)

If you can see obvious synteny between both assemblies, you would indeed expect more genes to lift over. Perhaps something is off with your chain file?
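
To get a rough feel for the chain file, you could sum the aligned bases per old-assembly sequence and check that the sequence names in your GFF3 actually occur in the chain. The following is only a sketch, assuming the standard UCSC chain layout (header `chain score tName tSize tStrand tStart tEnd qName qSize qStrand qStart qEnd id`, followed by `size dt dq` block lines) with the old assembly as the target; the function names are hypothetical. Overlapping chains get counted more than once, so treat the totals as an upper bound rather than true coverage.

```python
from collections import defaultdict

def summarize_chain(lines):
    """Return {tName: (aligned_bases, tSize)} from the lines of a chain file."""
    aligned = defaultdict(int)  # old-assembly sequence -> ungapped aligned bases
    sizes = {}                  # old-assembly sequence -> sequence length
    current = None
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # blank line separates chains
        if fields[0] == "chain":
            current = fields[2]            # tName
            sizes[current] = int(fields[3])  # tSize
        elif current is not None:
            aligned[current] += int(fields[0])  # ungapped block size
    return {name: (aligned[name], sizes[name]) for name in sizes}

def seqids_missing_from_chain(gff_lines, summary):
    """Return GFF3 seqids (column 1) that never appear as a chain target."""
    seqids = set()
    for line in gff_lines:
        if line.startswith("##FASTA"):
            break
        if line.strip() and not line.startswith("#"):
            seqids.add(line.split("\t")[0])
    return seqids - set(summary)
```

If `seqids_missing_from_chain` returns most of your sequence names, the problem is probably a naming mismatch between the annotation and the chain (for example different contig IDs) rather than real divergence. If the names match but the aligned fraction per sequence is tiny, the alignment that produced the chain is the more likely culprit.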

