Hi, this is my first time using the MAKER genome annotation pipeline.
I recently finished MAKER's first round and was surprised by the results (I was expecting better).
I used minimap2 to align a de novo transcriptome to the reference genome, and let MAKER align known crustacean protein sequences and mRNA sequences of my species from NCBI.
Before running MAKER, I used BUSCO to evaluate my de novo transcriptome assembly and the genome (using MetaEuk):
Transcriptome: C:99.6%[S:7.4%,D:92.2%],F:0.2%,M:0.2%,n:1013
Genome: C:88.5%[S:37.7%,D:50.8%],F:7.8%,M:3.7%,n:1013
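To put the percentages in terms of BUSCO counts, here is a minimal Python sketch that parses the one-line summaries above. The function name `busco_counts` and the regular expression are my own (hypothetical, not part of BUSCO), and the counts are only approximate because they are recomputed from rounded percentages.

```python
import re

# Matches a BUSCO short-summary line such as
# "C:88.5%[S:37.7%,D:50.8%],F:7.8%,M:3.7%,n:1013"
BUSCO_RE = re.compile(
    r"C:(?P<C>[\d.]+)%\[S:(?P<S>[\d.]+)%,D:(?P<D>[\d.]+)%\],"
    r"F:(?P<F>[\d.]+)%,M:(?P<M>[\d.]+)%,n:(?P<n>\d+)"
)


def busco_counts(summary):
    """Convert a BUSCO one-line summary into approximate BUSCO counts.

    Hypothetical helper: counts are rounded from the reported percentages,
    so they can differ by one from the numbers in the full BUSCO report.
    """
    match = BUSCO_RE.search(summary)
    if match is None:
        raise ValueError("not a BUSCO summary line: %r" % summary)
    total = int(match.group("n"))
    counts = {"n": total}
    for key in ("C", "S", "D", "F", "M"):
        counts[key] = round(float(match.group(key)) / 100 * total)
    return counts


transcriptome = busco_counts("C:99.6%[S:7.4%,D:92.2%],F:0.2%,M:0.2%,n:1013")
genome = busco_counts("C:88.5%[S:37.7%,D:50.8%],F:7.8%,M:3.7%,n:1013")
print("genome missing:", genome["M"], "fragmented:", genome["F"])
```

By this arithmetic the genome assembly itself is already missing roughly 37 BUSCOs and has roughly 79 fragmented ones, which is the baseline I am comparing the MAKER predictions against.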
To evaluate the results, I ran BUSCO on all the transcripts MAKER predicted:
Although this is only the first round, what might cause ~160 BUSCOs to be missing from MAKER's predictions?
Can anyone share from their experience whether this is common?
Maybe I was expecting too much and these are actually good first-round results?
Regarding training ab initio annotation tools, would you use BUSCO to train Augustus? I have seen some tutorials that take training sequences from the mRNA annotations created in the first round (with 1000 bp of flanking sequence on each side), while others recommend filtering them first (as in this post: gene set filter/selection for training ab initio annotation tools ) and then going straight to Augustus training.
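To make the first option concrete, this is how I understand the "1000 bp on each side" step, as a minimal Python sketch. The function name `flanked_interval` and its parameters are hypothetical, and I am assuming GFF3-style coordinates (1-based, inclusive) and that the padded region should be clipped at the contig ends.

```python
def flanked_interval(start, end, contig_length, flank=1000):
    """Pad an mRNA interval with flanking sequence on both sides.

    Assumes 1-based, inclusive coordinates as in GFF3. The padded interval
    is clipped so it never extends past either end of the contig.
    """
    if not (1 <= start <= end <= contig_length):
        raise ValueError("interval lies outside the contig")
    padded_start = max(1, start - flank)
    padded_end = min(contig_length, end + flank)
    return padded_start, padded_end


# An mRNA in the middle of a contig gets the full 1000 bp on each side.
print(flanked_interval(5000, 7000, contig_length=100000))
# An mRNA near the contig ends gets clipped flanks.
print(flanked_interval(500, 2000, contig_length=2500))
```

The filtering approach would run before this step, so only the selected gene models are padded and passed on to training.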
Thanks for your consideration and help.