Hi everybody,
I have done a de novo assembly and now I want to annotate it. I have the predicted genes from GeneMarkS, but does anybody know how I can validate those genes? Could I use the coverage of the predicted genes? And after that, is there any program (Linux command line) for creating a GFF file, or do I have to merge the BLAST output (format 6) with the GFF from the gene prediction?
Thanks a lot for any suggestions!
You should have a look at this paper:
Mark Yandell & Daniel Ence Nature Reviews Genetics 13, 329-342 (May 2012) doi:10.1038/nrg3174
There are plenty of tools for genome annotation: BRAKER, MAKER, PASA, etc.
If you want to use an ab initio tool as you did, feed it with evidence such as proteins or transcripts; it will give you much better results. (In your case use genemark_ES_P.)
Validating them is a harder task. You can keep the predictions that have similarity to other sequences in a database (protein or transcript), and keep those that contain a known domain.
Using MAKER could facilitate this task. It adds a score to your GeneMark predictions based on the evidence you fed MAKER with. The more aligned sequences (proteins or transcripts/ESTs) agree with a prediction, the more likely that prediction is to be correct.
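If you would rather do the similarity filter by hand, here is a minimal Python sketch of merging BLAST tabular output (format 6) with the prediction GFF. It is only an illustration under several assumptions: the BLAST file has the 12 default outfmt 6 columns, the GFF attribute column uses GFF3-style `key=value` pairs, and the BLAST query IDs are identical to the GFF `ID` values. Check your GeneMark output, since its attribute key may differ. The function names, the `blast_hit` / `blast_evalue` attribute names and the 1e-5 E-value cutoff are my own choices, not part of any tool.

```python
def best_hits(blast6_text, max_evalue=1e-5):
    """Return {query_id: (subject_id, evalue, bitscore)} with the best hit per query.

    Assumes the 12 default outfmt 6 columns:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    """
    hits = {}
    for line in blast6_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) < 12:
            continue  # not a default format 6 row
        query, subject = cols[0], cols[1]
        evalue, bitscore = float(cols[10]), float(cols[11])
        if evalue > max_evalue:
            continue  # too weak to count as supporting evidence
        if query not in hits or bitscore > hits[query][2]:
            hits[query] = (subject, evalue, bitscore)
    return hits


def gff_attribute(attributes, key):
    """Look up one key in a GFF3-style attribute string such as 'ID=gene_1;Name=x'."""
    for field in attributes.split(";"):
        name, _, value = field.strip().partition("=")
        if name == key:
            return value
    return None


def keep_supported(gff_text, hits, id_key="ID"):
    """Keep only GFF features whose ID has a BLAST hit, and record the hit in column 9."""
    kept = []
    for line in gff_text.splitlines():
        if line.startswith("#"):
            kept.append(line)  # keep header and comment lines
            continue
        cols = line.split("\t")
        if len(cols) != 9:
            continue  # skip blank or malformed lines
        feature_id = gff_attribute(cols[8], id_key)
        if feature_id in hits:
            subject, evalue, _ = hits[feature_id]
            cols[8] = cols[8].rstrip(";") + f";blast_hit={subject};blast_evalue={evalue:g}"
            kept.append("\t".join(cols))
    return "\n".join(kept)
```

You would read both files into strings, call `best_hits` on the BLAST text, then write the result of `keep_supported` to a new GFF file. Predictions without a hit are dropped here; if you prefer to keep them and only flag the supported ones, append the unmodified line in that case instead of skipping it.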
Thanks! That really helps me!