How To Compare The Reliability Of Genome Annotation Tools By Evaluating A Genome Of An Organism?
11.2 years ago
spartans300 ▴ 20

So, I've been doing an investigation in which I compare the output of three annotation tools: BLAST, PRIAM and SEED. Each has produced its own EC numbers for the organism Streptococcus agalactiae. What I want to know is how to find which of them gives the correct annotation. For example, BLAST and SEED say that the enzyme at a specific position is EC:4.1.1.21, but PRIAM says it is EC:5.4.99.18. I need to work out which one is correct without being able to do the manual annotation myself. What should I do? Thanks.

genome-annotation • 4.1k views

Technically, the only way to know which annotation is correct is wet-lab work: purify the protein and assay its enzyme activity. I think what you want to do is understand the annotation process in each case and make a good argument for which is more likely to be correct.
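As a starting point for that argument, it can help to tabulate where the tools agree and how deep the disagreement goes in the EC hierarchy. The following is a minimal sketch, not a method from any of the tools: the function names and the dictionary layout are assumptions made for illustration, and the only data used is the example from the question. A majority vote is not proof of correctness, since tools that draw on the same reference annotations can agree for the same reason.

```python
from collections import Counter


def shared_ec_depth(ec_a, ec_b):
    """Count how many leading EC fields (class, subclass, ...) two numbers share."""
    depth = 0
    for a, b in zip(ec_a.split("."), ec_b.split(".")):
        if a != b:
            break
        depth += 1
    return depth


def summarize_calls(calls):
    """Summarize EC calls for one gene.

    calls: dict mapping a tool name to the EC number it assigned
    (hypothetical layout, e.g. parsed from each tool's output by hand).
    """
    counts = Counter(calls.values())
    top_ec, top_votes = counts.most_common(1)[0]
    return {
        "majority_ec": top_ec,
        "votes": top_votes,
        "unanimous": len(counts) == 1,
        "dissenting_tools": sorted(t for t, ec in calls.items() if ec != top_ec),
    }


# The conflicting case described in the question.
calls = {"BLAST": "4.1.1.21", "SEED": "4.1.1.21", "PRIAM": "5.4.99.18"}
summary = summarize_calls(calls)
depth = shared_ec_depth("4.1.1.21", "5.4.99.18")
print(summary)
print("shared EC fields:", depth)
```

A shared depth of 0 means the tools disagree already at the top-level enzyme class, which is a stronger conflict than a difference in the last field only. Cases like this are the ones worth checking by hand against the evidence each tool used.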


OK thanks, yes, that's exactly what I want to do. What kind of properties do you think I should look for? I have been searching Swiss-Prot and MetaCyc to see which annotation fits best, but I'm not sure what I'm looking for that would give the best indication.

11.2 years ago

Could you use something like the Annotation Edit Distance to compare methods?

http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2653490/

EDIT:

Here is another example of how AED is used to compare annotation sets, via the MAKER2 suite.

http://www.biomedcentral.com/1471-2105/12/491

11.2 years ago
spartans300 ▴ 20

I'm not sure how to apply that paper, since they look at the differences between releases of annotation tools, but I need to find which of the given annotations is the correct one and why.

Can't post for another 6 hours so will edit this response:

OK sorry, I'm looking for why BLAST predicted a specific EC number for Streptococcus agalactiae while, at the same position on the gene, PRIAM predicted a different EC number, and which one is right. AED seems very specific to the people in that paper, and I think producing something similar might be beyond my knowledge. Also, it picks up changes from previous releases, whereas I'm looking at the differences between the latest predictions from different annotation tools and why they have come about. Thanks so far for your help.


AED is a way to measure which gene/feature model is a better fit for the data. This should serve your purpose, unless I have misunderstood your question. Also: please use comments rather than "Add your Answer".
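For a sense of what the measure involves, here is a minimal sketch of AED between two feature models. It assumes the nucleotide-level definition AED = 1 - (SN + SP) / 2, where SN is the overlap divided by the reference length and SP is the overlap divided by the prediction length; check that against the linked paper before relying on it. The function names and the coordinates are made up for illustration. Note that this compares coordinates, so it scores gene structure rather than EC number assignments.

```python
def covered_positions(intervals):
    """Expand 1-based inclusive (start, end) intervals into a set of positions."""
    positions = set()
    for start, end in intervals:
        positions.update(range(start, end + 1))
    return positions


def annotation_edit_distance(prediction, reference):
    """Return AED in [0, 1]: 0 means identical coverage, 1 means no overlap.

    Assumed definition: AED = 1 - (SN + SP) / 2, computed per nucleotide.
    """
    pred = covered_positions(prediction)
    ref = covered_positions(reference)
    if not pred or not ref:
        raise ValueError("both models must cover at least one position")
    overlap = len(pred & ref)
    sensitivity = overlap / len(ref)   # fraction of the reference recovered
    specificity = overlap / len(pred)  # fraction of the prediction supported
    return 1 - (sensitivity + specificity) / 2


# Hypothetical coordinates: two 100 bp models that overlap by 50 bp.
reference_model = [(100, 199)]
predicted_model = [(150, 249)]
aed = annotation_edit_distance(predicted_model, reference_model)
print(aed)
```

Set expansion keeps the sketch short but is slow for long features; an interval-based overlap calculation would be the better choice for whole genomes.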
