Glimmer output file
6.2 years ago
bitpir ▴ 240

Hi,

I don't quite understand the .predict output file from the Glimmer 3.02 ORF predictor.

Here's a sample of the output file:

ref|NC_023013.1| Haloarcula hispanica N601 chromosome 1, complete sequence
orf00001        1     1575  +1    18.88
orf00003     2355     1645  -1    12.87

According to the documentation, column 1 = ID, column 2 = start of gene, column 3 = stop of gene, column 4 = reading frame, and column 5 = the per-base “raw” score of the gene.
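
For anyone who wants to work with these columns programmatically, here is a minimal Python sketch of reading them. It assumes whitespace-separated fields and treats any line that does not have exactly five fields (such as the sequence header) as a non-record. The names `Prediction` and `parse_predict` are made up for this illustration and are not part of Glimmer. The `length` value is simply the span between the two coordinates, so it would be wrong for a gene that wraps around the origin of a circular chromosome.

```python
from typing import Iterable, List, NamedTuple


class Prediction(NamedTuple):
    """One record from a .predict file (hypothetical helper type)."""
    orf_id: str
    start: int
    end: int
    frame: str        # kept as text, e.g. "+1" or "-1"
    raw_score: float

    @property
    def length(self) -> int:
        # Span in bases between the two coordinates, inclusive.
        # Assumption: the gene does not wrap around the sequence origin.
        return abs(self.end - self.start) + 1


def parse_predict(lines: Iterable[str]) -> List[Prediction]:
    """Collect the five-column records and skip everything else."""
    predictions = []
    for line in lines:
        fields = line.split()
        if len(fields) != 5:
            continue  # header or blank line
        try:
            predictions.append(
                Prediction(fields[0], int(fields[1]), int(fields[2]),
                           fields[3], float(fields[4]))
            )
        except ValueError:
            continue  # five tokens, but not a numeric record
    return predictions


sample = """\
ref|NC_023013.1| Haloarcula hispanica N601 chromosome 1, complete sequence
orf00001        1     1575  +1    18.88
orf00003     2355     1645  -1    12.87
"""

predictions = parse_predict(sample.splitlines())
for p in predictions:
    print(p.orf_id, p.frame, p.length, p.raw_score)
```

To read a real file, pass an open file handle instead of `sample.splitlines()`.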

My questions are:

  1. To calculate the ORF score (100 * log-odds ratio) of the gene, do I multiply column 5 by the length of the gene?
  2. Is there a good threshold (either for column 5 or for the calculated score) for deciding whether a predicted ORF is likely to be real?

Thanks for the help!

ORF prediction Glimmer3 • 2.1k views

My question is: 1. 2.

I see more than one question :-)


Good catch! I thought of another question but forgot to fix the grammar! :)


This may be a bit pragmatic, but it all really depends on how you run Glimmer.

Plain glimmer3 runs often underpredict genes and don't get the start codon right. The included iterative workflow builds a first model, derives a PWM for the most likely Shine-Dalgarno site along with a better estimate of the start codon distribution, and then reruns Glimmer using this information. The resulting gene model is far more accurate than the initial one.


Thanks for the info. With the iterative workflow, I often run into a problem generating the PWM: it works for some files but not for others. Is this common?
