Question: GeneMarkS mod file and questions
5.9 years ago
United States
leebo07310 wrote:

Hi All,

I'm a newbie starting to use gene prediction software and am trying to understand how GeneMarkS works. I was able to process my dataset with GeneMarkS and have been going through the intermediate files (.mod, .lst, and .faa). Can I get some help understanding the formats and how the software works?

  1. What do COD1/COD2 in the .mod file mean? COD seems to stand for coding region and NONC for non-coding region, and the numbers in each section seem to represent transition probabilities for the HMM. Is that correct?
  2. Why are there 64 rows in each of the COD1/COD2/NONC sections? Is it one row per codon (4x4x4 = 64)? If so, does anyone know how the codons are ordered in that file?
  3. Can anyone help me understand what the native and heuristic model parameters mean and how GeneMarkS combines them?
  4. The predicted ORF sequences do not always start with methionine (AUG). Is there a reason for this?
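
To make questions 2 and 4 more concrete, here is a minimal Python sketch of what I have in mind. It only shows that three positions over four bases give 64 triplets, and how one might tally the first codon of each predicted ORF. The base order (`ACGT`), the alphabetical codon ordering, the helper name `start_codon_counts`, and the example sequences are all my own assumptions for illustration; I do not know whether the .mod file actually uses this layout.

```python
from collections import Counter
from itertools import product

# Assumed base order; the .mod file may well use a different one.
BASES = "ACGT"

# 4 x 4 x 4 = 64 triplets, listed here in plain alphabetical order
# (an assumption, not the documented GeneMarkS ordering).
CODONS = ["".join(triplet) for triplet in product(BASES, repeat=3)]


def start_codon_counts(orfs):
    """Count the first three bases of each nucleotide ORF sequence."""
    return Counter(seq[:3].upper() for seq in orfs if len(seq) >= 3)


# Hypothetical example sequences, not real GeneMarkS output.
example_orfs = ["ATGAAATAA", "GTGCCCTGA", "TTGGGGTAG", "ATGTTTTAA"]
counts = start_codon_counts(example_orfs)

print(len(CODONS), CODONS[:4])
print(counts)
```

Running the same tally on the real nucleotide predictions would show which start codons other than ATG appear and how often.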

I tried to answer these questions by going through the papers, but haven't had any luck so far. Any help or guidance would be greatly appreciated!




