Question: How to build a custom model file in GeneMark?
9.6 years ago, Panos (Geneva, Switzerland) wrote:

I want to use GeneMark (2.7d) for gene prediction in a soup of sequences. However, I don't want to use the general bacterial/archaeal model; I want to create a custom model file instead. This is easy in Glimmer (for example, just select a bunch of long ORFs), but I cannot work out how to do it in GeneMark (it's probably the 'probuild' program, but I don't know anything beyond that).

Edit: I downloaded GeneMark from GeneMarkS - Linux64. The unpacked archive contains a program called 'probuild' that is used for building custom models, but the documentation is really poor (or "hidden" somewhere I cannot easily find!). The contents of one of the prebuilt model files look like this:

PHMM 2.5
NAME Aeropyrum_pernix
ORDM 2
ATG_ 0.298
GTG_ 0.279
TTG_ 0.423
CTG_ 0
TAA_ 1
TAG_ 1
TGA_ 1
MINC 40
MAXC 12000
MAXN 12000
NDEC 150
CDEC 300
CDCD 0.0
CD1P 1
CD2P 1
COD1 
0.00780 0.00540 0.00895
...Lots of numbers.....
COD2
...Lots of numbers.....
NONC
...Lots of numbers.....
RBSM
0.132    0.167    0.431    0.270
...Some more numbers...
RBSL 34

RBSD
0.016    0.008    0.024    0.032    0.12    0.174    0.128    0.086    0.094    0.08    0.03    0.022    0.012    0.012    0.012    0.006    0.002    0.006    0.01    0.01    0.008    0.012    0.004    0.008    0.008    0.008    0.002    0.012    0.002    0.012    0.004    0.002    0.01    0.01
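To inspect a prebuilt model like the one above before attempting a custom one, the file can be read with a few lines of Python. This is only a sketch based on the layout visible in the excerpt: it assumes that a tag followed by a value on the same line (e.g. `ORDM 2`) is a scalar setting, and that a tag alone on a line (e.g. `COD1`, `RBSM`) starts a table whose numbers run until the next tag. The function name `parse_model` and the `SAMPLE` string are made up for illustration; this is not a documented GeneMark format specification.

```python
import re

# Matches plain numbers such as "2", "0.298", ".5" or "1e-3".
NUMBER = re.compile(r"^[+-]?(\d+\.?\d*|\.\d+)([eE][+-]?\d+)?$")


def parse_model(text):
    """Split a model file into scalar settings and numeric tables.

    Assumption (inferred from the excerpt only): "TAG value" is a scalar,
    a bare "TAG" opens a table, and rows of numbers belong to the most
    recently opened table.
    """
    scalars, tables = {}, {}
    current = None
    for raw in text.splitlines():
        tokens = raw.split()
        if not tokens:
            continue  # skip blank lines
        if NUMBER.match(tokens[0]):
            if current is None:
                raise ValueError(f"numeric row outside a table: {raw!r}")
            tables[current].extend(float(t) for t in tokens)
        elif len(tokens) == 1:
            current = tokens[0]  # bare tag: start a new table
            tables[current] = []
        else:
            value = " ".join(tokens[1:])
            scalars[tokens[0]] = float(value) if NUMBER.match(value) else value
            current = None
    return scalars, tables


# A shortened copy of the excerpt, with the elided number blocks left out.
SAMPLE = """PHMM 2.5
NAME Aeropyrum_pernix
ORDM 2
ATG_ 0.298
GTG_ 0.279
TTG_ 0.423
COD1
0.00780 0.00540 0.00895
RBSM
0.132    0.167    0.431    0.270
RBSL 34
"""

if __name__ == "__main__":
    scalars, tables = parse_model(SAMPLE)
    print(scalars["NAME"], scalars["ORDM"])
    print({tag: len(values) for tag, values in tables.items()})
```

Reading the file this way makes it easy to compare two models side by side, for instance the start-codon weights (`ATG_`, `GTG_`, `TTG_`), which add up to 1 in the excerpt.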
Tags: prediction, model, gene
Modified 15 months ago by RamRS; written 9.6 years ago by Panos

Can you post a link to the program, and maybe an example of the file you want to generate?

Comment written 9.6 years ago by Paulo Nuin
9.5 years ago, Daniel Swan (Aberdeen, UK) wrote:

GeneMarkS will do this for you. In fact, this is what GeneMarkS is designed to do: self-train its own ORF prediction on an anonymous genome.

To generate the GeneMark.mat files simply run something like:

gmsn.pl -euk your_genome.fasta

(The -euk switch is great for intronless eukaryotes, but probably not for prokaryotes.)

Then try

gm -m GeneMark.mat -R  -lo -op your_genome.fasta

The output options are, of course, up to you. I like an ORF output so that I can easily run genemark2artemis on it.
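The two steps above can be chained from a small Python wrapper. This is a sketch, not a tested pipeline: the flags are copied verbatim from the commands in this answer, `--prok` is the option reported in the follow-up comment below, and it is assumed (as the answer implies) that gmsn.pl leaves GeneMark.mat in the working directory. The function names are invented for illustration.

```python
import shutil
import subprocess


def train_cmd(genome, prokaryote=False):
    """Build the self-training command.

    "-euk" comes from the answer; "--prok" is the option reported in the
    follow-up comment and has not been verified here.
    """
    mode = "--prok" if prokaryote else "-euk"
    return ["gmsn.pl", mode, genome]


def predict_cmd(genome, matrix="GeneMark.mat"):
    """Build the prediction command with the flags quoted in the answer."""
    return ["gm", "-m", matrix, "-R", "-lo", "-op", genome]


def run_pipeline(genome, prokaryote=False):
    """Run training, then prediction, stopping at the first failure."""
    for cmd in (train_cmd(genome, prokaryote), predict_cmd(genome)):
        if shutil.which(cmd[0]) is None:
            raise FileNotFoundError(f"{cmd[0]} was not found on PATH")
        # check=True raises CalledProcessError on a non-zero exit code
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Print the commands rather than running them.
    print(" ".join(train_cmd("your_genome.fasta")))
    print(" ".join(predict_cmd("your_genome.fasta")))
```

Building the argument lists separately from running them makes it easy to print and check the exact commands before anything is executed.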

Modified 15 months ago by RamRS; written 9.5 years ago by Daniel Swan

Thanks a lot, Daniel! I checked gmsn.pl and there appears to be a --prok option, too. When I use it, I get both the mat and the mod files. What's the difference between them? Is *mod intended for use with prokaryotes?

Comment written 9.5 years ago by Panos

The mat is definitely the gene model file. I think it is converted directly from the mod file (an HMM profile, I assume) by mkmat, after the probuild step. Because this is all abstracted away by gmsn, this is just a hunch!

Comment written 9.5 years ago by Daniel Swan