Question: Tool: Exon file creation for GlimmerHMM training set
3.9 years ago by
tanvi.phaltane70 wrote:

I want to carry out gene prediction for an isolated strain of the fungus Cochliobolus sativus. As no fungal training model is available in GlimmerHMM, I am creating one using genomic data from JGI for C. sativus ND90Pr, C. victoriae, C. miyabeanus ATCC 44560 v1.0, C. lunatus, C. heterostrophus and C. carbonum.

When I execute

trainGlimmerHMM <multifasta_file> <exon_file>

I get an error for specific lines in my dummy exon file. From what I can see, the error occurs only for reverse-strand lines. As described in the README file, I have separated the genes with a blank line and listed the reverse-strand co-ordinates in descending order. The error I get is:

ERROR 27: Wrong exon coordinates file. Exon file line: scaffold_0 exon 3002 2420

Below is the dummy exon file:

scaffold_0 3002 2420
scaffold_0 2422 2420
scaffold_0 3933 3078
scaffold_0 4219 3995
scaffold_0 4304 4267
scaffold_0 4397 4357
scaffold_0 4699 4450
scaffold_0 5213 5115
scaffold_0 5575 5264
scaffold_0 5724 5633
scaffold_0 5812 5778
scaffold_0 5921 5864
scaffold_0 5921 5919

scaffold_0 6144 6190
scaffold_0 6144 6146
scaffold_0 6247 6394
scaffold_0 6452 6598
scaffold_0 6596 6598

scaffold_0 7222 7310
scaffold_0 7222 7224
scaffold_0 7365 7461
scaffold_0 7526 7927
scaffold_0 7925 7927

scaffold_0 8253 9230
scaffold_0 8253 8255
scaffold_0 9228 9230

If I run the 'train' command with only the forward-strand exon co-ordinates, the training set is created successfully. Can anyone please point out where I am going wrong?
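One way to see which blocks of the exon file are forward or reverse strand, and whether any entries within a gene overlap each other, is to split the file on blank lines and compare the two co-ordinates on each line. The following is only a minimal Python sketch: the function names `parse_genes`, `strand_of` and `overlaps` are made up for illustration, and it assumes the three-column layout shown above (sequence name, start, end).

```python
def parse_genes(text):
    """Split exon-file text into genes (blocks separated by blank lines)."""
    genes, current = [], []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            if current:
                genes.append(current)
                current = []
            continue
        name, start, end = fields  # assumes exactly three columns per line
        current.append((name, int(start), int(end)))
    if current:
        genes.append(current)
    return genes


def strand_of(gene):
    """Return '+', '-' or 'mixed' based on the coordinate order of each line."""
    signs = {"+" if start <= end else "-" for _, start, end in gene}
    return signs.pop() if len(signs) == 1 else "mixed"


def overlaps(gene):
    """Return pairs of entries in one gene whose intervals overlap."""
    spans = [(min(s, e), max(s, e)) for _, s, e in gene]
    return [(a, b) for i, a in enumerate(spans) for b in spans[i + 1:]
            if a[0] <= b[1] and b[0] <= a[1]]


sample = """scaffold_0 3002 2420
scaffold_0 2422 2420

scaffold_0 6144 6190
scaffold_0 6144 6146
scaffold_0 6247 6394
"""

for gene in parse_genes(sample):
    print(strand_of(gene), overlaps(gene))
```

On the full dummy file this would flag short entries such as `2422 2420` and `5921 5919` as overlapping the longer entry next to them. Whether GlimmerHMM objects to such entries is not something this sketch can tell you; it only makes the structure of the file easier to inspect.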

3.9 years ago by
United States
Avi70 wrote:

Can you check the length of scaffold_0 in your multifasta_file?


The length of scaffold_0 is 870365 bases.

Reply written 3.9 years ago by tanvi.phaltane70

The error is most probably generated from this file:

Search for "ERROR 27: Wrong exon coordinates file. Exon file line". I am not very good at Perl, so I can't say much, but the relevant line seems to be:

my ($anum,$ex1,$ex2)=/^(\S+)\s*([\>|\<]*\d+)\s*([\>|\<]*\d+\s*)$/;

In this line, one of $anum, $ex1 or $ex2 has not been set properly.
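To see what that pattern accepts, it can be tried outside the training script. Below is a rough Python translation, offered as a sketch only: `EXON_LINE` and `parse_exon_line` are names invented here, and the character class `[><|]` is just a plainer spelling of the `[\>|\<]` in the Perl line. The two test lines are the first entry of the dummy exon file and the line quoted in the error message.

```python
import re

# Python translation of the quoted Perl pattern: a name, then two numbers,
# each optionally prefixed by '<', '>' or '|'.
EXON_LINE = re.compile(r"^(\S+)\s*([><|]*\d+)\s*([><|]*\d+\s*)$")


def parse_exon_line(line):
    """Return (name, ex1, ex2) as strings, or None if the line does not match."""
    m = EXON_LINE.match(line)
    return m.groups() if m else None


for line in ["scaffold_0 3002 2420", "scaffold_0 exon 3002 2420"]:
    print(repr(line), "->", parse_exon_line(line))
```

The three-column line matches, while the line with the extra `exon` field returns `None`, because the pattern expects a number right after the sequence name. That is only an observation about the pattern; it is a guess whether an extra column is what actually triggers the error in the real exon file.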

Hope it helps somehow.

Reply written 3.9 years ago by Avi70