Incorrect scaffold names GeneMark-ES
6.1 years ago

Hey everyone,

I've recently used GeneMark to predict genes de novo in an assembly I'm working on. I installed GeneMark-ES/ET v.4.33.

The run generated a GTF; however, the first column uses different scaffold names from my own. My scaffolds are simply labelled "scaffold1, scaffold2, scaffold3 etc.". An example of the GeneMark-ES GTF output is below:

1_dna GeneMark.hmm exon 137 283 0 + . gene_id "1_g"; transcript_id "1_t";
1_dna GeneMark.hmm CDS 137 283 . + 1 gene_id "1_g"; transcript_id "1_t";
1_dna GeneMark.hmm exon 307 344 0 + . gene_id "1_g"; transcript_id "1_t";
1_dna GeneMark.hmm CDS 307 344 . + 1 gene_id "1_g"; transcript_id "1_t";
1_dna GeneMark.hmm exon 371 543 0 + . gene_id "1_g"; transcript_id "1_t";

Ideally, I'd want something like this (note that I don't know whether 1_dna maps to scaffold1; this is just an example):

scaffold1 GeneMark.hmm exon 137 283 0 + . gene_id "1_g"; transcript_id "1_t";
scaffold1 GeneMark.hmm CDS 137 283 . + 1 gene_id "1_g"; transcript_id "1_t";
scaffold1 GeneMark.hmm exon 307 344 0 + . gene_id "1_g"; transcript_id "1_t";
scaffold1 GeneMark.hmm CDS 307 344 . + 1 gene_id "1_g"; transcript_id "1_t";
scaffold1 GeneMark.hmm exon 371 543 0 + . gene_id "1_g"; transcript_id "1_t";

My command was simply:

gmes_petap.pl --ES --cores 6 --sequence assembly.fasta

Any tips/insights into what might be happening would be appreciated.

Cheers

annotation ab initio prediction
6.1 years ago

Digging around, I found the file dna.trace in the info directory that GeneMark generates. It contains the mapping between the new scaffold names (e.g. 1_dna) and the original scaffold names.

Not sure why the output didn't map them back to the originals, but the information is there.
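
For anyone who wants to put the original names back into the GTF, here is a rough Python sketch of one way it could be done. The exact layout of dna.trace isn't shown in this thread, so the parser below assumes whitespace-separated columns with the GeneMark name in one column and the original name in another; `new_col` and `old_col` are hypothetical parameters you would set after looking at your own file. The function names are made up for illustration and are not part of GeneMark.

```python
import re


def load_name_map(lines, new_col=0, old_col=1):
    """Build {genemark_name: original_name} from whitespace-separated lines.

    ASSUMPTION: the column positions are guesses; inspect dna.trace and
    adjust new_col / old_col to match what is actually in the file.
    """
    mapping = {}
    for line in lines:
        if line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) <= max(new_col, old_col):
            continue  # skip blank or short lines
        mapping[fields[new_col]] = fields[old_col]
    return mapping


def rename_gtf(lines, mapping):
    """Yield GTF lines with the first column swapped via the mapping.

    Comment lines, blank lines, and names missing from the mapping are
    passed through unchanged.
    """
    for line in lines:
        match = None if line.startswith("#") else re.match(r"(\S+)(.*)", line, re.S)
        if match is None:
            yield line
            continue
        name, rest = match.groups()
        yield mapping.get(name, name) + rest


if __name__ == "__main__":
    # Made-up example data, only to show the call pattern.
    trace = ["1_dna scaffold1\n"]
    gtf = ['1_dna\tGeneMark.hmm\texon\t137\t283\t0\t+\t.\tgene_id "1_g"; transcript_id "1_t";\n']
    for out_line in rename_gtf(gtf, load_name_map(trace)):
        print(out_line, end="")
```

With real files you would pass open file handles instead of the lists above and write the yielded lines to a new GTF. It is worth spot-checking a few scaffolds by hand afterwards, since everything here depends on the column assumption being right.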
