Tutorial: Training the GeneMark-EP+ ab initio tool
4.2 years ago
Juke34 8.5k

You could decide to use the evidence.gff and prothint.gff files from ProtHint.

In my case I already had an evidence-based MAKER annotation, so I decided to use that instead.

But it cannot be used as is, because GeneMark expects Intron, start_codon and stop_codon features, which are absent from the MAKER annotation GFF file.
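
If you want to confirm which feature types your own annotation contains before converting it, a quick count of GFF column 3 is enough. This is only a minimal sketch: the function name list_feature_types is made up for illustration, and maker_annotation.gff is simply the file name used in this post.

```shell
# Hypothetical helper (not part of AGAT or GeneMark): count how often each
# feature type (GFF column 3) occurs, skipping comment and malformed lines.
list_feature_types() {
  awk -F'\t' '!/^#/ && NF >= 3 { print $3 }' "$1" | sort | uniq -c
}

# The guard keeps the snippet harmless when the file is not in the
# current directory.
if [ -f maker_annotation.gff ]; then
  list_feature_types maker_annotation.gff
fi
```

If intron, start_codon or stop_codon do not show up in the counts, the file needs the conversion steps described in this post.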

So here are the steps to cheat successfully:

Prerequisite: AGAT

# add start and stop codons
agat_sp_add_start_and_stop.pl --gff maker_annotation.gff --fasta genome_sm.fa -o maker_annotation_startstop.gff
# add introns
agat_sp_add_introns.pl --gff maker_annotation_startstop.gff  -o maker_annotation_startstop_introns.gff 
# remove useless features
awk '{if($3=="intron" || $3=="start_codon" || $3=="stop_codon") print $0}' maker_annotation_startstop_introns.gff > maker_annotation_startstop_introns_only.gff
# replace intron by Intron, otherwise GeneMark fails (GFF columns are tab-separated; \t requires GNU sed)
sed -i 's/\tintron\t/\tIntron\t/' maker_annotation_startstop_introns_only.gff
# add al_score attribute with value over 0.3 otherwise intron features are thrown away
awk '{print $0";al_score=1"}' maker_annotation_startstop_introns_only.gff > maker_annotation_startstop_introns_only_al_score_tag.gff
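
Before launching GeneMark it can be worth checking that the final file really contains only the three expected feature types and that every line carries an al_score attribute. The sketch below is one way to do that. The function name validate_hints is hypothetical, and it only tests that al_score is present, not that its value is above the 0.3 threshold mentioned above.

```shell
# Hypothetical helper: exit non-zero if the hints file contains anything other
# than Intron/start_codon/stop_codon lines carrying an al_score attribute.
validate_hints() {
  awk -F'\t' '
    /^#/ || NF == 0 { next }   # skip comments and blank lines
    $3 != "Intron" && $3 != "start_codon" && $3 != "stop_codon" {
      printf "line %d: unexpected feature type \"%s\"\n", NR, $3; bad++
    }
    $9 !~ /(^|;)al_score=/ {
      printf "line %d: no al_score attribute\n", NR; bad++
    }
    END { printf "%d problem(s) found\n", bad + 0; exit (bad > 0) }
  ' "$1"
}

# File name as produced by the steps in this post; skipped if absent.
if [ -f maker_annotation_startstop_introns_only_al_score_tag.gff ]; then
  validate_hints maker_annotation_startstop_introns_only_al_score_tag.gff
fi
```

A lowercase intron left over from a failed sed substitution would be reported here as an unexpected feature type.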

Now you can run GeneMark:

/path/to/genemark/gmes_petap.pl \
  --evidence maker_annotation_startstop_introns_only_al_score_tag.gff \
  --training -v \
  --sequence genome_sm.fa \
  --cores 16 \
  --EP maker_annotation_startstop_introns_only_al_score_tag.gff