Question: Braker output into maker (Annotation)
mafireyi (South Africa) wrote, 4.8 years ago:

Good day. I am new to genome annotation. I am running MAKER on a new genome and am planning to use BRAKER to train AUGUSTUS and GeneMark. I have already run BRAKER on the TopHat alignment of my mRNA-seq data to the genome. My question is: how do I then use the BRAKER output in MAKER? Is there a MAKER script that converts the GTF files produced by BRAKER into a format that MAKER accepts in its CTL files? Thank you, I will appreciate any help.

Philipp Bayer wrote, 4.8 years ago:

BRAKER1 uses GeneMark-ET to train and evaluate AUGUSTUS, and the final gene predictions are from AUGUSTUS.

You can do two things. First, you can use the model that BRAKER creates in the AUGUSTUS config dir for MAKER's AUGUSTUS run. All you have to do is specify the species name you gave to BRAKER in MAKER's maker_opts.ctl. For example, say you ran BRAKER like this:

braker.pl --genome=genome.fasta --bam=reads.bam --species=my_species

This will create a folder named my_species in the species subfolder of the AUGUSTUS config dir (the directory that AUGUSTUS_CONFIG_PATH points to). If MAKER uses the same AUGUSTUS installation, all you have to do is enter your species in MAKER's maker_opts.ctl, and AUGUSTUS will find it again:

augustus_species=my_species #Augustus gene prediction species model
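
If you would rather script that edit than open the file by hand, here is a minimal sketch. It assumes a maker_opts.ctl as generated by `maker -CTL`, where the option sits on its own line starting with `augustus_species=`. The helper name set_ctl_option is made up for this example and is not part of MAKER.

```shell
# Hypothetical helper (not part of MAKER): set KEY=VALUE in a control file,
# keeping the trailing "#comment" on that line untouched.
# Assumes the value contains no "|" character (used as the sed delimiter).
set_ctl_option() {
    local file="$1" key="$2" value="$3"
    # Replace everything between "key=" and the first "#" (or end of line).
    sed "s|^${key}=[^#]*|${key}=${value} |" "$file" > "$file.tmp" &&
        mv "$file.tmp" "$file"
}

# Usage (assumes maker_opts.ctl is in the current directory):
# set_ctl_option maker_opts.ctl augustus_species my_species
```

The same helper would work for other single-line options in the control file, since it only touches the line that starts with the given key.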

Alternatively, you can treat the BRAKER annotation as "legacy annotation" in MAKER. In that case, you can use the final .gff from BRAKER1 as pred_gff in MAKER's maker_opts.ctl, and MAKER will edit the exons and introns based on the external evidence you give it. There is an example for this in this protocol.
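
One thing to check before pointing pred_gff at a file: MAKER expects GFF3 there. The sketch below only looks for the GFF3 version line at the top of the file, so treat it as a quick first check rather than a validator. The helper name and the file name in the usage line are made up for illustration.

```shell
# Hypothetical helper: succeed if the file starts with the GFF3 version pragma.
# This only inspects the first line; it does not validate the records.
looks_like_gff3() {
    head -n 1 "$1" | grep -q '^##gff-version[[:space:]]*3'
}

# Usage (file name is an assumption):
# looks_like_gff3 braker_predictions.gff3 ||
#     echo "no GFF3 header found, convert before setting pred_gff" >&2
```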

I have no idea which of the two approaches is "better". If you need more help with running MAKER/BRAKER, the Supplementary Materials from the BRAKER publication have many useful commands.

Later edit: One caveat you will run into is that MAKER performs repeat masking while BRAKER does not, so you're (especially with plant species) bound to see more false-positive transposon-related genes with BRAKER if you don't run repeat masking first.


If you use the first approach (simply using the species parameters produced by BRAKER in MAKER), you are going to lose the RNA-Seq information in the gene prediction step with AUGUSTUS. MAKER can incorporate RNA-Seq, too, but it does so in a different way that is not optimal for AUGUSTUS (at least as far as I am aware, unless there was a serious update that escaped my attention). I therefore recommend the legacy annotation path. This way, you can also pass the GeneMark-ET predictions from BRAKER to MAKER.

From one of the BRAKER developers.

written 3.3 years ago by katharina.hoff

Hi Katharina, are there any arguments against doing both?

MAKER does apparently use hints to improve on the predictions, and I think I have seen numbers (old ones though) stating that AUGUSTUS within MAKER performs better than AUGUSTUS outside. This was likely without RNA-seq data involved.

If you set both augustus_species and pred_gff, the two sets of predictions will compete against each other in a way, and MAKER will choose the one that best fits the evidence.

written 3.2 years ago by o.k.torresen

Please check with the MAKER developers whether MAKER would be able to select the best prediction based on evidence if you do this. It might be problematic if MAKER cannot correctly weight the evidence that was used by BRAKER.

written 3.1 years ago by katharina.hoff

Hi, AUGUSTUS was not able to use the new species produced by BRAKER. Do you have any idea why?

Cheers Luigi

written 4.4 years ago by luigi.faino

Maybe you have two different AUGUSTUS_CONFIG_PATHs?


Either make those two paths identical before running BRAKER, or copy the parameters to the correct location:

cp -r $AUGUSTUS_CONFIG_PATH1/species/yourspecies $AUGUSTUS_CONFIG_PATH2/species
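
A slightly more defensive version of that copy could look like the sketch below. It assumes the two config directories are passed in explicitly. The function name copy_species is made up for this example, and it deliberately refuses to overwrite parameters that already exist at the destination.

```shell
# Hypothetical helper: copy one species' parameter folder from the config
# directory BRAKER wrote to into the one MAKER's AUGUSTUS reads from.
copy_species() {
    local src_cfg="$1" dst_cfg="$2" species="$3"
    if [ ! -d "$src_cfg/species/$species" ]; then
        echo "no parameters for '$species' under $src_cfg/species" >&2
        return 1
    fi
    if [ -d "$dst_cfg/species/$species" ]; then
        echo "'$species' already present under $dst_cfg/species, nothing copied" >&2
        return 0
    fi
    mkdir -p "$dst_cfg/species" &&
        cp -r "$src_cfg/species/$species" "$dst_cfg/species/"
}

# Usage (both paths are placeholders for your two config directories):
# copy_species "$AUGUSTUS_CONFIG_PATH1" "$AUGUSTUS_CONFIG_PATH2" yourspecies
```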

written 3.1 years ago by katharina.hoff

Just noticed that now too. Have you found a solution for that?

written 4.2 years ago by mafireyi

Two ideas:

1) check whether the environment variable that MAKER's AUGUSTUS uses points to the path that BRAKER wrote to

2) check whether you're allowed to write to the AUGUSTUS_CONFIG_PATH
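
Both checks can be sketched in a few lines of shell. This assumes the environment variable in question is AUGUSTUS_CONFIG_PATH and that species parameters live in its species subfolder. The function name is made up for this example.

```shell
# Hypothetical helper: report whether the AUGUSTUS config directory is set,
# exists, and is writable. Takes an optional path; defaults to the
# AUGUSTUS_CONFIG_PATH environment variable.
check_augustus_config() {
    local cfg="${1:-${AUGUSTUS_CONFIG_PATH:-}}"
    [ -n "$cfg" ] || { echo "AUGUSTUS_CONFIG_PATH is not set" >&2; return 1; }
    [ -d "$cfg/species" ] || { echo "$cfg/species does not exist" >&2; return 1; }
    [ -w "$cfg/species" ] || { echo "$cfg/species is not writable" >&2; return 1; }
    echo "ok: $cfg/species"
}

# Usage: run it in the shell that launches BRAKER and again in the one that
# launches MAKER, then compare the paths that are printed.
# check_augustus_config
```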

written 4.0 years ago by Philipp Bayer

Thanks for the explanation. I read in this paper, "Four arguments for not masking your genome before annotation", that masking may not be a good idea. Would you advise going in that direction?

written 5 months ago by eennadi

This would be a good question of its own, but yeah, right now I lean towards not running RepeatMasker via MAKER and instead later removing proteins that are X% covered by known repeats or that contain transposase-related domains.
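
As a rough sketch of the coverage part of that filter: assume you already have a tab-separated table with one line per gene giving its ID, its length in bp, and the number of bp overlapped by known repeats. That table format is an assumption made for this example, not the output of any particular tool, and the threshold X is left as a parameter.

```shell
# Hypothetical input on stdin: "gene_id <TAB> gene_length_bp <TAB> repeat_covered_bp".
# Prints the IDs whose repeat coverage is at least min_pct percent.
repeat_covered_genes() {
    local min_pct="$1"
    awk -F '\t' -v min="$min_pct" '$2 > 0 && 100 * $3 / $2 >= min { print $1 }'
}

# Usage (file name and threshold are placeholders):
# repeat_covered_genes 50 < gene_repeat_overlap.tsv > genes_to_remove.txt
```

The transposase-domain part would need a separate domain search and is not covered here.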

written 5 months ago by Philipp Bayer
mafireyi (South Africa) wrote, 4.8 years ago:

Thank you very much, this has been very helpful
