Good day. I am new to genome annotation. I am running MAKER on a new genome and am planning to use BRAKER to train AUGUSTUS and GeneMark. I have already run BRAKER using the TopHat alignment of my mRNA-seq data to the genome. My question is: how do I then use the BRAKER output in MAKER? Is there a MAKER script that converts the GTF files produced by BRAKER into a format accepted by MAKER in the CTL files? Thank you. I will appreciate any help.
BRAKER1 uses GeneMark-ET to train and evaluate AUGUSTUS, and the final gene predictions are from AUGUSTUS.
You can do two things. First, you can use the model that BRAKER creates in the AUGUSTUS config dir in MAKER's AUGUSTUS run. All you have to do is specify the species name you gave to BRAKER in MAKER's maker_opts.ctl. For example, suppose you ran BRAKER like this:
braker.pl --genome=genome.fasta --bam=reads.bam --species=my_species
This will create a folder named my_species in the config dir of AUGUSTUS. If you use the same AUGUSTUS in MAKER, all you have to do is enter your species in MAKER's maker_opts.ctl, and AUGUSTUS will find it again:
augustus_species=my_species #Augustus gene prediction species model
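Before pointing MAKER at the model, it may be worth checking that the trained species folder is actually visible to the AUGUSTUS that MAKER will call. The sketch below assumes a standard AUGUSTUS layout, where species models sit in a species subfolder of the directory named by the AUGUSTUS_CONFIG_PATH environment variable. The function name species_dir_exists is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if a trained species folder exists.
# Assumes the usual layout <config dir>/species/<species name>.
species_dir_exists() {
    # $1: AUGUSTUS config dir, $2: species name given to BRAKER
    [ -d "$1/species/$2" ]
}

# my_species is the name passed to braker.pl via --species.
if species_dir_exists "${AUGUSTUS_CONFIG_PATH:-}" my_species; then
    echo "found"
else
    echo "missing"
fi
```

If this prints "missing", MAKER is probably using a different AUGUSTUS config dir than the one BRAKER wrote to.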
Alternatively, you can treat the BRAKER annotation as "legacy annotation" in MAKER. In that case, you can use the final .gff from BRAKER1 as pred_gff in MAKER's maker_opts.ctl, and MAKER will edit the exons and introns based on the external evidence you give it. There is an example for this in this protocol.
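As a rough sketch of this second option, the relevant line in maker_opts.ctl could look like the one below. The path is a placeholder, not a real BRAKER output location. MAKER's own comment for pred_gff describes it as taking a GFF3 file, so the BRAKER output may need converting first if it is in another format.

```
#excerpt from maker_opts.ctl; the path is a hypothetical placeholder
pred_gff=/path/to/braker_output.gff3 #ab-initio predictions from an external GFF3 file
```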
I have no idea which of the two approaches is "better". If you need more help with running MAKER/BRAKER, the Supplementary Materials from the BRAKER publication have many useful commands: http://bioinformatics.oxfordjournals.org/content/suppl/2015/11/09/btv661.DC1/supplementary.pdf
Later edit: One caveat you will run into is that MAKER performs repeat masking while BRAKER does not, so, especially with plant species, you are bound to see more false-positive transposon-related genes with BRAKER if you don't run repeat masking first.