Question: Genome annotation with BRAKER: how to interpret the results?
Asked 13 months ago by svitlana.lukicheva (Brussels, Belgium):


I am new to genome annotation and I am struggling to interpret the results produced by BRAKER.

  • I have a de novo assembly of an insect genome (N50 = 350 kb, length = 1.9 Gb).
  • I masked the repeats using RepeatModeler and RepeatMasker.
  • I mapped the RNA-Seq data from the same species to the (hard) masked genome with HISAT2.
  • I used BRAKER to annotate my (soft masked) genome with the bam file produced by HISAT2.

I have 54,000 entries in the resulting augustus.hints.gff file. That means that Augustus predicted 54k genes, right? We expect between 10k and 20k genes for our species, so I would like to understand why our prediction contains so many.
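One way to check what the 54k figure actually counts is to tally the feature types in the third column of the GFF file, since genes, transcripts and exons each get their own row. The following is a minimal sketch, assuming a standard tab-separated, nine-column GFF with `#` comment lines; the function name `count_feature_types` is hypothetical, and the labels reported (for example `gene`, `transcript` or `mRNA`) are whatever the file itself uses.

```python
import sys
from collections import Counter


def count_feature_types(gff_lines):
    """Tally the feature type (column 3) of every data row in a GFF file.

    Assumes tab-separated GFF with nine columns; comment lines ('#'),
    blank lines, and malformed rows are skipped.
    """
    counts = Counter()
    for line in gff_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 9:
            continue  # not a regular GFF data row
        counts[fields[2]] += 1
    return counts


# Run as: python count_features.py augustus.hints.gff
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1]) as handle:
        for feature_type, n in count_feature_types(handle).most_common():
            print(f"{feature_type}\t{n}")
```

If the number of `gene` rows is well below the number of transcript rows, part of the 54k would be alternative transcripts rather than separate genes.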

Among these 54k entries, 38k entries contain the following information:

# % of transcript supported by hints (any source): 0

Does it mean that these predictions are of poor quality and I should only keep predictions with a significant %?
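To see how the predictions split between supported and unsupported, the comment lines quoted above can be tallied directly. This is a minimal sketch, assuming that each predicted transcript is followed by exactly one comment line of the form `# % of transcript supported by hints (any source): N`; the function names are hypothetical. It only counts, and does not by itself decide which predictions to keep: a prediction with 0% support has no RNA-Seq evidence in this dataset, which may also happen for a real gene that is not expressed in the sampled material.

```python
import re
import sys

# Matches the per-transcript comment line quoted in the question.
SUPPORT_RE = re.compile(
    r"^#\s*% of transcript supported by hints \(any source\):\s*([0-9.]+)"
)


def hint_support_values(lines):
    """Return the hint-support percentage of every transcript, in file order."""
    values = []
    for line in lines:
        match = SUPPORT_RE.match(line)
        if match:
            values.append(float(match.group(1)))
    return values


def summarize_support(values):
    """Count transcripts with zero versus non-zero hint support."""
    unsupported = sum(1 for v in values if v == 0)
    return {
        "total": len(values),
        "unsupported": unsupported,
        "supported": len(values) - unsupported,
    }


# Run as: python hint_support.py augustus.hints.gff
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1]) as handle:
        print(summarize_support(hint_support_values(handle)))
```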

Any other suggestions on how to improve the annotation of my genome are welcome!


That cannot be the only output file, can it? Can you check how many sequences the FASTA output files contain?

Also, what was the exact command you executed?

Reply written 13 months ago by lieven.sterck

Thank you for your reply!

I also have a FASTA file with amino acid sequences and another with coding sequences, both containing 54k genes.

The command I executed is: --cores 16 --species=mySpecies --genome=genome_softmasked.fa --bam=rnaseq_masked_sorted.bam --softmasking --gff3
Reply written 13 months ago by svitlana.lukicheva

