Hi all,
I have already annotated my plant genome using MAKER. I used some of the MAKER scripts to recover the annotations in GFF3 format and the transcripts and proteins as FASTA files. I then used the quality_filter.pl script to remove gene models that could be false positives, and created a new GFF3 file containing only the gene models with an AED score < 0.5. Do you know of a script to recover the proteins and transcripts from this new GFF3 file, which contains only the good models?
Ivan PhD student UNAM
Thank you for your answer. Do I have to install GAAS to run the script, or can I just copy the script?
best
I am trying it and I think everything is running well.
Btw, since you can run tools from that repo, I suggest you use maker_merge_outputs_from_datastore.pl to merge the output from MAKER.
Hi Juke,
MAKER has a very similar script (gff_merge.pl) to merge all the outputs. BTW I have a question: do you know why my protein sequences have a * at the end of the sequence?
Try both and you will see the difference ;). One reason I prefer maker_merge_outputs_from_datastore.pl is that the result is more complete: it collects all tracks, not only the MAKER annotation; it collects the proteins and transcripts too in one go; it backs up the .ctl files, because that information is usually lost; and it computes statistics on the annotation. The main reason, though, is that the script looks directly in the genome_datastore folder, while gff_merge.pl goes through the master_datastore_index.log file. This is important because I have often seen, when using MPI, that the master_datastore_index.log file was 'corrupted', and consequently some annotations were missing.
Btw from GAAS you could also use the maker_check_progress.sh tool to be sure that your annotation has really completed successfully. Just launch it in the folder where you have run MAKER.
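If you want a quick look at the index log yourself, here is a minimal Python sketch. It assumes the log has three tab-separated columns (sequence name, output directory, status) and that a completed contig has a line with the status FINISHED; the function name is made up for illustration. It is not a replacement for maker_check_progress.sh, and if the log itself is corrupted it will of course be misleading too.

```python
def unfinished_contigs(log_lines):
    """Return sorted names of contigs that never reached the FINISHED status.

    Assumes each line is: <sequence name> TAB <directory> TAB <status>.
    """
    seen = set()
    finished = set()
    for line in log_lines:
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 3:
            continue  # skip blank or malformed lines
        name, status = cols[0], cols[2]
        seen.add(name)
        if status == "FINISHED":
            finished.add(name)
    return sorted(seen - finished)
```

You would call it with something like `unfinished_contigs(open(path_to_log))`, where `path_to_log` points at your own master_datastore_index.log file.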
I have already used both scripts, and the script from GAAS is terrific; I got a lot of information. There is a part that reports the gene coverage percent, and in my case the number was 10. Does this mean that the genes cover only 10% of the total genome size?
Best
Hi imda, yes it does.
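To make the number concrete, here is a rough Python sketch of one way such a percentage can be computed from a GFF3 file. I am not claiming this is exactly how the GAAS script does it: here overlapping genes are counted only once, and `genome_size` is a value you supply yourself (the total length of your assembly in bp). The function name is made up.

```python
from collections import defaultdict

def gene_coverage_percent(gff_lines, genome_size):
    """Percent of genome_size covered by 'gene' features, overlaps counted once."""
    intervals = defaultdict(list)
    for line in gff_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip comments and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != "gene":
            continue
        # GFF3 coordinates are 1-based and inclusive
        intervals[cols[0]].append((int(cols[3]), int(cols[4])))

    covered = 0
    for spans in intervals.values():
        spans.sort()
        cur_start, cur_end = spans[0]
        for start, end in spans[1:]:
            if start <= cur_end:
                cur_end = max(cur_end, end)  # overlapping gene: extend the block
            else:
                covered += cur_end - cur_start + 1
                cur_start, cur_end = start, end
        covered += cur_end - cur_start + 1
    return 100.0 * covered / genome_size
```

So a value of 10 means that the gene features together span about one tenth of the assembly length.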
The * characters at the end of your protein sequences are stop codons. MAKER includes the stop codon within the CDS. You can get rid of them with the --cfs option when running the script (it stands for Clean Final Stop).
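If you already have a protein FASTA file and do not want to re-run the extraction, the same clean-up can be sketched in a few lines of Python. This is only an illustration, not what the GAAS script does internally: it removes a single trailing * from each record and leaves any internal * untouched. The function names are made up.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line, []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)

def strip_final_stop(lines):
    """Yield (header, sequence) pairs with one trailing '*' removed, if present."""
    for header, seq in read_fasta(lines):
        if seq.endswith("*"):
            seq = seq[:-1]
        yield header, seq
```

An internal * is worth keeping visible, since it usually points to a premature stop codon in the gene model rather than the normal final stop.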