Greetings,
I have been using RAST for bacterial genome annotation for about a year. In that time I've only worked with a few genomes, but now I am working with a much larger set: 108 bacterial genomes. Is there any way to submit all 108 genomes (GenBank files) to the RAST server at the same time? Submitting them one by one would be very time consuming, and I would do that only if there is really no other option. The downstream program (PanWeb) will be used for pan-genome analysis, and it requires EMBL files standardized by RAST as input. Do you have any suggestions on how I can process these files? Maybe another annotation server, or another strategy?
Many thanks!
Alec
Thanks for your reply! I've been reading about the command-line option of RAST, and it would certainly make the process easier if I use it. If I were to switch to Prokka, would you recommend any software for pan-genome analysis? I'd need to test it, but I'm not sure that PanWeb would accept Prokka's output files, although it does accept gbk.
The program Roary is designed to take the output of Prokka for several genomes and produce core genome/accessory genome analyses. I can’t remember off the top of my head whether it gives you back the pangenome specifically, but if not, that information should be extractable from its output regardless (there are other tools that escape me at the moment that should do it too).
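To make the Prokka-then-Roary route concrete, here is a minimal sketch of how the batch could be scripted. It only builds the command lines and does not run anything. The assumptions are mine, not from the thread: the genomes are available as FASTA files (Prokka takes FASTA contigs, so GenBank files would need converting first), the folder names `genomes` and `annotation` are hypothetical, and the helper `build_commands` is just an illustrative name. Whether PanWeb accepts the resulting files would still need testing.

```python
from pathlib import Path


def build_commands(fasta_dir, out_dir, cpus=4):
    """Build Prokka and Roary command lines without executing them.

    Returns (prokka_cmds, roary_cmd), each command being a list of arguments.
    Assumes one genome per *.fasta file in fasta_dir (hypothetical layout).
    """
    fasta_dir, out_dir = Path(fasta_dir), Path(out_dir)
    prokka_cmds, gffs = [], []
    for fasta in sorted(fasta_dir.glob("*.fasta")):
        sample = fasta.stem
        sample_dir = out_dir / sample
        # One Prokka run per genome; --prefix names the output files.
        prokka_cmds.append([
            "prokka", "--outdir", str(sample_dir), "--prefix", sample,
            "--cpus", str(cpus), str(fasta),
        ])
        # Prokka writes <prefix>.gff into its output directory.
        gffs.append(str(sample_dir / f"{sample}.gff"))
    # Roary takes all GFF files at once; -f is the output dir, -p the threads.
    roary_cmd = ["roary", "-f", str(out_dir / "roary"), "-p", str(cpus), *gffs]
    return prokka_cmds, roary_cmd
```

Calling `build_commands("genomes", "annotation")` returns the argument lists, which could then be passed one at a time to `subprocess.run(cmd, check=True)` once you are happy with them.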
PGAP is best for pangenome analysis, and RASTtk can certainly process several genomes at once. Prokka, like RASTtk, is also among the best for annotation. All the best.
I’ve not used PGAP, but it is older and less highly cited than Roary. I’d want to see some specific performance indicators to suggest PGAP is “best”.