22 months ago
greed
Hi there! I recently ran Roary on some bacterial strains using GFF annotations produced by Prokka. When I examined the outputs, I noticed that some gene IDs present in the Prokka GFF annotation are missing from the "gene_presence_absence.csv" file produced by Roary. How is that possible? Thank you.
What kind of genes are you missing? As far as I know, Roary only works with protein-coding genes.
In the Prokka annotation, the ID is a "hypothetical protein"
That is the gene product name, not the gene ID.
By the way, I had a similar issue with a different tool (Anvi'o). It turned out that truncated genes and genes spanning multiple contigs were not included in the pan-genome analysis. I would manually check some of the missing genes and try to understand why they were left out of the pan-genome.
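If it helps with the manual check, here is a minimal Python sketch for listing which CDS IDs from one Prokka GFF never appear in that sample's column of the Roary table. It rests on a few assumptions: the sample column in the CSV is named after the GFF file, cells holding several IDs separate them with whitespace (tabs), and the function names (`cds_lengths_from_gff`, `ids_in_roary_column`, `missing_from_roary`) are made up for this example. It also reports each missing gene's length, since very short or truncated genes are the first thing I would look at.

```python
import csv
import io
import re


def cds_lengths_from_gff(gff_text):
    """Map each CDS feature's ID attribute to its length in nucleotides."""
    lengths = {}
    for line in gff_text.splitlines():
        if line.startswith("##FASTA"):
            break  # Prokka appends the contig sequences after this marker
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 9 or cols[2] != "CDS":
            continue
        match = re.search(r"(?:^|;)ID=([^;]+)", cols[8])
        if match:
            lengths[match.group(1)] = int(cols[4]) - int(cols[3]) + 1
    return lengths


def ids_in_roary_column(csv_text, sample):
    """Collect every ID listed in one sample column of the presence/absence table."""
    found = set()
    for row in csv.DictReader(io.StringIO(csv_text)):
        cell = row.get(sample) or ""
        # Assumption: multiple IDs in one cell are whitespace (tab) separated.
        found.update(cell.split())
    return found


def missing_from_roary(gff_text, csv_text, sample):
    """Return {gene_id: length} for CDS IDs in the GFF that the table never lists."""
    found = ids_in_roary_column(csv_text, sample)
    return {gene_id: length
            for gene_id, length in cds_lengths_from_gff(gff_text).items()
            if gene_id not in found}
```

To use it, read both files into strings (for example with `open(path).read()`) and call `missing_from_roary(gff_text, csv_text, "your_sample_name")`. Sorting the result by length, and then checking whether the missing genes sit at contig ends, should show fairly quickly whether they are the truncated or fragmented genes I mentioned.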
I'm sorry I couldn't be more helpful
Thank you, you've been helpful.