Gene Set Enrichment Analysis of RNA-seq data based on prokka annotated genomes
13 months ago
viktorht • 0

Hi All

I would like to conduct a gene set enrichment analysis of RNA-seq data. The experimental setup: there are three bacteria of interest (let's call them A, B and C). RNA-seq data was obtained from monocultures of A, B and C and from the tri-culture ABC.

The genomes of all three bacteria have been sequenced and annotated via prokka. The RNA-seq, genome sequencing and annotation were all done by an external company, which also provided a differential gene expression analysis.

My problem is that I cannot find a good way to group the genes into gene sets using the given annotations. Does anyone here have some tips? Preferably I would like to use gene sets based on KEGG pathways, but GO terms or others could do as well. Below I have included an example from my annotations file (.tsv).

locus_tag       ftype  length_bp  gene    EC_number  COG      product
LFFBCOMC_00027  CDS    369                                    hypothetical protein
LFFBCOMC_00028  CDS    1488       feaB_1             COG1012  Phenylacetaldehyde dehydrogenase
LFFBCOMC_00029  CDS    1233                                   Outer membrane porin protein 32

Only one of the three strains is currently found in the KEGG database.

Thanks in advance!

RNA-Seq GSEA prokka COG • 343 views
13 months ago
Asaf 8.6k

Prokka will not give you a comprehensive KO mapping. You can run eggnog-mapper to assign your genes to orthologous groups and to annotate them with GO terms and KEGG identifiers (KOs and pathways).
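
Once eggnog-mapper has run, its annotations table can be turned into gene sets with a few lines of Python. The following is only a minimal sketch, assuming the tab-separated `.emapper.annotations` layout of eggnog-mapper 2.1 (comment lines starting with `##`, a header line starting with `#query`, `-` for missing values, comma-separated entries in the `KEGG_Pathway` column). The function name `pathway_gene_sets` and the example rows are made up for illustration; the annotation values are not real output for these genes.

```python
import csv
from collections import defaultdict


def pathway_gene_sets(lines, column="KEGG_Pathway", prefix="ko"):
    """Map each term in `column` to the set of genes annotated with it.

    `lines` is any iterable of text lines, e.g. an open file handle.
    Column names are assumed to follow eggnog-mapper 2.1 output.
    """
    # Drop '##' comment lines; the '#query' header line is kept.
    rows = (ln for ln in lines if not ln.startswith("##"))
    reader = csv.DictReader(rows, delimiter="\t")
    gene_sets = defaultdict(set)
    for row in reader:
        gene = row["#query"]
        value = row.get(column) or "-"
        if value == "-":  # gene has no annotation in this column
            continue
        for term in value.split(","):
            # Keep e.g. 'ko00360' and skip the duplicate 'map00360' entry.
            if term.startswith(prefix):
                gene_sets[term].add(gene)
    return dict(gene_sets)


# Hypothetical input, reduced to three columns for readability.
example = [
    "## invented example, not real eggnog-mapper output\n",
    "#query\tKEGG_ko\tKEGG_Pathway\n",
    "LFFBCOMC_00027\t-\t-\n",
    "LFFBCOMC_00028\tko:K00146\tko00360,map00360\n",
    "LFFBCOMC_00029\t-\t-\n",
]
sets = pathway_gene_sets(example)
print(sets)
```

For a real run you would pass an open file handle instead of the list, and `column="GOs", prefix="GO:"` should give GO-based sets under the same assumptions. If eggnog-mapper is run on the protein FASTA that prokka writes, the query IDs should be the prokka locus tags, so the resulting sets can be matched directly against the differential expression table.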

Thank you! I've looked into it, and eggnog mapper seems to be just what I need.


