Best practices to perform a gene set enrichment analysis (GSEA) with E. coli?
3.4 years ago

I am working with several E. coli samples. For each one I have a list of hundreds of differentially expressed genes obtained with DESeq2. Besides the log2 fold change and p-value, I also have each gene's expression in TPM and FPKM, as well as its annotated GO terms.

With these data I would like to perform a gene set enrichment analysis, once per sample. Hence, I need something that works from the command line and accepts E. coli gene names. I would much prefer it if the expression levels could be used to rank the genes.

The tools I know are:

  • GSEA
  • DAVID
  • GREAT
  • Enrichr
  • GOrilla
  • SetRank

I am basically a neophyte with all of these tools. I know that some of them are made mostly for human or mouse datasets, and that the reliability of some others (e.g. DAVID) is debatable.

All the posts I found on BioStars about this are either not centred on E. coli or quite old (4-5 years). Hence, how would you suggest I proceed?

Tags: GSEA, Escherichia coli, GO terms, gene set, RNA-Seq

Based on the tools you listed, it sounds like you are looking for a web-based tool that does not require any coding? You can check a few additional options here. Most of those support only specific species.

If you are comfortable with some coding, there are many R-based tools that are species-agnostic and will allow you to use arbitrary gene sets (pathways) as input. In that case, you actually have at least two separate questions: "where can I find E. coli gene sets" and "how do I input them into a particular tool".
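
On the second question, many enrichment tools (GSEA among them) read gene sets in the tab-separated GMT format: one set per line, with the set name, a description, and then the member genes. A minimal sketch of building such lines from a gene-to-GO table, assuming the annotation is already loaded as a dictionary. The function names and the toy annotation below are hypothetical and not part of any tool mentioned here:

```python
from collections import defaultdict


def invert_annotations(gene_to_terms):
    """Turn {gene: [GO terms]} into {GO term: sorted list of genes}."""
    term_to_genes = defaultdict(set)
    for gene, terms in gene_to_terms.items():
        for term in terms:
            term_to_genes[term].add(gene)
    return {term: sorted(genes) for term, genes in term_to_genes.items()}


def to_gmt_lines(term_to_genes, descriptions=None, min_size=1):
    """Format gene sets as GMT lines: name <TAB> description <TAB> gene1 <TAB> gene2 ..."""
    descriptions = descriptions or {}
    lines = []
    for term in sorted(term_to_genes):
        genes = term_to_genes[term]
        if len(genes) < min_size:
            continue  # skip sets too small to be informative
        lines.append("\t".join([term, descriptions.get(term, "NA"), *genes]))
    return lines


# Toy annotation for illustration only (an assumption, not curated data);
# replace it with the gene-to-GO table you already have.
gene_to_terms = {
    "lacZ": ["GO:0005975"],
    "lacY": ["GO:0005975", "GO:0055085"],
    "araB": ["GO:0005975"],
    "ompF": ["GO:0055085"],
}

gmt_lines = to_gmt_lines(invert_annotations(gene_to_terms))
print("\n".join(gmt_lines))
```

Writing `gmt_lines` to a file gives one gene-set file per annotation source that can be reused across all samples.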


Quite the opposite: I'm an experienced programmer in R/Python/Bash and I would like as much CLI as possible. In the second paragraph of my question I wrote "I would need something that works from the command line" because I am not really a web server type of guy (and I have many samples to process) :)


Thanks for clarifying! If you are looking for an R-based tool, clusterProfiler is a good place to start. It does a few different types of analysis and includes many plotting options.


Isn't clusterProfiler mostly for mouse and human? That is what I remember from the manual.


The manual has human and mouse examples. However, it can use any gene sets as input.
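
To illustrate why custom gene sets make the species irrelevant, here is a rough sketch of a plain over-representation test (a one-sided hypergeometric test with Benjamini-Hochberg adjustment) on arbitrary gene sets. It is not clusterProfiler's implementation, only the general idea, and it ignores gene ranking. The function names and the toy gene identifiers are hypothetical, and the background is assumed to be every gene that was actually tested:

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when n genes are drawn from N, of which K belong to the set."""
    total = comb(N, n)
    upper = min(K, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total


def enrich(de_genes, background, gene_sets):
    """Return (set name, overlap, set size, p-value, BH-adjusted p) sorted by p-value."""
    background = set(background)
    de = set(de_genes) & background
    results = []
    for name, members in gene_sets.items():
        members = set(members) & background  # only genes that could be detected
        overlap = len(de & members)
        p = hypergeom_sf(overlap, len(background), len(members), len(de))
        results.append((name, overlap, len(members), p))
    results.sort(key=lambda r: r[3])

    # Benjamini-Hochberg: walk from the largest p-value down, keeping a running minimum.
    m = len(results)
    adjusted = []
    running_min = 1.0
    for rank in range(m, 0, -1):
        name, overlap, size, p = results[rank - 1]
        running_min = min(running_min, p * m / rank)
        adjusted.append((name, overlap, size, p, running_min))
    return adjusted[::-1]


# Toy data for illustration only (assumed identifiers, not real E. coli genes).
background = [f"g{i}" for i in range(1, 21)]
gene_sets = {
    "setA": ["g1", "g2", "g3", "g4", "g5"],
    "setB": [f"g{i}" for i in range(6, 16)],
}
de_genes = ["g1", "g2", "g3", "g4", "g16"]

results = enrich(de_genes, background, gene_sets)
for row in results:
    print(row)
```

The only organism-specific inputs are the gene identifiers and the gene sets themselves, so the same approach applies to E. coli as long as both use the same naming scheme.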
