Question: Best practices to perform a gene set enrichment analysis (GSEA) with E. coli?
Asked 7 weeks ago by Macspider (3.3k), Vienna - BOKU

I am working with several E. coli samples. For each one I have a list of hundreds of differentially expressed genes from DESeq2. Besides the log2 fold change and p-value, I also have each gene's expression in TPM and FPKM, as well as its annotated GO terms.

With all these data I would like to perform a gene set enrichment analysis, once per sample. I therefore need something that works from the command line and accepts E. coli gene names. I would much prefer a tool that can use the expression levels to rank genes.

The tools I know are:

  • GSEA
  • DAVID
  • GREAT
  • Enrichr
  • GOrilla
  • SetRank

I am basically a neophyte with all these tools. I know that some of them are mostly made for human or mouse datasets, and that the reliability of some others (e.g. DAVID) is debated.

All the posts I found on BioStars about this are either not centred on E. coli or quite old (4-5 years). How would you suggest I proceed?

Modified 12 days ago by Biostar ♦♦ (20) • written 7 weeks ago by Macspider (3.3k)

Based on the tools you listed, it sounds like you are looking for a web-based tool that does not require any coding? You can check a few additional options here. Most of those support only specific species.

If you are comfortable with some coding, there are many R-based tools that are species-agnostic and will allow you to use arbitrary gene sets (pathways) as input. In that case, you actually have at least two separate questions: "where can I find E. coli gene sets" and "how do I input them into a particular tool".
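For the second of those questions, many enrichment tools read gene sets in the tab-separated GMT format: one set per line, with the set name, a description, and then the member genes. Below is a minimal Python sketch, assuming the per-gene GO annotation is already loaded as a dictionary. The helper names `invert_annotation` and `to_gmt_lines`, the `min_size` cutoff, and the toy annotation are illustrative assumptions, not part of any existing tool.

```python
from collections import defaultdict

def invert_annotation(gene_to_terms):
    """Turn {gene: [GO terms]} into {GO term: sorted list of genes}."""
    term_to_genes = defaultdict(set)
    for gene, terms in gene_to_terms.items():
        for term in terms:
            term_to_genes[term].add(gene)
    return {term: sorted(genes) for term, genes in term_to_genes.items()}

def to_gmt_lines(term_to_genes, min_size=2):
    """Format gene sets as GMT lines: name, description, genes (tab-separated)."""
    lines = []
    for term in sorted(term_to_genes):
        genes = term_to_genes[term]
        if len(genes) >= min_size:  # skip sets too small to test meaningfully
            # "na" is a placeholder for the description column
            lines.append("\t".join([term, "na"] + genes))
    return lines

# Toy annotation for illustration only; replace with your real gene-to-GO table
gene_to_terms = {
    "lacZ": ["GO:0005975"],
    "lacY": ["GO:0005975", "GO:0055085"],
    "araE": ["GO:0055085"],
    "recA": ["GO:0006281"],
}

gmt_lines = to_gmt_lines(invert_annotation(gene_to_terms))
print("\n".join(gmt_lines))
```

Writing `gmt_lines` to a file gives one gene set collection that can be reused for every sample, so only the gene list or ranking changes between runs.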

Written 7 weeks ago by igor (12k)

Quite the opposite: I'm an experienced programmer in R/Python/bash and I would like to stay on the command line as much as possible. In the second paragraph of my question I say that I need something that works from the command line, because I am not really a web server type of guy (and I have many samples to process) :)

Modified 7 weeks ago • written 7 weeks ago by Macspider (3.3k)

Thanks for clarifying! If you are looking for an R-based tool, clusterProfiler is a good place to start. It does a few different types of analysis and includes many plotting options.

Written 7 weeks ago by igor (12k)

Isn't clusterProfiler mostly for mouse and human? That is what I remember from its manual.

Written 7 weeks ago by Macspider (3.3k)

The manual has human and mouse examples. However, it can use any gene sets as input.
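To illustrate what "any gene sets as input" means in practice, here is a minimal over-representation test in plain Python: a one-sided hypergeometric test of one gene set against a list of differentially expressed genes. This is a sketch of the general technique and not clusterProfiler's own code; the function name `hypergeom_enrichment_p` and the gene identifiers are made up for illustration.

```python
from math import comb

def hypergeom_enrichment_p(de_genes, gene_set, background):
    """Return (overlap, p-value) for a one-sided hypergeometric test.

    The p-value is P(overlap >= observed) when len(de_genes) genes are
    drawn at random from the background.
    """
    background = set(background)
    de = set(de_genes) & background   # ignore genes outside the background
    gs = set(gene_set) & background
    N, K, n = len(background), len(gs), len(de)
    k = len(de & gs)
    # Sum the upper tail of the hypergeometric distribution
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return k, tail / comb(N, n)

# Toy data: 20 background genes, a 5-gene set, 6 DE genes (4 of them in the set)
background = [f"g{i}" for i in range(1, 21)]
gene_set = ["g1", "g2", "g3", "g4", "g5"]
de_genes = ["g1", "g2", "g3", "g4", "g10", "g11"]

overlap, p_value = hypergeom_enrichment_p(de_genes, gene_set, background)
print(overlap, p_value)
```

In a real analysis the background should be all genes that were tested for differential expression, and the p-values across gene sets need a multiple-testing correction such as Benjamini-Hochberg.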

Written 7 weeks ago by igor (12k)