Python implementation for enrichment of terms on already annotated genes
5.8 years ago
Benni ▴ 30

I have a matrix like this:

  1. Column = Genes
  2. Column = Cluster gene belongs to
  3. Column = Annotation1
  4. Column = Annotation2
  5. Column = Annotation3

A new matrix would look like this:

  1. Column = Cluster
  2. Column = Annotation terms and their p-values.

The analysis checks one annotation column and one term at a time. The background is all rows (all genes), and enrichment is tested for the genes that share a cluster.

I need a Python module that takes those two datasets and calculates the p-value (with Benjamini correction). Is there such a module, and which method should I use for the enrichment analysis?

For over-representation (enrichment) analysis of clusters using KEGG, I use functions from the clusterProfiler package. I am pretty sure it would be easier to adapt this to your needs than to write it out in Python. There is also a similar function in the limma package for simple over-representation testing.
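If you do want to stay in Python, here is a minimal sketch of the kind of test described in the question. It is not a published package. It assumes a one-sided hypergeometric test (over-representation) with Benjamini-Hochberg correction, and that each gene carries at most one term per annotation column. The function names (`hypergeom_sf`, `benjamini_hochberg`, `enrich`) and the row layout (dicts with `gene`, `cluster` and an annotation key) are made up for illustration.

```python
from collections import defaultdict
from math import comb


def hypergeom_sf(k, M, n, N):
    """P(X >= k): chance of seeing at least k annotated genes in a cluster
    of size N, drawn from M background genes of which n carry the term."""
    total = comb(M, N)
    upper = min(n, N)
    return sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1)) / total


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def enrich(rows, column):
    """Test every (cluster, term) pair observed in one annotation column.

    rows   : list of dicts, one per gene (hypothetical layout)
    column : name of the annotation column to test
    Only pairs that occur at least once are tested (an assumption of this sketch).
    """
    background = len(rows)            # all genes are the background
    cluster_size = defaultdict(int)
    term_total = defaultdict(int)
    hits = defaultdict(int)
    for row in rows:
        cluster_size[row["cluster"]] += 1
        term = row[column]
        if term:                      # skip genes with no annotation here
            term_total[term] += 1
            hits[(row["cluster"], term)] += 1
    tests = sorted(hits)
    pvals = [
        hypergeom_sf(hits[(c, t)], background, term_total[t], cluster_size[c])
        for c, t in tests
    ]
    padj = benjamini_hochberg(pvals)
    return [(c, t, p, q) for (c, t), p, q in zip(tests, pvals, padj)]


# Toy data: 8 genes, 2 clusters, one annotation column.
rows = [
    {"gene": f"g{i}", "cluster": cl, "Annotation1": term}
    for i, (cl, term) in enumerate(
        [("A", "kinase")] * 3 + [("A", "transporter")]
        + [("B", "transporter")] * 3 + [("B", "kinase")],
        start=1,
    )
]
results = enrich(rows, "Annotation1")
for cluster, term, p, q in results:
    print(cluster, term, round(p, 4), round(q, 4))
```

Run it once per annotation column. For real data sizes, `scipy.stats.hypergeom.sf(k - 1, M, n, N)` gives the same tail probability, and `statsmodels.stats.multitest.multipletests(pvals, method="fdr_bh")` does the same correction. Whether to correct per column or across all columns together is a choice the sketch leaves to you.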


