Hello all,

This is a very general inquiry, just to brainstorm some ideas. I'm trying to create what I would like to call a "Prioritization Matrix". The idea is to give each gene cluster a score reflecting how interesting it is (I am aware that this is very subjective and depends on what the user wants), where a higher score means higher priority.

In this case I have a matrix with something like this:

Gene Cluster (GC) | Gene Cluster Family (GCF) | GC Type | Associated Product | Priority Score*

Example_GC1 | GCF134 | Terpene | Bacteriocin | 8

Example_GC2 | GCF134 | Saccharide | Bacteriocin | 8

Example_GC5 | GCF145 | Other | Penicillin | 5

  • Gene Cluster: Biosynthetic Gene Cluster
  • GCF: A group of gene clusters that have similar domains
  • GC Type: Can vary from polyketides to terpenes, etc.
  • Associated Product: Product known to be associated with the gene cluster
  • Priority Score: How interested I am in investigating this gene cluster

I want to determine the priority score of each gene cluster based on 4 main metrics:

  1. Characterized or not characterized by LCMS (I have a database for this)
  2. Data source: whether or not it is present in NCBI
  3. BGC class type: I will give a higher score to the first three classes I am interested in searching. The score can vary according to interest
  4. Similar domains or not: whether the clusters are part of the same GCF
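
To make the four metrics concrete, here is a rough sketch in Python of one way they could be combined into a single additive score. Everything numeric here is an assumption for illustration: the weights, the class scores, the choice to reward clusters that are not yet characterized, and the LCMS/NCBI flags in the example rows are all hypothetical and are not meant to reproduce the scores in the table above.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical weights and class scores: tune them to your own interests.
WEIGHT_UNCHARACTERIZED = 3  # assumption: not yet characterized by LCMS = more interesting
WEIGHT_IN_NCBI = 1          # assumption: presence in NCBI adds a little
WEIGHT_SHARED_GCF = 1       # assumption: sharing a GCF with another cluster adds a little
CLASS_SCORES = {"Terpene": 3, "Saccharide": 2, "Polyketide": 1}  # unlisted classes score 0


@dataclass
class GeneCluster:
    name: str
    gcf: str
    gc_type: str
    characterized_by_lcms: bool
    in_ncbi: bool


def priority_score(gc, gcf_sizes):
    """Sum the four metric contributions for one gene cluster."""
    score = 0
    if not gc.characterized_by_lcms:
        score += WEIGHT_UNCHARACTERIZED
    if gc.in_ncbi:
        score += WEIGHT_IN_NCBI
    score += CLASS_SCORES.get(gc.gc_type, 0)
    if gcf_sizes[gc.gcf] > 1:  # another cluster belongs to the same GCF
        score += WEIGHT_SHARED_GCF
    return score


# Illustrative rows only; the LCMS and NCBI flags are made up.
clusters = [
    GeneCluster("Example_GC1", "GCF134", "Terpene", False, True),
    GeneCluster("Example_GC2", "GCF134", "Saccharide", False, True),
    GeneCluster("Example_GC5", "GCF145", "Other", True, True),
]
gcf_sizes = Counter(gc.gcf for gc in clusters)
ranked = sorted(clusters, key=lambda gc: priority_score(gc, gcf_sizes), reverse=True)
for gc in ranked:
    print(gc.name, gc.gcf, gc.gc_type, priority_score(gc, gcf_sizes))
```

Keeping the weights and class scores as plain constants means each user can change them without touching the scoring logic, which seems to fit how subjective the priorities are.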

If anyone has any particular ideas or suggestions, that would be great. :)

Think this has been done -


