Question: Cross referencing biosynthetic gene clusters in metagenome, best way to do this?
12 months ago by
robert.murphy30 wrote:

I have around 20 actinobacteria genomes of decent quality, isolated from the gut of species X, that I am mining for biosynthetic gene clusters (BGCs) using antiSMASH and PRISM. I want to cross-reference the BGCs I find with a large but highly fragmented, low-quality metagenomic dataset that we have, also from the gut of species X. Because of the fragmentation, the BGCs will not exist in full within the metagenome. What would be the best way to search the metagenome for the BGCs I find in the actinobacteria genomes?

I am thinking that using conserved regions of the BGCs would be best, but how do I determine a conserved region? Or would just searching for key genes of the BGC be a better approach?

Any advice would be awesome :) thank you!

12 months ago by
Mensur Dlakic8.3k wrote:

I don't know if this is the best way to do what you ask, but it is a way.

  • Predict genes from your metagenomic contigs (Prodigal does this easily).
  • Create alignments for all BGC genes of interest and convert them to hidden Markov models (HMMs). HMMs for some or all of your genes may already be available in databases such as Pfam, TIGRFAMs, SMART, COG or KOG, in which case you can use them directly rather than building custom alignments.
  • Search with these HMMs individually against your predicted genes, or concatenate the HMMs into a small database and compare your genes to it. If you are familiar with genome annotation using Pfam, this is the same procedure but with a smaller, custom database of HMMs.
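
The three steps above could be strung together roughly as follows. This is only a sketch under stated assumptions, not a tested pipeline: it assumes Prodigal, MAFFT and HMMER3 are installed, that each input FASTA holds protein homologs of one BGC gene, and that MAFFT is an acceptable aligner (the answer does not name one). The function name, file names, working directory and E-value cutoff are all hypothetical. The function only builds the shell commands and does not run anything.

```python
import shlex
from pathlib import PurePosixPath


def q(path):
    """Quote a path for safe use in a shell command."""
    return shlex.quote(str(path))


def bgc_search_commands(contigs, gene_fastas, workdir="bgc_search", evalue="1e-5"):
    """Return shell commands (as strings) for the three steps; nothing is executed.

    contigs     -- metagenomic contigs in nucleotide FASTA (hypothetical file)
    gene_fastas -- one protein FASTA per BGC gene, each holding homologs of that gene
    """
    work = PurePosixPath(workdir)
    proteins = work / "metagenome_proteins.faa"
    custom_db = work / "bgc_genes.hmm"
    cmds = [f"mkdir -p {q(work)}"]

    # Step 1: predict genes on the contigs; -p meta is Prodigal's metagenome mode,
    # -a writes the protein translations that the HMMs will be searched against.
    cmds.append(
        f"prodigal -p meta -q -i {q(contigs)} -a {q(proteins)} "
        f"-o {q(work / 'metagenome_genes.gbk')}"
    )

    # Step 2: align the homologs of each gene and build one HMM per alignment.
    hmms = []
    for fasta in gene_fastas:
        name = PurePosixPath(fasta).stem
        aln = work / f"{name}.aln.faa"
        hmm = work / f"{name}.hmm"
        cmds.append(f"mafft --auto {q(fasta)} > {q(aln)}")
        cmds.append(f"hmmbuild -n {q(name)} {q(hmm)} {q(aln)}")
        hmms.append(hmm)

    # Step 3: concatenate the HMMs into one small custom database and search
    # the predicted proteins with it, writing a tabular hit list.
    cmds.append("cat " + " ".join(q(h) for h in hmms) + f" > {q(custom_db)}")
    cmds.append(
        f"hmmsearch --tblout {q(work / 'hits.tbl')} -E {evalue} "
        f"{q(custom_db)} {q(proteins)}"
    )
    return cmds


if __name__ == "__main__":
    for cmd in bgc_search_commands("contigs.fna", ["genes/ketosynthase.faa"]):
        print(cmd)
```

Printing the commands first makes it easy to check them by eye, or to paste them into a shell script, before committing compute time to a large metagenome.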

Would hmmscan be an adequate tool for step 3, searching the HMMs against the predicted genes?

written 12 months ago by robert.murphy30

It depends on what exactly you wish to do. This is how I understand it: hmmsearch scores one or more HMMs against a database of sequences; hmmscan scores one or more sequences against a database of HMMs; hmmpfam is the older HMMER2 program that hmmscan replaced in HMMER3. This post may help you decide.
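
One practical consequence shows up in the tabular output (`--tblout`): hmmsearch reports the sequence in the target column and the HMM in the query column, while hmmscan reports them the other way round. Here is a minimal sketch of a parser that normalises both, assuming HMMER3's whitespace-delimited `--tblout` layout (target name, accession, query name, accession, full-sequence E-value, score, ...). The `Hit` tuple, the function name and the sample lines are made up for illustration and are not real results.

```python
from collections import namedtuple

# Hypothetical record type: which predicted protein matched which HMM.
Hit = namedtuple("Hit", "sequence hmm evalue score")


def parse_tblout(lines, program="hmmsearch"):
    """Yield Hit records from HMMER3 --tblout lines.

    hmmsearch: target column = sequence, query column = HMM
    hmmscan:   target column = HMM,      query column = sequence
    """
    if program not in ("hmmsearch", "hmmscan"):
        raise ValueError(f"unsupported program: {program}")
    for line in lines:
        # Skip blank lines and the '#' header/footer comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        target, query = fields[0], fields[2]
        evalue, score = float(fields[4]), float(fields[5])  # full-sequence values
        if program == "hmmsearch":
            yield Hit(target, query, evalue, score)
        else:
            yield Hit(query, target, evalue, score)


if __name__ == "__main__":
    # Made-up example lines, for illustration only.
    search_line = "contig_12_3 - ketosynthase - 1.2e-45 150.3 0.1 1.5e-45 150.0 0.1 1.0 1 0 0 1 1 1 1 -"
    scan_line = "ketosynthase - contig_12_3 - 1.2e-45 150.3 0.1 1.5e-45 150.0 0.1 1.0 1 0 0 1 1 1 1 -"
    print(list(parse_tblout([search_line], "hmmsearch")))
    print(list(parse_tblout([scan_line], "hmmscan")))
```

Either program can therefore feed the same downstream step, for example collecting which metagenomic proteins hit which BGC gene model.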

written 12 months ago by Mensur Dlakic8.3k

How would I go about making an alignment of a biosynthetic gene cluster and then building an HMM from it? Or, if you mean only specific genes inside the BGC, how should I pick which genes to use? I am also unsure what you mean by making alignments.

written 11 months ago by robert.murphy30