Question: Cross referencing biosynthetic gene clusters in metagenome, best way to do this?
8 weeks ago by
robert.murphy10 wrote:

I have around 20 actinobacteria genomes of decent quality, isolated from the gut of species X, that I am mining for biosynthetic gene clusters (BGCs) using antiSMASH and PRISM. I want to cross-reference the BGCs I find with a large but highly fragmented, low-quality metagenomic dataset that we have, also from the gut of species X. Because of the fragmentation, the BGCs will not exist in full within the metagenome. What would be the best way to search the metagenome for the BGCs I find in the actinobacteria genomes?

I am thinking that using conserved regions of the BGCs would be best, but how do I determine a conserved region? Or would just searching for key genes of the BGC be a better approach?

Any advice would be awesome :) thank you!

8 weeks ago by
Mensur Dlakic4.0k wrote:

I don't know if this is the best way to do what you ask, but it is a way.

  • Predict genes from your metagenomic contigs (Prodigal does this easily)
  • Create an alignment for each BGC gene of interest, and convert each alignment to a hidden Markov model (HMM). Some or all HMMs for your genes may already be available in databases such as Pfam, TIGRFAMs, SMART, COG or KOG, and in such cases you could use them directly rather than creating custom alignments
  • Search with these HMMs individually against your predicted genes, or concatenate the HMMs into a small database and compare your genes to it. If you are familiar with genome annotation using Pfam, this would be the same but with a smaller, custom database of HMMs
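
The three steps above could be sketched roughly as follows. This is only an illustration under several assumptions: Prodigal, MAFFT and HMMER 3 are installed and on the PATH, MAFFT is used for the alignment step (the answer does not name an aligner), and each input FASTA holds protein sequences of one gene family collected from the isolate genomes. The file names, the `bgc_search` directory and the E-value cutoff are hypothetical placeholders.

```python
import subprocess
from pathlib import Path


def plan_commands(contigs, gene_fastas, workdir="bgc_search", evalue="1e-5"):
    """Build (argv, stdout_path) pairs for the three steps; nothing is run here."""
    work = Path(workdir)
    proteins = work / "metagenome_proteins.faa"

    # Step 1: predict genes on the contigs with Prodigal in metagenome mode.
    steps = [(["prodigal", "-i", str(contigs), "-a", str(proteins),
               "-p", "meta", "-o", str(work / "genes.gbk")], None)]

    for fasta in gene_fastas:
        name = Path(fasta).stem
        aln = work / f"{name}.aln.fasta"
        hmm = work / f"{name}.hmm"
        # Step 2: align the homologs (MAFFT writes to stdout), then build an HMM.
        steps.append((["mafft", "--auto", str(fasta)], aln))
        steps.append((["hmmbuild", str(hmm), str(aln)], None))
        # Step 3: search the HMM against the predicted metagenome proteins.
        steps.append((["hmmsearch", "--tblout", str(work / f"{name}.tbl"),
                       "-E", evalue, str(hmm), str(proteins)], None))
    return steps


def run(steps, workdir="bgc_search"):
    """Execute the planned commands in order, stopping at the first failure."""
    Path(workdir).mkdir(exist_ok=True)
    for argv, stdout_path in steps:
        if stdout_path is None:
            subprocess.run(argv, check=True)
        else:
            with open(stdout_path, "w") as out:
                subprocess.run(argv, check=True, stdout=out)


if __name__ == "__main__":
    # Hypothetical inputs: one FASTA of homologous proteins per gene of interest.
    for argv, _ in plan_commands("contigs.fna", ["ks_domain.fasta"]):
        print(" ".join(argv))
```

Printing the planned commands first, as the `__main__` block does, makes it easy to check paths and options before calling `run`.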

Would hmmscan be an adequate tool for step 3, searching the HMMs against the predicted genes?

written 8 weeks ago by robert.murphy10

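It depends on what exactly you wish to do. This is how I understand it: hmmsearch searches one or more HMMs against a database of sequences; hmmscan searches one or more query sequences against a database of HMMs; hmmpfam is the older HMMER2 program that hmmscan replaced in HMMER3. With a handful of custom HMMs and a large set of predicted genes, hmmsearch is usually the more natural fit. This post may help you decide.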
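
Whichever program is used, the `--tblout` table is the easiest output to post-process. Below is a minimal sketch of a parser for that table, assuming the HMMER 3 layout in which the first column is the target name, the third is the query name, and the fifth and sixth are the full-sequence E-value and bit score. The function name, the cutoff and the example hit are made up for illustration. Note that with hmmsearch the target is the protein and the query is the HMM, while with hmmscan the roles are swapped.

```python
def parse_tblout(text, max_evalue=1e-5):
    """Return hits from HMMER --tblout text that pass an E-value cutoff."""
    hits = []
    for line in text.splitlines():
        # Header and footer lines start with '#'; skip them and blank lines.
        if not line.strip() or line.startswith("#"):
            continue
        # 18 whitespace-separated columns, then a free-text description.
        fields = line.split(None, 18)
        evalue = float(fields[4])
        if evalue <= max_evalue:
            hits.append({
                "target": fields[0],   # protein ID when produced by hmmsearch
                "query": fields[2],    # HMM name when produced by hmmsearch
                "evalue": evalue,
                "score": float(fields[5]),
            })
    return hits


# Made-up example rows in hmmsearch --tblout layout.
example = (
    "# target name  accession  query name  accession  E-value  score  bias ...\n"
    "contig_12_3 - KS_domain - 1.2e-40 140.2 0.1 1.5e-40 139.9 0.1 "
    "1.0 1 0 0 1 1 1 1 # 2 # 400 # 1\n"
    "contig_99_1 - KS_domain - 0.5 10.1 0.0 0.7 9.8 0.0 "
    "1.0 1 0 0 1 1 0 0 # 5 # 250 # -1\n"
)
strong_hits = parse_tblout(example)
```

Grouping the surviving hits by contig would then show which metagenome fragments carry more than one gene from the same cluster.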

written 8 weeks ago by Mensur Dlakic4.0k

How would I go about making alignments of a biosynthetic gene cluster and then making an HMM? Or, if you mean only specific genes inside the BGC, how should I pick the genes to use? Also, I am unsure what you mean by making alignments.

written 7 weeks ago by robert.murphy10