Cross-referencing biosynthetic gene clusters in a metagenome: best way to do this?
4.2 years ago

I have around 20 actinobacteria genomes of decent quality, isolated from the gut of species X, that I am mining for biosynthetic gene clusters (BGCs) using antiSMASH and PRISM. I want to cross-reference the BGCs I find with a large but highly fragmented, low-quality metagenomic dataset that we have, also from the gut of species X. Because of the fragmentation, the BGCs will not exist in full within the metagenome. What would be the best way to search the metagenome for the BGCs I find in the actinobacteria genomes?

I am thinking that using conserved regions of the BGCs would be best, but how do I determine a conserved region? Or would just searching for key genes of each BGC be a better approach?

Any advice would be awesome :) thank you!

genome • 905 views
4.2 years ago
Mensur Dlakic ★ 27k

I don't know if this is the best way to do what you ask, but it is a way.

  • Predict genes from your metagenomic contigs (Prodigal does this easily).
  • Create alignments for all BGC genes of interest and convert them to hidden Markov models (HMMs). Some or all HMMs for your genes may already be available in HMM databases such as Pfam, TIGRFAMs, SMART, COG or KOG, and in such cases you could use them directly rather than creating custom alignments.
  • Search with these HMMs individually against your predicted genes, or concatenate the HMMs into a small database and compare your genes to it. If you are familiar with genome annotation using Pfam, this is the same procedure but with a smaller, custom database of HMMs.
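
The three steps above could be laid out roughly as follows. This is only a sketch that builds the command lines as strings and does not run anything. The file names (`metagenome_proteins.faa`, `bgc_genes.hmm`, `bgc_hits.tbl`), the helper name `bgc_search_commands`, and the E-value cutoff of 1e-5 are assumptions for illustration. It also assumes you already have one protein alignment per BGC gene (for example made with an aligner such as MAFFT) in a format hmmbuild can read.

```python
import shlex


def bgc_search_commands(contigs, gene_alignments, evalue="1e-5"):
    """Return the shell commands for the three steps as strings (nothing is executed).

    contigs         -- FASTA file of metagenomic contigs
    gene_alignments -- one alignment file per BGC gene of interest
    evalue          -- hypothetical reporting cutoff, adjust to taste
    """
    proteins = "metagenome_proteins.faa"   # assumed output name
    hmm_db = "bgc_genes.hmm"               # assumed name of the combined HMM file
    commands = []

    # Step 1: predict proteins on the contigs; "-p meta" is Prodigal's metagenome mode.
    commands.append(shlex.join(
        ["prodigal", "-i", contigs, "-a", proteins, "-p", "meta",
         "-o", "metagenome_genes.gbk"]))

    # Step 2: build one HMM per gene alignment (hmmbuild <hmmfile_out> <msafile>).
    hmm_files = []
    for aln in gene_alignments:
        hmm = aln.rsplit(".", 1)[0] + ".hmm"
        commands.append(shlex.join(["hmmbuild", hmm, aln]))
        hmm_files.append(hmm)

    # Step 3a: HMM files are plain text, so concatenating them gives a small database.
    commands.append(
        "cat " + " ".join(shlex.quote(h) for h in hmm_files) + " > " + hmm_db)

    # Step 3b: search all HMMs against the predicted proteins, with a tabular report.
    commands.append(shlex.join(
        ["hmmsearch", "--tblout", "bgc_hits.tbl", "-E", evalue, hmm_db, proteins]))

    return commands
```

Printing the returned list for, say, `bgc_search_commands("contigs.fa", ["ks_domain.afa", "nrps_a.afa"])` gives five commands you can check and then run by hand or paste into a shell script.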

Would hmmscan be an adequate tool for step 3, searching the HMMs against the predicted genes?


It depends on what exactly you wish to do. This is how I understand it: hmmsearch searches one or more HMMs against a sequence database; hmmscan searches one or more sequences against a database of HMMs (which must first be indexed with hmmpress); hmmpfam is the older HMMER2 counterpart of hmmscan. This post may help you decide.
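
Whichever program you pick, the tabular output is probably the easiest thing to work with. As a rough sketch, assuming you ran hmmsearch with `--tblout` and that the proteins came from Prodigal (which names them `<contig>_<gene number>`), something like the following would show which of your BGC gene HMMs hit each contig. The function name `hits_per_contig` and the 1e-5 cutoff are my own placeholders, not part of HMMER.

```python
from collections import defaultdict


def hits_per_contig(tblout_text, max_evalue=1e-5):
    """Map each contig to the set of HMM names that hit one of its proteins.

    tblout_text -- contents of a file written by `hmmsearch --tblout`
    max_evalue  -- assumed full-sequence E-value cutoff
    """
    hits = defaultdict(set)
    for line in tblout_text.splitlines():
        # Skip blank lines and the '#' header/footer lines of the table.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        # hmmsearch columns: 0 = target (protein), 2 = query (HMM),
        # 4 = full-sequence E-value.
        protein, hmm_name, evalue = fields[0], fields[2], float(fields[4])
        if evalue > max_evalue:
            continue
        # Prodigal protein IDs are "<contig>_<n>"; strip the trailing gene number.
        contig = protein.rsplit("_", 1)[0]
        hits[contig].add(hmm_name)
    return dict(hits)
```

With a fragmented metagenome, contigs where several genes from the same BGC hit together are more convincing than isolated single-gene hits, which is why the grouping is by contig rather than by protein.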


How would I go about making alignments of a biosynthetic gene cluster and then making an HMM? Or, if you mean only specific genes inside the BGC, how should I pick which genes to use? I am also unsure what you mean by making alignments.
