Question: Combining HMMs and fasta for HMMER searches
Anand Rao (United States) wrote 2.9 years ago:

I have software that translates DNA and searches the translation products for matches to profile HMMs. The idea is to search these DNA sequences for matches to HMMs in PfamA (version 30, which is the latest one).

However, for my purposes there is a better-curated set of proteins, available as FASTA sequences.

I was recently told that FASTA sequences can also be used as HMMER queries; they'll just be converted to HMMs on the fly. That is fine.

What I need help with is this: if I have one set of queries as HMMs from PfamA and another set of queries as plain FASTA sequences, is it possible to combine them into one *.hmm file in some way?

Tags: hmmer, hmm, fasta
Jean-Karim Heriche (EMBL Heidelberg, Germany) wrote 2.9 years ago:

Given an HMM and some sequences, you can use hmmalign to get a multiple sequence alignment, then build a new HMM from it with hmmbuild. In your case, you would combine the PfamA sequences with your other sequences, align them all to the PfamA HMM, then rebuild the HMM from the resulting multiple sequence alignment.
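The two steps described above can be sketched roughly as follows. This is only an illustration: the function names and file names are hypothetical, and it assumes the HMMER3 programs hmmalign and hmmbuild are on the PATH and that the input .hmm file holds a single profile (one family extracted from Pfam-A, not the whole database).

```python
import subprocess


def realign_and_rebuild_cmds(family_hmm, seqs_fasta, aln_out, new_hmm):
    """Return the two HMMER3 command lines as argument lists.

    family_hmm: file with ONE profile (assumption; hmmalign expects a single profile)
    seqs_fasta: unaligned sequences to align to that profile
    aln_out:    where hmmalign writes the alignment (Stockholm by default)
    new_hmm:    where hmmbuild writes the rebuilt profile
    """
    align = ["hmmalign", "-o", aln_out, family_hmm, seqs_fasta]
    build = ["hmmbuild", new_hmm, aln_out]
    return [align, build]


def run_all(cmds):
    """Run each command in order and stop at the first failure."""
    for cmd in cmds:
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical file names, shown only to illustrate the call order.
    for cmd in realign_and_rebuild_cmds(
        "family.hmm", "family_plus_extra.fasta", "combined.sto", "rebuilt.hmm"
    ):
        print(" ".join(cmd))
```

Calling run_all() on the returned list would execute the steps. This only makes sense when the extra sequences really belong to the family the profile describes.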


Thanks, but I don't think that is possible. I should have made it clearer that the FASTA sequences and the PfamA HMMs bear no relation to each other, so I cannot align the FASTA sequences to the HMMs with hmmalign and rebuild new, combined HMMs with hmmbuild. If you or someone else has another suggestion, I am open to ideas.

Reply written 2.9 years ago by Anand Rao

As far as I know, hmmbuild produces one profile HMM per input alignment, so merging sequences into a single profile means building one HMM from all of them. If the sequences are not related to Pfam families, then it makes sense to create separate HMMs and search your DNA sequences with these. However, you could also identify the Pfam domains represented in your proteins and add these sequences to the corresponding Pfam HMMs. I am also curious as to what "better curated" means. Pfam-A already seems to be a well-curated resource.
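For the separate-HMM route, it may help that HMMER3 text profiles can sit back to back in one file (Pfam-A.hmm itself is distributed as a multi-profile file), with each record ending in a "//" line. A minimal sketch, assuming the extra profiles have already been built with hmmbuild; the function names and file names are hypothetical:

```python
from pathlib import Path


def concat_hmm_files(hmm_paths, out_path):
    """Concatenate HMMER3 text profile files into one multi-profile file."""
    with open(out_path, "w") as out:
        for path in hmm_paths:
            text = Path(path).read_text()
            # Make sure each file ends with a newline so records do not fuse.
            out.write(text if text.endswith("\n") else text + "\n")
    return out_path


def count_profiles(hmm_path):
    """Count profiles by their '//' record terminator lines."""
    with open(hmm_path) as handle:
        return sum(1 for line in handle if line.strip() == "//")


if __name__ == "__main__":
    # Hypothetical inputs: the Pfam-A file plus profiles built from the curated set.
    combined = concat_hmm_files(["Pfam-A.hmm", "curated_set.hmm"], "combined.hmm")
    print(count_profiles(combined), "profiles in", combined)
```

If the search tool uses hmmscan, the combined file would still need to be indexed with hmmpress before use.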

Reply written 2.9 years ago by Jean-Karim Heriche

Quoting the comment above: "I am also curious as to what 'better curated' means. Pfam-A seems already a well curated resource."

For annotations of individual well-studied organisms (e.g. E. coli), Pfam/Rfam provide only a very general guess, while biologists need a specific gene name and similar details.

Reply written 5 weeks ago by predeus