Scan multiple sequences on multiple hmm profiles
12 months ago
biobiu ▴ 150

I want to "align" multiple protein sequences in a multi-FASTA file against thousands of HMM profiles (.hmm) that I've downloaded. I thought of using hmmscan.

  1. Should I do that on each profile separately? Or is there a way to work on multiple profiles and scan all-vs-all?
  2. Should I process the hmm file with hmmpress?
Hmm hmmer • 964 views
12 months ago
Mensur Dlakic ★ 27k

Because you wrote "align", I am assuming that you actually want to run a search and get pairwise alignments as a result. If you truly want to align sequences, you will need the aptly named hmmalign.

Yes, hmmscan is your tool. You need to process your HMM database with hmmpress and multiple sequences can be searched against multiple HMMs in one pass:

hmmscan -o aln_hits.out --tblout table_hits.out -E 0.01 --cpu 8 DB.hmm proteins.fa

In aln_hits.out you will get the matches of each sequence, in input order, against the whole HMM database. Some sequences may not have any matches, in which case you get [No hits detected that satisfy reporting thresholds]. The tabular output in table_hits.out is more compact because it contains only matches that satisfy the E-value threshold, so some sequences will be completely absent from it.
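Since sequences without hits simply do not appear in the tabular output, a small post-processing step can help. The following is a rough sketch, not part of HMMER itself: the function names are made up for illustration, and it assumes the usual --tblout layout (comment lines starting with #, profile name in column 1, sequence name in column 3, full-sequence E-value in column 5).

```shell
# Sketch only: helper names are hypothetical; file names follow the command above.
# Assumes the standard --tblout layout: '#' comment lines, target (profile) name in
# column 1, query (sequence) name in column 3, full-sequence E-value in column 5.

# Print the IDs of FASTA sequences that never appear as a query in the table.
seqs_without_hits() {
    # $1 = --tblout file, $2 = multi-FASTA file
    awk '
        FILENAME == ARGV[1] { if ($0 !~ /^#/) hit[$3] = 1; next }
        /^>/ { id = substr($1, 2); if (!(id in hit)) print id }
    ' "$1" "$2"
}

# Print "sequence<TAB>profile<TAB>E-value" for the lowest-E-value hit of each sequence.
best_hit_per_seq() {
    # $1 = --tblout file
    awk '
        /^#/ || NF < 5 { next }
        !($3 in best) || $5 + 0 < best[$3] + 0 { best[$3] = $5; prof[$3] = $1 }
        END { for (q in best) print q "\t" prof[q] "\t" best[q] }
    ' "$1" | sort
}
```

With the file names from the command above, this would be called as seqs_without_hits table_hits.out proteins.fa and best_hit_per_seq table_hits.out. Sequence IDs are taken as the first word after ">" in each FASTA header, which is also how HMMER names queries.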


Thank you so much, Mensur. One thing that is not clear to me is how to build one database from multiple hmm files (the documentation of hmmpress seems to cover only one .hmm file at a time).


Based on that reply it seems that all profiles should be concatenated first! Thank you.


Yes, all HMMs should first be concatenated into a single file, which I arbitrarily named DB.hmm in the example above, and then hmmpress-ed before use. After pressing, several files with .h3? extensions will be created, but the database should still be referred to in commands by its root name (DB.hmm).
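For the concatenation step, something along these lines should work. It is only a sketch: concat_profiles is a made-up helper name and the directory name in the usage note is a placeholder. It relies on HMMER3 profile files being plain text with one NAME line per profile, so the final count doubles as a sanity check on how many profiles ended up in the combined file.

```shell
# Sketch only: concat_profiles is a hypothetical helper, not an HMMER command.
# Prints the number of profiles written to the combined file.
concat_profiles() {
    # $1 = directory holding the individual .hmm files
    # $2 = combined output file; keep it outside $1 so it is not read back in
    # find/sort/xargs avoids "argument list too long" with thousands of files
    # and keeps the profile order reproducible.
    find "$1" -maxdepth 1 -type f -name '*.hmm' -print0 \
        | sort -z \
        | xargs -0 cat > "$2"
    # Every HMMER3 profile carries exactly one NAME line.
    grep -c '^NAME' "$2"
}
```

For example, concat_profiles profiles DB.hmm would merge everything in a directory called profiles, after which hmmpress DB.hmm creates the index files and the hmmscan command above can be run as shown.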

If my answers helped or completely solved your problem, consider upvoting or accepting them.
