Speed of hmmsearch
8.2 years ago
kentnf ▴ 10

Hi,

I used hmmbuild to build an HMM file containing 89 domains, and I am searching all Arabidopsis proteins against those 89 domains with hmmsearch:

hmmsearch -Z 2000000 --domZ 89 --cpu 20 -o output.hmmsearch.txt stockholm89.hmm at_pep

The search finishes in less than 2 minutes.

To speed up the search, I selected only the 1,300 proteins of interest and ran the same command on them. But that run takes about 20 minutes.

I am using the latest version of HMMER. Does anyone know what the problem is? Is this a bug in hmmsearch?

Thanks

software error • 2.9k views

Are you saying that this took 2 minutes:

hmmsearch -Z 2000000 --domZ 89 --cpu 20 -o output.hmmsearch.txt stockholm89.hmm at_pep_N_seqs

While this took 20 minutes:

hmmsearch -Z 2000000 --domZ 89 --cpu 20 -o output.hmmsearch.txt stockholm89.hmm at_pep_1300_seqs

where at_pep_1300_seqs is a subset of 1,300 sequences taken from the N sequences in at_pep_N_seqs? If yes, that sounds really weird.

BTW, if you have enough RAM and fast I/O, the approach below can be a lot faster than what you're doing. Split the input file into 20 parts and then run the following (GNU parallel has to be in $PATH):

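# Run hmmsearch on one split file with a single thread;
# GNU parallel supplies the concurrency across files.
hmmer() {
    hmmsearch -Z 2000000 --domZ 89 --cpu 1 -o "$1.output.hmmsearch.txt" stockholm89.hmm "$1"
}

# Export the function so the bash shells started by parallel can call it.
export -f hmmer
find /where/the/split/files/are/ -maxdepth 1 -type f -name "*specific2splitFiles" | parallel -j 20 hmmer {}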
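
For the splitting step itself, something like the following might work. It is only a minimal sketch and assumes the input is a plain FASTA file in which every record starts with a ">" header line. The function name split_fasta and the .partNN.fa output naming are made up for illustration, so the -name pattern given to find would have to be adjusted to match whatever names you choose.

```bash
#!/usr/bin/env bash
# Hypothetical helper: distribute FASTA records round-robin over N part files.
# Usage: split_fasta INPUT_FASTA NUM_PARTS OUTPUT_PREFIX
split_fasta() {
    local input=$1 parts=$2 prefix=$3
    awk -v n="$parts" -v prefix="$prefix" '
        # A header line starts a new record: pick the next part file in turn.
        /^>/ {
            idx = (count++ % n) + 1
            out = sprintf("%s.part%02d.fa", prefix, idx)
        }
        # Write the current line to the selected part file.
        # Lines that appear before the first header are skipped.
        out != "" { print > out }
    ' "$input"
}

# Example call (file names are assumptions):
#   split_fasta at_pep 20 /where/the/split/files/are/at_pep
```

Round-robin assignment keeps the parts at roughly the same number of sequences, so the 20 single-threaded jobs should finish at about the same time, provided sequence lengths are not too uneven.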