Question

Does nhmmer look at reverse complements?

1

Entering edit mode

8.0 years ago

Steven Lakin ★ 1.8k

nhmmer is the DNA pipeline version of hmmsearch within the HMMER software. Recently, I've been comparing two paired-end Illumina libraries with the following:

Trimmomatic
bwa aln, forward/reverse trimmed reads to database of known reference genes
Parsing the SAM file for mapped reads (samtools -F 4) and analyzing those

I ran the same raw read files through nhmmer and analyzed the intersection of the alignment and HMMER pipelines. When I look at reads that aligned but were not called by HMMER, bit 16 (reverse strand) is set on virtually all of them. Additionally, when the read information is pulled from the SAM file (where it has been appropriately reverse complemented) and re-run through HMMER, nhmmer does find ~70% of them.

So it appears that HMMER doesn't consider the reverse complement of the database being queried, though it does consider the reverse orientation when the query is of the same strandedness with respect to the database. This is in contrast to the manual, which claims that it does search both strands of the database, but this doesn't explain the findings here.

Has anyone had similar experiences, or does someone have inside knowledge on how or if nhmmer considers reverse complements?

alignment hmmer sam ngs • 2.3k views

ADD COMMENT • link 8.0 years ago by Steven Lakin ★ 1.8k

score 1 · Accepted Answer · 2016-05-12

1

Entering edit mode

8.0 years ago

Steven Lakin ★ 1.8k

HMMER does in fact do this correctly; it was an issue with an intermediate step written by myself. Lesson learned; do more internal testing on code.

ADD COMMENT • link 8.0 years ago by Steven Lakin ★ 1.8k