InterProScan and HMMER - different results
3.1 years ago
yp19 ▴ 70

Hi all!

I've run both InterProScan and HMMER because I'm interested in Pfam hits for a set of proteins. InterProScan was run like so:

./interproscan.sh -i my_prot.faa -f tsv


and then I filtered the results for Pfam hits with E-value < 0.001.
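The filtering step is not shown in the post, so here is a minimal sketch of how it could be done with awk. It assumes the standard InterProScan TSV layout (column 1 = protein ID, column 4 = analysis name, column 9 = E-value or `-`); the function name `ipr_pfam_proteins` and the file names are hypothetical.

```shell
# Sketch: list proteins with at least one Pfam hit below an E-value cutoff
# in an InterProScan TSV file.
# Assumed layout: col 1 = protein ID, col 4 = analysis ("Pfam"),
# col 9 = E-value, or "-" when no value is reported.
ipr_pfam_proteins() {
    # $1 = InterProScan TSV file, $2 = E-value cutoff (e.g. 0.001)
    awk -F'\t' -v max="$2" '
        # "+ 0" forces a numeric comparison for values such as 1.2E-5
        $4 == "Pfam" && $9 != "-" && ($9 + 0) < (max + 0) { print $1 }
    ' "$1" | sort -u
}

# Example call (the TSV file name is an assumption):
# ipr_pfam_proteins my_prot.faa.tsv 0.001 > ipr_proteins.txt
```

Printing only the protein ID and de-duplicating with `sort -u` gives one line per protein, which makes the list directly comparable with a per-protein list from HMMER.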

HMMER was run like so:

hmmscan --tblout hmmer_result.txt -E 0.001 Pfam-A.hmm my_prot.faa


I compared the outputs and, for some reason, InterProScan finds fewer proteins with Pfam hits than HMMER (approximately 150 fewer). I checked, and both are using the most recent Pfam database (32.0), so I'm not sure why this is happening. Any ideas?
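To make the comparison concrete, the sketch below extracts the protein IDs from the hmmscan table and lists those missing from an InterProScan-derived ID list. It assumes the `--tblout` layout written by hmmscan (comment lines start with `#`, column 3 is the query name) and that the InterProScan IDs are already in a sorted file with one ID per line; the function and file names are hypothetical.

```shell
# Sketch: find proteins that hmmscan reports but InterProScan does not.

hmmscan_proteins() {
    # $1 = hmmscan --tblout file; column 3 is the query (protein) name
    awk '!/^#/ { print $3 }' "$1" | sort -u
}

only_in_first() {
    # $1, $2 = sorted ID lists (one ID per line)
    # comm -23 keeps lines that appear only in the first file
    comm -23 "$1" "$2"
}

# Example calls (file names are assumptions):
# hmmscan_proteins hmmer_result.txt > hmmer_proteins.txt
# only_in_first hmmer_proteins.txt ipr_proteins.txt > missing_in_ipr.txt
```

Looking at the hmmscan rows for the IDs in the resulting list (their E-values and bit scores) is one way to see whether the missing proteins share a pattern, such as weak scores.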

Thank you!

interproscan hmmer pfam • 1.5k views

Do you know the exact HMMER command that InterProScan runs? Do you know whether, and how, InterProScan filters its input and output? You may have to dig through the InterProScan logs to find these details.


Thank you. No, I do not know the exact command. I assumed the output was not filtered, since I see some large E-values (e.g. 40). Do you know where I can find these logs? I got as far as https://github.com/ebi-pf-team/interproscan/tree/master/core but I'm not sure where to go from here.


My guess is that the multiple-testing correction is different: InterProScan probably scans more profiles and so applies a stricter correction. Can you check how the E-values from the two tools correspond? Are you losing the high E-value results?
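One way to check the correspondence is to put the two E-values side by side for every protein/Pfam pair reported by both tools. The sketch below assumes the hmmscan `--tblout` layout (column 2 = versioned Pfam accession, column 3 = query, column 5 = full-sequence E-value) and the InterProScan TSV layout (column 1 = protein, column 4 = analysis, column 5 = Pfam accession, column 9 = E-value). Whether the two columns report the same kind of E-value (full sequence versus per domain) is an assumption to verify; the function name is hypothetical.

```shell
# Sketch: print "protein  Pfam  hmmscan_Evalue  InterProScan_Evalue"
# for every protein/Pfam pair found in both outputs.
compare_evalues() {
    # $1 = hmmscan --tblout file, $2 = InterProScan TSV file
    awk '
        FNR == NR {                          # first file: hmmscan tblout
            if ($0 ~ /^#/) next              # skip comment lines
            acc = $2
            sub(/\..*$/, "", acc)            # PF00001.23 -> PF00001
            hmm[$3 SUBSEP acc] = $5          # full-sequence E-value
            next
        }
        {                                    # second file: InterProScan TSV
            split($0, f, "\t")
            key = f[1] SUBSEP f[5]
            if (f[4] == "Pfam" && (key in hmm))
                print f[1] "\t" f[5] "\t" hmm[key] "\t" f[9]
        }
    ' "$1" "$2"
}

# Example call (file names are assumptions):
# compare_evalues hmmer_result.txt my_prot.faa.tsv > evalue_pairs.tsv
```

If the difference were only a search-space correction, the two E-value columns would be expected to differ by a roughly constant factor; pairs that are absent from this table altogether would point to filtering rather than rescaling.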


Thank you for the suggestion. Yes, it looks like I am losing the high E-value results. However, only 30 of these proteins have large (> 0.001) E-values, and I am missing 150 proteins in total compared with HMMER. Perhaps InterProScan applies some filtering on the E-values (after multiple-testing correction).


Please do not delete posts. The purpose of this site is twofold: in the short term, to help people with their questions; in the long run, to serve as a repository of knowledge. The second purpose is defeated if people delete their questions.