4.2 years ago
Morgan S. ▴ 80
I have googled this and cannot find any advice or answer. Hopefully, I have not overlooked it. I need phmmer to write only the top hit for each query I provide. Currently, it reports over 100 hits for almost every query, and I have 12,000 queries. It would take me far too much time to sort through all this information. Would it be better to just use BLAST instead, where I can set this threshold? Not sure if it matters, but I set the E-value to 1e-3.
Are you just looking for the longest substring in the sequence on each iteration?
I don't think I understand your question. I used a protein FASTA file made up of all the predicted genes in my genome. When I searched it against the MEROPS database, phmmer gave me over 100 matches for each gene. I only want the top hit from the MEROPS database for each gene, ranked by E-value. Is there a way to do this? I thought --domZ would do the trick, but it didn't. The manual says of --domZ <x>: "Assert that the total number of targets in your searches is <x>, for the purposes of per-domain conditional E-value calculations, rather than the number of targets that passed the reporting thresholds." Here is my script.
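A possible workaround, separate from the script mentioned above: going by the manual text quoted here, --domZ only changes the search-space size used for E-value calculations, so it would not be expected to limit how many hits are reported. One option is to let phmmer report everything to a tabular file and filter afterwards. The sketch below assumes the search was run with the `--tblout` option (for example `phmmer --tblout hits.tbl -E 1e-3 queries.fasta merops.fasta`, where all three file names are hypothetical) and that the file follows the usual whitespace-delimited tblout layout: target name in column 1, query name in column 3, full-sequence E-value in column 5, and score in column 6. It keeps the hit with the lowest E-value per query and breaks ties by higher score; it is an illustration, not a tested pipeline.

```python
import sys


def best_hits(tblout_lines):
    """Keep the lowest full-sequence E-value hit per query.

    Assumes HMMER3 --tblout columns (0-based after splitting on whitespace):
    0 = target name, 2 = query name, 4 = full-sequence E-value, 5 = score.
    Returns a dict: query name -> (target name, E-value, score).
    """
    best = {}
    for line in tblout_lines:
        # Skip blank lines and the '#' header/footer comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        target, query = fields[0], fields[2]
        evalue, score = float(fields[4]), float(fields[5])
        current = best.get(query)
        # Lower E-value wins; on a tie (e.g. several 0.0 values) higher score wins.
        if (
            current is None
            or evalue < current[1]
            or (evalue == current[1] and score > current[2])
        ):
            best[query] = (target, evalue, score)
    return best


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (hypothetical file name): python top_hits.py hits.tbl
    with open(sys.argv[1]) as handle:
        for query, (target, evalue, score) in best_hits(handle).items():
            print(f"{query}\t{target}\t{evalue:g}\t{score:g}")
```

Filtering after the search means the search itself does not have to be rerun if you later decide you want the top 5 hits instead of the top 1.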
Has anybody figured out an answer to this yet? I am looking for the phmmer equivalent of