Something that should be simple has been very difficult for me to resolve over the past few weeks. I am using phmmer to search my ~12,000 fungal gene predictions against the MEROPS database. The issue is that some of my gene queries return over 1,000 hits each, which makes it impossible to sort through all the queries to find their top hits. There are so many hits that they do not all fit in one Excel spreadsheet.
**Edit:** Here is a subset of what my table looks like. As you scroll down, you can see that the first gene alone has almost 1,000 hits.
After I contacted the developer, he suggested redirecting the main output to /dev/null so that only the top hit of each query remains. He said the command should look like this:
phmmer --tblout 1371E_merops5.tbl /work/Geomicrobiology/msobol/IODP_329_SPG/1371E14H2/maker/1371E_uni_snap.maker.output/1371E_uni_snap.renamed.maker.proteins.fasta /work/Geomicrobiology/msobol/databases/pepunit.lib > /dev/null; head -4 1371E_merops5.tbl
However, this still does not work, and the developer says the problem lies with the command line rather than with the program. Does anyone here have experience with this?
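For reference, what I am after could be sketched roughly as below. This is only a minimal sketch, not something the developer gave me: it assumes the `--tblout` file has the query name in the third whitespace-separated column, that comment lines start with `#`, and that the hits for each query are listed best-first, so the first row seen for a query is its top hit. The function name `top_hits` is made up for illustration.

```shell
# Hypothetical helper: print only the first (assumed best) hit per query
# from an HMMER --tblout file given as the first argument.
top_hits() {
  awk '
    /^#/   { next }                           # skip header/footer comment lines
    NF < 5 { next }                           # skip blank or truncated lines
    !($3 in seen) { seen[$3] = 1; print }     # column 3 = query name; keep first row only
  ' "$1"
}

# Example call (output file name is an assumption):
# top_hits 1371E_merops5.tbl > 1371E_merops5.top_hits.tbl
```

If the best-first ordering assumption does not hold, the rows would need to be compared on the E-value column instead of simply keeping the first one.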
Thanks in advance! Morgan