Question: (Closed) How to use awk to look for the lowest e-value field?
jxi21 • 0 wrote:
Hello! I am trying to parse some results produced by HMMER, and from the tblout file I was able to isolate the matches I want.
However, the same read is repeated several times if it matches more than one profile.
For example, the read SRR6033660.1458607 below is repeated 3 times:
SRR6033660.161030 FAM007172 4e-15 4.2e-15 63.4 63.4
SRR6033660.1458607 FAM019859 2.5e-12 2.7e-12 55.0 54.9
SRR6033660.1458607 FAM015326 4e-14 4.2e-14 58.8 58.7
SRR6033660.1458607 FAM000764 7.5e-25 8.1e-25 94.6 94.5
It matches 3 families, but I just want to select the row with the lowest e-values (3rd and 4th columns).
How can I write an awk command that gives me this output?
SRR6033660.161030 FAM007172 4e-15 4.2e-15 63.4 63.4
SRR6033660.1458607 FAM000764 7.5e-25 8.1e-25 94.6 94.5
Thanks!
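A minimal awk sketch of the selection described above, assuming whitespace-separated columns, the read ID in column 1, and the full-sequence e-value in column 3 as the value to minimise. The function name best_hit and the file name hits.txt are hypothetical, chosen only for illustration. Ties keep the first row seen, and reads are printed in the order they first appear.

```shell
# Keep, for each read ID (column 1), the row with the lowest e-value (column 3).
# Adding 0 forces awk to compare the e-values as numbers rather than strings.
best_hit() {
    awk '
        !($1 in best) || ($3 + 0) < best[$1] {
            if (!($1 in best)) order[++n] = $1   # remember first-seen order
            best[$1] = $3 + 0                    # lowest e-value so far for this read
            row[$1]  = $0                        # the full row carrying that e-value
        }
        END { for (i = 1; i <= n; i++) print row[order[i]] }
    ' "$@"
}

# Usage (hits.txt is a hypothetical file holding the filtered tblout rows):
#   best_hit hits.txt > best_hits.txt
```

To rank by the best-domain e-value instead, replace $3 with $4 throughout.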
Since you're here, sort can do this with the -g option.
Hello jxi21!
We believe that this post does not fit the main topic of this site.
This is an awk question, right? Please search Stack Overflow. On the other hand, if your aim is to pick the entries with the lowest p-value, I'll reopen the question and not restrict the tool to awk unless there's good reason.
For this reason we have closed your question. This allows us to keep the site focused on the topics that the community can help with.
If you disagree please tell us why in a reply below, we'll be happy to talk about it.
Cheers!
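For reference, the sort -g suggestion from the comment above could look like the sketch below. It assumes a sort that supports -g (general numeric comparison, which understands scientific notation; available in GNU sort but not required by POSIX) and the same column layout as in the question. The function name best_hit_sorted and the file name hits.txt are hypothetical.

```shell
# Sort by read ID (column 1), then by e-value (column 3) in ascending numeric
# order, and keep only the first row for each read ID.
# LC_ALL=C makes the ordering and the decimal point independent of the locale.
best_hit_sorted() {
    LC_ALL=C sort -k1,1 -k3,3g "$@" | awk '!seen[$1]++'
}

# Usage (hits.txt is a hypothetical file holding the filtered tblout rows):
#   best_hit_sorted hits.txt > best_hits.txt
```

Unlike a single awk pass, this prints the reads in sorted order rather than in their original order.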