5.2 years ago
fec2
Hi all,
I have multiple blastp output files (format 6) in a directory, and I want to count the hits with sequence identity of more than 40% in each output file. I have tried:
for i in *.tsv; do awk '$3>=40' "$i" | wc -l; done
However, this command only prints a list of numbers in the terminal, without showing which blastp output file each number belongs to. How can I modify it so that each count is matched with its file? Thank you.
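One way to pair each count with its file name is sketched below. This is a minimal illustration, not necessarily the answer originally given in this thread. It assumes tab-separated format 6 files with percent identity in column 3; `count_hits` and `min_identity` are hypothetical names chosen for the example. It keeps the `>=` comparison from the command above, so change it to `>` if you want strictly more than 40%.

```shell
# count_hits (hypothetical helper): print "file<TAB>count" for each input file.
#   $1        = minimum percent identity (compared against column 3)
#   $2 ...    = blastp tabular (format 6) files
count_hits() {
    min_identity=$1
    shift
    for f in "$@"; do
        # Let awk do the counting; "n + 0" prints 0 when no line matches.
        n=$(awk -F'\t' -v min="$min_identity" '$3 >= min { n++ } END { print n + 0 }' "$f")
        printf '%s\t%s\n' "$f" "$n"
    done
}

# Example call (run in the directory holding the files):
#   count_hits 40 *.tsv
```

Counting inside awk instead of piping to `wc -l` keeps the number free of the padding some `wc` versions add, so it sits cleanly next to the file name.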
Thank you very much! Is it possible to get the result in an output file?
You're welcome. Try this:
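(The command posted with this reply is not preserved in this copy of the thread. As a stand-in, here is a minimal sketch of one way to write the per-file counts to an output file. It assumes tab-separated format 6 input with percent identity in column 3; `write_hit_counts` and the output name `hit_counts.txt` are hypothetical.)

```shell
# write_hit_counts (hypothetical helper): write "file<TAB>count" lines to a file.
#   $1        = output file
#   $2 ...    = blastp tabular (format 6) files
write_hit_counts() {
    out=$1
    shift
    awk -F'\t' '
        # First line of each input file: remember the file, start its count at 0.
        # Completely empty files never reach this rule, so they are not listed.
        FNR == 1 { files[++nfiles] = FILENAME; count[FILENAME] = 0 }
        # Column 3 of format 6 is percent identity.
        $3 >= 40 { count[FILENAME]++ }
        # Report in the order the files were given.
        END { for (i = 1; i <= nfiles; i++) printf "%s\t%d\n", files[i], count[files[i]] }
    ' "$@" > "$out"
}

# Example call:
#   write_hit_counts hit_counts.txt *.tsv
```

Give the output file an extension other than `.tsv` (or put it in another directory) so it is not picked up by `*.tsv` on a later run.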
The command works well, thank you again!
Hi fec2,
If an answer was helpful, you should upvote it; if it resolved your question, you should mark it as accepted.
![Upvote|Bookmark|Accept](https://image.ibb.co/dfsqrx/upvote_accept_bookmark.png)
Hi, thanks for your comment. I have accepted the answer. Thank you.
Hi,
May I know where I can find the manual that explains what all these commands mean? I am new to this field and have no clue where to find this information. I really appreciate your help.
Hello fec2,
You can use `man echo`, `man awk`, `man sort`, and `man wc`. I'd recommend "Ch3. Remedial Unix Shell" and "Ch7. Unix Data Tools" in the book: Bioinformatics Data Skills by Vince Buffalo. :-)