Question: How to get the number of hits from each blastp output file
fec2 wrote, 12 months ago:

Hi all,

I have multiple blastp output files (format 6) in a directory, and I wish to count the hits with sequence identity of more than 40% in each output file. I have tried:

for i in *.tsv; do awk '$3>=40' "$i" | wc -l; done

However, this command only gives me a list of numbers in the terminal, without showing which blastp output file each number belongs to. Is there any modification I can make so that each number is matched to its file? Thank you.

SMK wrote, 12 months ago:
Hi fec2,

Try:

for i in *.tsv; do echo -ne "${i}\t" && awk '$3>=40{print $2}' "${i}" | sort -u | wc -l; done

It should return something like:

blast_out1.tsv  217
blast_out2.tsv  172
blast_out3.tsv  215
written 12 months ago by SMK
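
A variation on the loop above, offered as a rough sketch rather than a definitive solution. The loop above counts unique subject IDs (column 2), whereas the command in the question counts alignment lines, so this version reports both. The function name count_hits and the three-column output layout are hypothetical choices made for illustration; it assumes tab-separated format 6 files with percent identity in column 3.

```shell
#!/usr/bin/env bash
# count_hits is a hypothetical helper name, not part of BLAST or this thread.
# Usage: count_hits MIN_IDENTITY FILE...
# For each file, prints: file<TAB>matching_lines<TAB>unique_subject_ids
count_hits() {
    local min_id=$1
    shift
    local f
    for f in "$@"; do
        # Print the file name first, then let awk append the two counts.
        printf '%s\t' "$f"
        awk -F'\t' -v min="$min_id" '
            # Column 3 of format 6 is percent identity; keep lines at or above the cutoff.
            $3 + 0 >= min + 0 {
                rows++
                # Column 2 is the subject ID; count each one only once.
                if (!($2 in seen)) { seen[$2] = 1; subjects++ }
            }
            END { printf "%d\t%d\n", rows, subjects }
        ' "$f"
    done
}
```

Running count_hits 40 *.tsv > output.txt in the directory holding the blastp files would write one line per file. Change the comparison to a strict greater-than if hits at exactly 40% should be excluded.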

Thank you very much! Is it possible to get the result in an output file?

written 12 months ago by fec2

You're welcome. Try this:

(for i in *.tsv; do echo -ne "${i}\t" && awk '$3>=40{print $2}' "${i}" | sort -u | wc -l; done) > output.txt
written 12 months ago by SMK

The command works well, thank you again!

written 12 months ago by fec2

Hi fec2,

If an answer was helpful, you should upvote it; if it resolved your question, you should mark it as accepted.

written 12 months ago by lieven.sterck

Hi, thanks for your comment. I have accepted the answer. Thank you.

written 12 months ago by fec2

Hi,

May I know where I can find the manual that explains what all these commands mean? I am new to this field and have no clue where to find this information. I really appreciate your help.

written 11 months ago by fec2

Hello fec2,

You can use man echo, man awk, man sort, and man wc. I'd recommend "Ch3. Remedial Unix Shell" and "Ch7. Unix Data Tools" in the book: Bioinformatics Data Skills by Vince Buffalo. :-)

written 11 months ago by SMK
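
One detail worth adding, as a small sketch that assumes the commands are run in bash: there, echo is a shell builtin, so help echo describes the version the loops above actually run (including the -n and -e options), while man echo describes the standalone program. The snippet below shows how to check which kind of command each tool is before looking up its documentation.

```shell
#!/usr/bin/env bash
# Assumes bash; `type -t` and `help` are bash builtins and may be missing in other shells.

# Report what kind of command echo is (in bash this is a builtin).
type -t echo

# Show the builtin's own help text, which covers the -n and -e options.
help echo

# awk, sort and wc are external programs; command -v prints where each one is found.
for tool in awk sort wc; do
    command -v "$tool"
done
```

For the external programs, man awk, man sort, and man wc remain the places to look.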