Question: How to get the number of hits from blastp output files
fec2 wrote (26 days ago):

Hi all,

I have multiple blastp output files (format 6) in a directory, and I wish to count the number of hits with sequence identity of more than 40% in each output file. I have tried:

for i in *.tsv; do awk '$3>=40' "$i" | wc -l; done

However, this command only prints a list of numbers in the terminal, without showing which blastp output file each number belongs to. Is there any modification I can make so that I know which file each number belongs to? Thank you.

genome • 103 views
modified 26 days ago by SMK • written 26 days ago by fec2
SMK (Ghent, Belgium) wrote (26 days ago):

Hi fec2,

Try:

for i in *.tsv; do echo -ne "${i}\t" && awk '$3>=40{print $2}' "${i}" | sort -u | wc -l; done

It should return something like:

blast_out1.tsv  217
blast_out2.tsv  172
blast_out3.tsv  215
modified 18 days ago • written 26 days ago by SMK
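
Note that the command in this answer prints column 2 and passes it through sort -u, so it counts distinct subject IDs per file rather than alignment rows. A subject with several rows above the threshold is counted once. The following is a minimal sketch that reports both numbers side by side, assuming tab-separated format 6 output with percent identity in column 3. The function name count_hits and the sample file are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints "file<TAB>rows<TAB>unique_subjects" for one file.
# Arguments: file path, optional minimum percent identity (default 40).
count_hits() {
    local file="$1" min_id="${2:-40}"
    awk -F'\t' -v min="$min_id" '
        # rows: every line passing the threshold; uniq: first time a subject ID is seen
        $3 >= min { rows++; if (!seen[$2]++) uniq++ }
        END { printf "%s\t%d\t%d\n", FILENAME, rows, uniq }
    ' "$file"
}

# Tiny made-up input: query, subject, percent identity.
tmpdir=$(mktemp -d)
printf 'q1\ts1\t95.0\nq1\ts1\t45.5\nq1\ts2\t39.9\nq2\ts3\t40.0\n' > "$tmpdir/demo.tsv"

count_hits "$tmpdir/demo.tsv"
```

On the sample above, three rows pass the threshold but they cover only two distinct subjects (s1 and s3), so the two counts differ. Which one is the "number of hits" depends on what you want to report.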

Thank you very much! Is it possible to get the result in an output file?

written 26 days ago by fec2

You're welcome. Try this:

(for i in *.tsv; do echo -ne "${i}\t" && awk '$3>=40{print $2}' "${i}" | sort -u | wc -l; done) > output.txt
written 26 days ago by SMK
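
A variation on the redirect above, as a sketch only: the loop is wrapped in a brace group so that a header line and the per-file counts land in the same file. The working directory, input files, column header, and the name summary.txt are all made up for this example. The output name is chosen so that the *.tsv glob does not match it.

```shell
#!/usr/bin/env bash
# Made-up inputs in a scratch directory so the example is self-contained.
workdir=$(mktemp -d)
printf 'q1\ts1\t95.0\nq1\ts2\t41.2\n' > "$workdir/blast_out1.tsv"
printf 'q1\ts1\t30.0\nq2\ts4\t88.8\n' > "$workdir/blast_out2.tsv"

cd "$workdir" || exit 1
{
    printf 'file\tsubjects_ge40\n'
    for i in *.tsv; do
        # Count distinct subject IDs (column 2) with identity >= 40 (column 3).
        # tr strips the padding that some wc implementations add.
        n=$(awk '$3 >= 40 { print $2 }' "$i" | sort -u | wc -l | tr -d ' ')
        printf '%s\t%s\n' "$i" "$n"
    done
} > summary.txt

cat summary.txt
```

Replacing `> summary.txt` with `| tee summary.txt` would show the table in the terminal and save it at the same time.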

The command works well, thank you again!

written 26 days ago by fec2

Hi fec2,

If an answer was helpful, you should upvote it; if the answer resolved your question, you should mark it as accepted.

modified 26 days ago • written 26 days ago by lieven.sterck

Hi, thanks for your comment. I have accepted the answer. Thank you.

modified 25 days ago • written 25 days ago by fec2

Hi,

May I know where I can find a manual explaining what all these commands mean? I am new to this field and have no clue where to find this information. I really appreciate your help.

written 12 days ago by fec2

Hello fec2,

You can use man echo, man awk, man sort, and man wc. I'd recommend "Ch3. Remedial Unix Shell" and "Ch7. Unix Data Tools" in the book: Bioinformatics Data Skills by Vince Buffalo. :-)

modified 12 days ago • written 12 days ago by SMK
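
Alongside the manual pages, it can help to run the pipeline one stage at a time on a small file and look at what each command contributes. The sketch below does that with a made-up three-line input; the variable names are arbitrary. In bash, echo is a shell builtin, so `help echo` describes the -n and -e options used in the answer.

```shell
#!/usr/bin/env bash
# Made-up sample in BLAST tabular layout: query, subject, percent identity.
sample=$(mktemp)
printf 'q1\ts1\t95.0\nq2\ts1\t60.0\nq3\ts2\t12.5\n' > "$sample"

# Stage 1: awk keeps rows whose 3rd column is >= 40 and prints the 2nd column.
stage1=$(awk '$3 >= 40 { print $2 }' "$sample")
printf 'after awk:\n%s\n' "$stage1"

# Stage 2: sort -u sorts the lines and drops duplicates.
stage2=$(printf '%s\n' "$stage1" | sort -u)
printf 'after sort -u:\n%s\n' "$stage2"

# Stage 3: wc -l counts the remaining lines (tr removes any padding).
stage3=$(printf '%s\n' "$stage2" | wc -l | tr -d ' ')
printf 'after wc -l: %s\n' "$stage3"
```

Here awk passes two lines (both s1), sort -u reduces them to one, and wc -l reports that single line.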