How to extract hits above 90% identity with blastp?
4.0 years ago
ioer0417 ▴ 20

I used this command:

$ blastp -query file.fa -db dbfile.fa -outfmt "7" -evalue 1e-4 -out outfile.fa

But the output file contains a lot of information.

I only need the hits with above 90% identity.

How can I extract only the hits above 90% identity with blastp?

blastp blast protein identity extract • 3.2k views
4.0 years ago
Asaf 10k

The third column of your output file is the percent identity (pident). You can use awk to filter on it:

awk '$3>=90' outfile.fa

And you would probably want to change the output file from outfile.fa to outfile.txt since it's not a fasta file.
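One caveat: with `-outfmt 7` the file also contains comment lines starting with `#`, and some of them can pass a bare `$3>=90` test, because awk falls back to string comparison when the third field is not numeric. Below is a minimal sketch that skips those lines first. The file names `sample_hits.txt` and `filtered_hits.txt` and the rows themselves are made up for illustration, and only the first three columns are shown.

```shell
# Made-up sample in the layout of blastp -outfmt 7 (first three columns only).
printf '%s\n' '# BLASTP 2.9.0+' '# Query: XP_2345' \
  '# Database: dbfile.fa' '# 3 hits found' > sample_hits.txt
printf '%s\t%s\t%s\n' \
  XP_2345 XP_2345 100.00 \
  XP_2345 XP_4567 94.00 \
  XP_2345 XP_9999 45.10 >> sample_hits.txt

# Skip comment lines, then keep rows whose third column (pident) is >= 90.
awk '!/^#/ && $3 >= 90' sample_hits.txt > filtered_hits.txt
cat filtered_hits.txt
```

If you don't need the comment lines at all, running blastp with `-outfmt 6` gives the same table without them.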


Asaf's answer is good, but if you want to do this filtering with BLAST alone, you can use these parameters:

 -perc_identity <Real, 0..100>
   Percent identity
 -qcov_hsp_perc <Real, 0..100>
   Percent query coverage per hsp

You're right, although I've seen some cases in which this filtering failed. Maybe they fixed it in a newer version.


As I understand it, all BLAST filters are applied at an early stage of the algorithm, which means that using the BLAST arguments may produce different results from filtering the output afterwards. There is an excellent series of posts on the blog Blasted Bioinformatics!?, e.g., "What BLAST's max-target-sequences doesn't do". Because of these potential differences, it is often recommended to run BLAST with "lax" parameters and filter the results afterwards.
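A rough sketch of that "lax search, strict filter" approach is below. The blastp call appears only as a comment. The column list passed to `-outfmt`, the 80% coverage cutoff, and the file names `lax_hits.txt` and `strict_hits.txt` are my assumptions for illustration, and the rows are made up.

```shell
# Assumed search step (not run here): lax e-value, explicit output columns.
#   blastp -query file.fa -db dbfile.fa -evalue 1e-4 \
#     -outfmt "6 qseqid sseqid pident qcovhsp evalue" -out lax_hits.txt

# Made-up rows in that column order: qseqid sseqid pident qcovhsp evalue
printf '%s\t%s\t%s\t%s\t%s\n' \
  XP_2345 XP_2345 100.00 100 0.0 \
  XP_2345 XP_4567 94.00 97 1e-150 \
  XP_2345 XP_7777 95.00 30 2e-20 \
  XP_2345 XP_9999 45.10 88 3e-08 > lax_hits.txt

# Filter afterwards: pident (column 3) >= 90 and qcovhsp (column 4) >= 80.
awk -F'\t' '$3 >= 90 && $4 >= 80' lax_hits.txt > strict_hits.txt
cat strict_hits.txt
```

Keeping the unfiltered table around also means you can change the cutoffs later without rerunning the search.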


Exactly, thanks for the references. I had a vague memory of this explanation.


So why does blastn have these parameters while blastp does not have the "-perc_identity" parameter?


Thank you!

I have another question, if I may.

I need results like data1.

I want results like data2 and data3 to be deleted.

How can I do this?

My result file looks like this:

data1

Query: XP_1234

Database: data.fasta

Fields : query

1111 hits

XP_1234 XP_1234 100.00

data2

Query: XP_2345

Database: data.fasta

Fields : query

1231 hits

XP_2345 XP_2345 100.00

XP_2345 XP_3456 100.00

XP_2345 XP_4567 94.00

data3

Query: XP_3456

Database: data.fasta

Fields : query

2343 hits

XP_3456 XP_2345 100.00

XP_3456 XP_6783 94.00

XP_3456 XP_8362 92.33


I'm sorry, I couldn't understand you
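The request is ambiguous, so this is only a guess. One possible reading is: keep queries whose only hit at 90% identity or above is themselves (like data1), and drop queries that also hit other sequences (like data2 and data3). Under that assumption, a two-pass awk sketch could look like this. The file names `hits.txt` and `self_only.txt` are hypothetical, and the rows are copied from the example above.

```shell
# Rows from the example above: query, subject, pident.
printf '%s\t%s\t%s\n' \
  XP_1234 XP_1234 100.00 \
  XP_2345 XP_2345 100.00 \
  XP_2345 XP_3456 100.00 \
  XP_2345 XP_4567 94.00 \
  XP_3456 XP_2345 100.00 \
  XP_3456 XP_6783 94.00 \
  XP_3456 XP_8362 92.33 > hits.txt

# Pass 1: mark queries that hit any other sequence at >= 90% identity.
# Pass 2: print only the rows of queries that were never marked.
awk 'NR == FNR { if (!/^#/ && $1 != $2 && $3 >= 90) other[$1] = 1; next }
     !/^#/ && !($1 in other)' hits.txt hits.txt > self_only.txt
cat self_only.txt
```

The file is read twice on purpose, so that a query is dropped even when its self-hit comes before its hits to other sequences.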
