Question: Help to filter blastp results
guillaume.rbt (France) wrote 2.8 years ago:

Hi all,

I'm trying to BLAST two sets of proteins against each other to find similarities.

I'm using this command to do so: blastall -d set1.fasta -i set2.fa -p blastp -m 9 -e 0.01 -o results.blast

As the two sets are from the same species, I would like to filter the results to keep only hits with > 99% identity where the query and subject have the same length. After filtering on % identity, I sometimes get results like this one:

Query id, Subject id, % identity, alignment length, mismatches, gap openings, q. start, q. end, s. start, s. end, e-value, bit score

protein_1 protein_2 100.00 76 0 0 1 76 1 76 3e-46 154

protein_1 protein_2 100.00 76 0 0 77 152 1 76 3e-46 154

protein_1 protein_2 100.00 76 0 0 153 228 1 76 3e-46 154

protein_1 protein_2 100.00 76 0 0 229 304 1 76 3e-46 154

Here, four parts of protein_1 match the same sequence of protein_2. Since I only want hits between proteins of the same length, I would like to filter out this kind of result, but I don't know how. Does anyone know a parameter that could do that, or a way to filter the result file?

Thanks,


That table doesn't include the query and subject sequence lengths, so it's not possible to filter on them from this output alone. With blast+ you can include qlen and slen in your output rows. I don't know whether you can do that with legacy blast.

written 2.8 years ago by 5heikki
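
For anyone landing here later, a minimal sketch of the filtering step once qlen and slen are in the output. It assumes a blast+ tabular file with the 14 columns listed in the comment below; the function name filter_hits, the protein_3/protein_4 row, and the qlen/slen values appended to the rows from the question are made up for illustration.

```python
import csv
import io

# Column order assumed to match a blast+ run along these lines:
#   blastp -query set2.fa -db set1.fasta -evalue 0.01 -out results.tsv \
#     -outfmt "6 qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore qlen slen"
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore",
           "qlen", "slen"]


def filter_hits(handle, min_identity=99.0):
    """Yield hits above min_identity that cover query and subject end to end."""
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(COLUMNS, row))
        qlen, slen = int(hit["qlen"]), int(hit["slen"])
        full_query = int(hit["qstart"]) == 1 and int(hit["qend"]) == qlen
        full_subject = int(hit["sstart"]) == 1 and int(hit["send"]) == slen
        if (float(hit["pident"]) > min_identity
                and qlen == slen and full_query and full_subject):
            yield hit


# Illustrative input: the four rows from the question with assumed lengths
# (qlen=304, slen=76) plus one hypothetical full-length hit.
rows = [
    ["protein_1", "protein_2", "100.00", "76", "0", "0", "1", "76", "1", "76", "3e-46", "154", "304", "76"],
    ["protein_1", "protein_2", "100.00", "76", "0", "0", "77", "152", "1", "76", "3e-46", "154", "304", "76"],
    ["protein_1", "protein_2", "100.00", "76", "0", "0", "153", "228", "1", "76", "3e-46", "154", "304", "76"],
    ["protein_1", "protein_2", "100.00", "76", "0", "0", "229", "304", "1", "76", "3e-46", "154", "304", "76"],
    ["protein_3", "protein_4", "100.00", "120", "0", "0", "1", "120", "1", "120", "1e-80", "240", "120", "120"],
]
text = "\n".join("\t".join(r) for r in rows)
kept = [(h["qseqid"], h["sseqid"]) for h in filter_hits(io.StringIO(text))]
print(kept)
```

With a real file, replace the io.StringIO object with open("results.tsv"). Requiring the alignment to start at position 1 and end at qlen/slen is stricter than comparing lengths alone; drop those two checks if equal length is all you need.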

Thanks, it works well with blast+.

written 2.8 years ago by guillaume.rbt

How large are your two sets? It may be easier to run simple pairwise alignments between the proteins that have the same length. In Biopython you can use the pairwise2 module for this task, e.g. alignment = pairwise2.align.globalxx(seq1, seq2, score_only=True). In this example, the alignment score should equal the length of the protein if the two proteins are 100% identical.

modified 2.8 years ago • written 2.8 years ago by Markus
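
A rough sketch of the same-length idea without BLAST. Note that this swaps pairwise2 for a plain position-by-position comparison, which is only equivalent to an ungapped alignment; it will miss pairs that would need a gap to line up. The helper names (read_fasta, percent_identity, same_length_matches) and the toy sequences are hypothetical.

```python
from collections import defaultdict


def read_fasta(lines):
    """Parse well-formed FASTA lines into a dict of {id: sequence}."""
    records, name = {}, None
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            name = line[1:].split()[0]  # id is the first word of the header
            records[name] = ""
        elif name is not None and line:
            records[name] += line
    return records


def percent_identity(a, b):
    """Ungapped identity between two sequences of equal, non-zero length."""
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def same_length_matches(subjects, queries, min_identity=99.0):
    """Compare each query only against subjects of exactly the same length."""
    by_length = defaultdict(list)
    for name, seq in subjects.items():
        by_length[len(seq)].append((name, seq))
    hits = []
    for qname, qseq in queries.items():
        if not qseq:
            continue  # skip empty records
        for sname, sseq in by_length.get(len(qseq), []):
            identity = percent_identity(qseq, sseq)
            if identity > min_identity:
                hits.append((qname, sname, identity))
    return hits


# Toy data, made up for illustration only.
set1 = read_fasta([">s1", "MKTAYIAKQR", ">s2", "MKTAYIAKQRMKTAYIAKQR"])
set2 = read_fasta([">q1", "MKTAYIAKQR", ">q2", "MKTAYIAKQW"])
hits = same_length_matches(set1, set2)
print(hits)
```

With real files, pass open("set1.fasta") and open("set2.fa") to read_fasta. Grouping by length first keeps the number of comparisons small when the sets are large.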