Any tool to generate a protein similarity matrix?
10.0 years ago
Zealseeker • 0

Hi,

I want to calculate the pairwise similarity between every pair of proteins in a set and then run a random walk to predict interactions. One paper cites "J. Mol. Biol. (1981) 147, 195-197" as its method for comparing two proteins, but I am afraid that method may be obsolete.

I'd like to use a popular tool such as BLAST or FASTA.

Is there a good toolkit that can generate such a matrix? Similarity alone is enough.

Example    Protein1  Protein2  Protein3
Protein1   1         0.9       0.4
Protein2   0.9       1         0.7
Protein3   0.4       0.7       1

As far as I know, BLAST only reports the identity between two proteins, whereas FASTA also reports the similarity (though I don't know why). I also think FASTA is faster.

That raises another problem: how do I run FASTA from Python? fasta35.exe (on Windows) is interactive, as follows.

>>fasta35 q.fasta lib.fasta
>>Enter filename for results []
>>output.txt
>>How many scores would you like to see? [20]

I'd prefer that it write out the complete results without needing any input on stdin, so that I can then mine the results with Python.
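One way to avoid the prompts is to call the program through Python's subprocess module with its quiet-mode option. The sketch below assumes the FASTA3 options -q (quiet, no prompts) and -b (number of scores to report), placed before the file names; check them against the help output of your installed version. The function names build_fasta_command and run_fasta, and the default of 1000 scores, are made up for illustration.

```python
import subprocess


def build_fasta_command(query, library, exe="fasta35", n_scores=1000):
    """Build an argument list for a non-interactive FASTA run.

    Assumed options (verify against your FASTA version):
      -q  quiet mode, do not prompt for any input
      -b  number of scores to report
    Options go before the query and library file names.
    """
    return [exe, "-q", "-b", str(n_scores), query, library]


def run_fasta(query, library, out_path, **kwargs):
    """Run FASTA and write whatever it prints on stdout to out_path."""
    cmd = build_fasta_command(query, library, **kwargs)
    with open(out_path, "w") as out:
        # stdin is closed off so the program cannot wait for keyboard input
        subprocess.run(cmd, stdout=out, stdin=subprocess.DEVNULL, check=True)
    return out_path
```

Calling run_fasta("q.fasta", "lib.fasta", "output.txt") would then leave the report in output.txt for parsing, provided the executable is on the PATH.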

sequence blast • 4.6k views
10.0 years ago
Joseph Hughes ★ 3.0k

I would use clustalo with the following command:

clustalo -i my-in-seqs.fa -o my-out-seqs.fa -v --distmat-out=output_mat.txt

The --distmat-out option writes a pairwise distance matrix covering every protein sequence in your input FASTA file.

Hope this helps,

Joseph


It's a good toolkit and now I'm learning to use it.

The distances range from 0, meaning identical, to 1, meaning totally different. Most of the distances I get are larger than 0.8...
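Since the distances run from 0 to 1, one simple way to get the similarity matrix asked for in the question is to take 1 minus each distance. The sketch below assumes the matrix file is laid out as a first line holding the number of sequences, followed by one line per sequence with its name and then its distances. The conversion 1 - d is my own choice, not something Clustal Omega defines, and the sample values are invented to mirror the example table in the question. Depending on the Clustal Omega version, the --full option may also be needed for the matrix file to be written.

```python
def parse_distmat(text):
    """Parse a distance matrix: a count line, then 'name d1 d2 ... dN' rows."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    n = int(lines[0].split()[0])
    names, dist = [], []
    for ln in lines[1:n + 1]:
        fields = ln.split()
        names.append(fields[0])
        dist.append([float(x) for x in fields[1:n + 1]])
    return names, dist


def to_similarity(dist):
    """Assumed conversion: similarity = 1 - distance for every entry."""
    return [[1.0 - d for d in row] for row in dist]


# Hypothetical sample, not real clustalo output
sample = """3
Protein1 0.000000 0.100000 0.600000
Protein2 0.100000 0.000000 0.300000
Protein3 0.600000 0.300000 0.000000
"""

names, dist = parse_distmat(sample)
sim = to_similarity(dist)
for name, row in zip(names, sim):
    print(name, " ".join(f"{s:.1f}" for s in row))
```

For a real run, read the file given to --distmat-out (output_mat.txt in the command above) and pass its contents to parse_distmat.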
