Specify a BLOSUM matrix in multiple sequence alignment using ClustalwCommandline
2.0 years ago
rezaeir75 ▴ 10

I want to use a specific BLOSUM matrix for multiple sequence alignment (MSA) in Biopython. Biopython offers several options for MSA; I used ClustalW through ClustalwCommandline. Its pwmatrix option has no keyword argument for choosing a particular BLOSUM matrix. The only option I can see is to give the path to a matrix file, and when I pass the path to blosum62.cmp, the run fails with 'WARNING: residue i in matrix blosum62.cmp not recognised'. I don't know how to specify a particular BLOSUM matrix for the alignment. My goal is to run the alignment with a specific user-defined substitution matrix, so if anyone knows another Python module that can do this, I would appreciate it if you could mention its name. My code:

from Bio.Align.Applications import ClustalwCommandline
cline = ClustalwCommandline('clustalw', infile = 'opuntia.fasta')
cline.pwmatrix = 'blosum62.cmp'
cline()


output:

ApplicationError                          Traceback (most recent call last)
<ipython-input-30-c418dc64da4a> in <module>
----> 1 cline()

~/anaconda3/lib/python3.7/site-packages/Bio/Application/__init__.py in __call__(self, stdin, stdout, stderr, cwd, env)
526         if return_code:
527             raise ApplicationError(return_code, str(self),
--> 528                                    stdout_str, stderr_str)
529         return stdout_str, stderr_str
530

ApplicationError: Non-zero return code 1 from 'clustalw -infile=opuntia.fasta -pwmatrix=blosum62.cmp', message
'WARNING: residue i in matrix blosum62.cmp not recognised'
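
One thing that may be worth checking is the matrix file format. As far as the ClustalW documentation suggests, user-supplied matrices are expected in the plain-text layout used by BLAST (a header row of residue letters, then one row of scores per residue), whereas a .cmp file is probably in the GCG layout. The sketch below only illustrates that BLAST-style layout with the standard library. ALPHABET, toy_score, format_blast_matrix, clustalw_args and my_matrix.txt are hypothetical names, the scores are placeholders rather than real BLOSUM values, and whether ClustalW accepts this exact file is an assumption that has not been tested here.

```python
from pathlib import Path

# The 20 standard amino acids; real BLOSUM files usually also list B, Z, X and *.
ALPHABET = "ARNDCQEGHILKMFPSTWYV"


def toy_score(a, b):
    """Hypothetical scoring rule for illustration only (not BLOSUM values)."""
    return 4 if a == b else -1


def format_blast_matrix(alphabet, score):
    """Return a BLAST-style text matrix: comment, header row, one row per residue."""
    header = "  " + " ".join(f"{b:>3}" for b in alphabet)
    rows = [
        f"{a} " + " ".join(f"{score(a, b):>3}" for b in alphabet)
        for a in alphabet
    ]
    return "\n".join(["# toy user-defined matrix", header, *rows]) + "\n"


def clustalw_args(infile, matrix_path):
    """Build the command line from the error message above as an argv list."""
    return ["clustalw", f"-infile={infile}", f"-pwmatrix={matrix_path}"]


if __name__ == "__main__":
    path = Path("my_matrix.txt")  # hypothetical file name
    path.write_text(format_blast_matrix(ALPHABET, toy_score))
    # Print the command instead of running it, since clustalw may not be installed.
    print(" ".join(clustalw_args("opuntia.fasta", path)))
```

To try it with real values, toy_score could be swapped for a lookup into an actual BLOSUM table; the argv list can then be passed to subprocess.run, or the file path assigned to cline.pwmatrix as in the code above.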

alignment • 786 views