Question: Passing Alignment Parameters Through ClustalwCommandline
9.3 years ago by Thaman (3.2k), Finland, wrote:

I am doing more work on multiple sequence alignment using the ClustalW wrapper in Biopython, and I would like to know how to pass parameters through ClustalwCommandline, which I have repeatedly failed to do.

The parameters I want to set are:

  • Gap open penalty, gap extension penalty, no end gap (yes, no), gap distance, weight matrix (BLOSUM, PAM, etc.), type (DNA, protein) and other optional parameters.

Right now I am working with the default alignment settings provided by ClustalW. Since the defaults do not meet my needs, I would like to add parameters to the lines given below.

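 import sys, subprocess
 from Bio import AlignIO
 from Bio.Align.Applications import ClustalwCommandline

 # Build the command line; only the input file is set, so clustalw uses its defaults.
 cline = ClustalwCommandline("clustalw", infile="opuntia.fasta")
 # Run it; subprocess.call returns the exit status of clustalw.
 return_code = subprocess.call(str(cline), shell=(sys.platform != "win32"))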

I would also be grateful if you could explain how I can tell whether the input FASTA file contains nucleotide (DNA) or protein sequences.

Thanks for your interest.

modified 12 months ago by RamRS (24k) • written 9.3 years ago by Thaman (3.2k)
9.3 years ago by Giovanni M Dall'Olio (26k), London, UK, wrote:

I didn't test this, but it all gets a lot clearer if you look at the source of Bio/Align/Applications/_Clustalw.py. The parameters you listed map to these options:

  • Gap open penalty: -gapopen
  • Gap extension penalty: -gapext
  • no end gap(yes, no): -endgaps
  • gap distance: -gapdist
  • weight matrix(blosum, pam etc): -matrix ["BLOSUM", "PAM", "GONNET", "ID"]
  • type (DNA,protein): -type

Note: if you use IPython, you can look at the code of a function easily; just type ClustalwCommandline??

In any case, notice that you can access all of these options as attributes once you have created a wrapper for the clustalw command line. For example:

>>> c = ClustalwCommandline(type='dna')
>>> dir(c)
['__class__', '__delattr__', '__dict__', '__doc__', '__getattribute__', '__hash__', 
'__init__', '__module__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', 
'__setattr__', '__str__', '__weakref__', '_check_value', '_clear_parameter', 
'_get_parameter', '_validate', 'align', 'bootlabels', 'bootstrap', 'case', 'check', 
'clustering', 'convert', 'dnamatrix', 'endgaps', 'fullhelp', 'gapdist', 'gapext', 'gapopen', 
'helixendin', 'helixendout', 'helixgap', 'help', 'hgapresidues', 'infile', 'iteration', 
'kimura', 'ktuple', 'loopgap', 'matrix', 'maxdiv', 'maxseqlen', 'negative', 'newtree', 
'newtree1', 'newtree2', 'nohgap', 'nopgap', 'nosecstr1', 'nosecstr2', 'noweights', 'numiter', 
'options', 'outfile', 'outorder', 'output', 'outputtree', 'pairgap', 'parameters', 'profile', 
'profile1', 'profile2', 'program_name', 'pwdnamatrix', 'pwgapext', 'pwgapopen', 'pwmatrix', 
'quicktree', 'quiet', 'range', 'score', 'secstrout', 'seed', 'seqno_range', 'seqnos', 
'sequences', 'set_parameter', 'stats', 'strandendin', 'strandendout', 'strandgap', 
'terminalgap', 'topdiags', 'tossgaps', 'transweight', 'tree', 'type', 'usetree', 'usetree1', 
'usetree2', 'window']
>>> c.gapopen = -2
>>> print(c)
clustalw -type=dna -gapopen=-2
modified 17 days ago by RamRS (24k) • written 9.3 years ago by Giovanni M Dall'Olio (26k)
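
To make the mapping above concrete, here is a minimal sketch that builds the same kind of command line using only the standard library, so it runs without Biopython or clustalw installed. The helper name build_clustalw_args and the parameter values (10, 0.5, 8) are purely illustrative assumptions, not recommended settings; the option names are the ones listed in the answer above.

```python
def build_clustalw_args(infile, executable="clustalw", **options):
    """Return an argument list such as ['clustalw', '-infile=x.fasta', '-gapopen=10'].

    Hypothetical helper for illustration only; it is not part of Biopython.
    """
    args = [executable, "-infile={}".format(infile)]
    for name, value in options.items():
        if value is True:
            # Bare switch with no value, e.g. -endgaps
            args.append("-{}".format(name))
        elif value is not None and value is not False:
            # Option with a value, e.g. -gapopen=10
            args.append("-{}={}".format(name, value))
    return args


# Illustrative values only; choose penalties that suit your own data.
args = build_clustalw_args(
    "opuntia.fasta",
    type="dna",
    gapopen=10,
    gapext=0.5,
    gapdist=8,
    endgaps=True,
)
print(" ".join(args))
```

The resulting list can be handed to subprocess.call(args) directly, which avoids the shell quoting issues of passing a single string. With the Biopython wrapper, the same option names are given as keyword arguments to ClustalwCommandline or set as attributes afterwards, as in the c.gapopen example above.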

Also try typing help(c) at the Python prompt to find out more about the command line wrapper object you've just created.

written 9.3 years ago by Peter (5.8k)

OK, let me try this and see whether I have understood it. If not, I will ask you again. Thanks.

written 9.3 years ago by Thaman (3.2k)

You are welcome. I don't want to be silly, but please consider voting up the answers that you find useful, even if you don't want to accept them :-)

written 9.3 years ago by Giovanni M Dall'Olio (26k)

Can you also answer my other query, about the type of the input FASTA file?

written 9.3 years ago by Thaman (3.2k)

I think you should ask that as a separate question. In principle, neither clustalw nor Bio.Align.Applications from Biopython has a tool to determine whether a sequence is protein or DNA.

written 9.3 years ago by Giovanni M Dall'Olio (26k)
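
For the sequence-type question, one rough approach is a composition heuristic: if nearly all letters in a sequence are A, C, G, T, U or N, treat it as nucleotide, otherwise as protein. The sketch below is only that, a heuristic under stated assumptions. The function names and the 0.9 threshold are hypothetical choices for illustration, and a short peptide made mostly of A, C, G, T and N would be misclassified.

```python
def read_fasta_sequences(lines):
    """Parse FASTA-formatted lines into a dict of {record name: sequence}."""
    sequences = {}
    name = None
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            name = line[1:].strip()
            sequences[name] = ""
        elif line and name is not None:
            sequences[name] += line
    return sequences


def guess_sequence_type(sequence, threshold=0.9):
    """Guess 'dna' or 'protein' from letter composition.

    threshold is an assumed cut-off, not a value taken from clustalw or Biopython.
    """
    letters = [ch for ch in sequence.upper() if ch.isalpha()]
    if not letters:
        raise ValueError("sequence contains no letters")
    nucleotide_like = sum(ch in "ACGTUN" for ch in letters)
    return "dna" if nucleotide_like / len(letters) >= threshold else "protein"


# Small in-memory example; with a real file, pass open("opuntia.fasta") instead.
example = [">seq1", "ACGTACGTTAGC", "NNACGT", ">seq2", "MKTAYIAKQRQISFVKSHFSRQ"]
types = {name: guess_sequence_type(seq)
         for name, seq in read_fasta_sequences(example).items()}
print(types)
```

The guessed value could then be passed on as the -type option when building the clustalw command line.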

OK, I will do that.

written 9.3 years ago by Thaman (3.2k)