Question: Pairwise local and global alignment
 

Hi,

I would like to build a Python script that takes a FASTA file, runs either pairwise local or global alignment on all the sequences in it, and finishes by printing the scores in a table.

I read about Bio.pairwise2 but couldn't find details on how to read in a FASTA file; moreover, pairwise2 seems to be a slow alignment module. Is there any alternative besides pairwise2? If not, what are my options?

3 answers

 

As Neil notes, the EMBOSS package provides the functionality you are after, and BioPython has decent wrappers for constructing command line calls:

From the BioPython docs:

>>> from Bio.Emboss.Applications import WaterCommandline
>>> cline = WaterCommandline(gapopen=10, gapextend=0.5)
>>> cline.asequence = "asis:ACCCGGGCGCGGT"
>>> cline.bsequence = "asis:ACCCGAGCGCGGT"
>>> cline.outfile = "temp_water.txt"
>>> print(cline)
water -outfile=temp_water.txt -asequence=asis:ACCCGGGCGCGGT -bsequence=asis:ACCCGAGCGCGGT -gapopen=10 -gapextend=0.5
>>> cline
WaterCommandline(cmd='water', outfile='temp_water.txt', asequence='asis:ACCCGGGCGCGGT', bsequence='asis:ACCCGAGCGCGGT', gapopen=10, gapextend=0.5)

You would typically run the command line via a standard Python operating system call (e.g. using the subprocess module).
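As a rough sketch of that last step, the command shown above can be assembled as an argument list and handed to subprocess.run. The helper names build_water_command and run_water are made up for illustration, and the sketch assumes the EMBOSS water program is installed and on the PATH; only the options that appear in the example above are used.

```python
import subprocess


def build_water_command(asequence, bsequence, outfile, gapopen=10, gapextend=0.5):
    """Build the argument list for EMBOSS water (hypothetical helper).

    Uses the same -option=value form printed by WaterCommandline above.
    """
    return [
        "water",
        f"-outfile={outfile}",
        f"-asequence={asequence}",
        f"-bsequence={bsequence}",
        f"-gapopen={gapopen}",
        f"-gapextend={gapextend}",
    ]


def run_water(asequence, bsequence, outfile, **penalties):
    """Run water and return the output file name.

    Assumes the `water` executable is installed; check=True raises
    CalledProcessError if the program exits with a non-zero status.
    """
    cmd = build_water_command(asequence, bsequence, outfile, **penalties)
    subprocess.run(cmd, check=True)
    return outfile


# Show the command that would be executed, without running it.
print(" ".join(build_water_command(
    "asis:ACCCGGGCGCGGT", "asis:ACCCGAGCGCGGT", "temp_water.txt")))
```

Passing a list rather than a single shell string avoids quoting problems when file names contain spaces.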

 
 

The Wikipedia entry on the Needleman-Wunsch algorithm contains pseudocode that should be easy to implement.
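For what it's worth, here is a minimal pure-Python sketch along those lines. It computes only the global alignment score (no traceback), with a linear gap penalty. The scoring values (match=1, mismatch=-1, gap=-1) and the helper names parse_fasta, nw_score and score_table are assumptions chosen for illustration, not anything prescribed by the thread. Being plain Python it will be slow on long sequences, so treat it as a starting point.

```python
from itertools import combinations


def parse_fasta(lines):
    """Parse FASTA text from any iterable of lines into (name, sequence) tuples."""
    records, name, chunks = [], None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:].strip(), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def nw_score(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment score with a linear gap penalty."""
    # prev holds the previous row of the dynamic-programming matrix.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def score_table(records, **scoring):
    """Score every unordered pair of records; keys are (name1, name2)."""
    return {
        (n1, n2): nw_score(s1, s2, **scoring)
        for (n1, s1), (n2, s2) in combinations(records, 2)
    }


# Example with inline FASTA text; for a real file, pass an open file
# handle to parse_fasta instead of a list of lines.
example = [">seq1", "ACGT", ">seq2", "AGT", ">seq3", "ACGA"]
for (n1, n2), score in score_table(parse_fasta(example)).items():
    print(f"{n1}\t{n2}\t{score}")
```

Only two rows of the matrix are kept in memory at a time, which is enough when just the score is needed.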

 
 

The EMBOSS package implements global alignment (stretcher, needle) and local alignment (matcher, supermatcher, water). I believe that Biopython has a Bio.Emboss module to interact with EMBOSS. Otherwise, it should not be too difficult to call the programs from Python and parse the output.
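To give an idea of the parsing side, here is a small sketch that pulls scores out of the text written by needle or water. It assumes the default pairwise report, where each alignment is preceded by a comment line of the form "# Score: <number>"; check this against the output of your own EMBOSS version. The function name extract_scores and the sample text are invented for illustration.

```python
import re

# Assumption: each alignment header contains a line such as "# Score: 56.0".
SCORE_LINE = re.compile(r"^#\s*Score:\s*([-+]?\d+(?:\.\d+)?)", re.MULTILINE)


def extract_scores(report_text):
    """Return every alignment score found in the report text, as floats."""
    return [float(value) for value in SCORE_LINE.findall(report_text)]


# Hypothetical sample shaped like the header of a pairwise report.
sample = """\
# 1: seq1
# 2: seq2
# Gaps:           0/13 ( 0.0%)
# Score: 56.0
"""
print(extract_scores(sample))
```

When the second input file holds several sequences, the report contains one such block per alignment, so the function returns a list rather than a single value.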

 