Pairwise alignment of multi-FASTA file sequences
2.2 years ago
aurora • 0

I have a multi-FASTA file containing more than 10,000 sequences. I want to align every sequence against every other sequence in the file (all-vs-all pairwise alignment) and store all the results in a single new file, so that I can run a clustering analysis afterwards. My Python code for aligning two sequences is below. How can I modify it to loop over the whole multi-FASTA file and store the results as needed?

from Bio import pairwise2
from Bio.pairwise2 import format_alignment

X = "ACGGGT"
Y = "ACG"

# Scoring: match = 2, mismatch = -1, gap opening = -5, gap extension = -2
alignments = pairwise2.align.globalms(X, Y, 2, -1, -5, -2)

for a in alignments:
    print(format_alignment(*a))
alignment next-gen sequence fasta pairwise • 1.2k views

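Have you had a look at itertools? https://docs.python.org/3.6/library/itertools.html I imagine itertools.combinations will help get you on the right track, since it yields every unordered pair exactly once. Be aware that this could be quite slow, though: 10,000 sequences means roughly 50 million pairwise alignments.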
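
To make the itertools suggestion concrete, here is a minimal sketch of the loop. It is only one possible approach, not a tuned solution. The helper names (read_fasta, global_score, write_pairwise_scores) and the tab-separated output layout are made up for illustration, and the small FASTA reader is a stand-in that assumes plain ">id description" headers. The scoring call reuses pairwise2.align.globalms from the question with score_only=True, so only the score is kept rather than the full alignments; this assumes Biopython is installed and that a score per pair is enough for the clustering step.

```python
import csv
import itertools


def read_fasta(path):
    """Minimal multi-FASTA reader: yields (id, sequence) tuples."""
    seq_id, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                if seq_id is not None:
                    yield seq_id, "".join(chunks)
                # Use the first whitespace-delimited token of the header as the ID.
                header = line[1:].split()
                seq_id, chunks = (header[0] if header else ""), []
            elif seq_id is not None:
                chunks.append(line)
    if seq_id is not None:
        yield seq_id, "".join(chunks)


def global_score(x, y):
    """Global alignment score with the scoring scheme from the question."""
    # Imported lazily so the rest of the sketch runs without Biopython.
    from Bio import pairwise2
    # match = 2, mismatch = -1, gap opening = -5, gap extension = -2
    return pairwise2.align.globalms(x, y, 2, -1, -5, -2, score_only=True)


def write_pairwise_scores(fasta_path, out_path, score_fn=global_score):
    """Score every unordered pair of sequences and write one row per pair."""
    records = list(read_fasta(fasta_path))
    n_pairs = 0
    with open(out_path, "w", newline="") as out:
        writer = csv.writer(out, delimiter="\t", lineterminator="\n")
        writer.writerow(["id_a", "id_b", "score"])
        # combinations(records, 2) yields each pair once: (a, b) but never (b, a).
        for (id_a, seq_a), (id_b, seq_b) in itertools.combinations(records, 2):
            writer.writerow([id_a, id_b, score_fn(seq_a, seq_b)])
            n_pairs += 1
    return n_pairs
```

Calling write_pairwise_scores("sequences.fasta", "scores.tsv") (both file names are placeholders) would produce a three-column table that can be turned into a distance matrix later. The score_fn argument is there so a different scorer can be swapped in for testing or speed. For a file of this size, a dedicated all-vs-all or clustering tool may well be more practical than a pure-Python loop.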
