Markus Piotrowski gave a nice explanation and solution:
The aim of a pairwise alignment algorithm is to achieve an alignment with the best score (or the lowest cost). From the point of view of the algorithm (here Needleman-Wunsch-Gotoh), the situation "one gap in seqA followed by one gap in seqB" is not forbidden. The fact that the older implementation of this algorithm in Biopython's pairwise2 did not allow this was a shortcoming and was criticized, e.g., in this paper. That was the reason for rewriting pairwise2 for Biopython 1.68 (in 2016); a side effect, unrelated to the original issue, was better performance in both speed and memory usage. As you see, the alignment with the "spurious insertion" in your example has the same score as the other one, and in fact the other alignment is also produced (it appears under a different index in the returned list).
This behavior is documented in the module's docstring, which can be read in our API documentation. In detail:
Depending on the penalties, a gap in one sequence may be followed by a gap in the other sequence.
If you don’t like this behaviour, increase the gap-open penalty:
>>> for a in pairwise2.align.globalms("A", "T", 5, -4, -1, -.1):
...     print(pairwise2.format_alignment(*a))
>>> for a in pairwise2.align.globalms("A", "T", 5, -4, -3, -.1):
...     print(pairwise2.format_alignment(*a))
This also explains how you can prevent this: by increasing the gap penalty (two gaps should be more expensive than one mismatch). In fact, that is what most "standard settings" in alignment programs do.
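To make the arithmetic behind this concrete, here is a minimal pure-Python sketch that scores two already-aligned rows. It is not Biopython code: the helper name score_alignment is hypothetical, the gapped rows are written by hand for illustration, and it assumes a gap of length n costs open + (n - 1) * extend.

```python
def score_alignment(row_a, row_b, match, mismatch, gap_open, gap_extend):
    """Score two equal-length gapped rows (hypothetical helper, not Biopython).

    Assumption: a gap of length n costs gap_open + (n - 1) * gap_extend,
    and a gap in one row directly followed by a gap in the other row
    counts as two separately opened gaps.
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    total = 0
    prev_gap = None  # row that held a gap in the previous column: "a", "b" or None
    for x, y in zip(row_a, row_b):
        if x == "-" and y == "-":
            raise ValueError("a column cannot contain two gaps")
        if x == "-":
            total += gap_extend if prev_gap == "a" else gap_open
            prev_gap = "a"
        elif y == "-":
            total += gap_extend if prev_gap == "b" else gap_open
            prev_gap = "b"
        else:
            total += match if x == y else mismatch
            prev_gap = None
    return total


# "A" vs "T" with the docstring's parameters (match=5, mismatch=-4, extend=-0.1):
# compare "two gaps" against "one mismatch" for both gap-open penalties.
for gap_open in (-1, -3):
    gapped = score_alignment("A-", "-T", 5, -4, gap_open, -0.1)
    mismatched = score_alignment("A", "T", 5, -4, gap_open, -0.1)
    print(gap_open, gapped, mismatched)

# globalxx-like scoring (match=1, everything else 0) on the example sequences:
# the gapped and the ungapped alignment tie.
tie_gapped = score_alignment("EPQY-EEIPIYL", "EPQ-?EEIPIYL", 1, 0, 0, 0)
tie_plain = score_alignment("EPQYEEIPIYL", "EPQ?EEIPIYL", 1, 0, 0, 0)
print(tie_gapped, tie_plain)
```

With a gap-open penalty of -1, the two gaps together are cheaper than the single mismatch, so the gapped alignment wins. With -3 they are more expensive and the mismatch wins. Under globalxx-like scoring neither gaps nor mismatches cost anything, so both alignments reach the same score.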
Your example shows an amino acid alignment and I doubt that globalxx comes close to meaningful alignments in such cases (with 'real' proteins). You might consider using a substitution matrix:
>>> from Bio import pairwise2 as pw
>>> from Bio.Align import substitution_matrices
Warning (from warnings module): ...
>>> blosum62 = substitution_matrices.load("BLOSUM62")
>>> aln = pw.align.globalds("EPQYEEIPIYL", "EPQ*EEIPIYL", blosum62, -10, -0.5) # replacing your ? with a * which is in the matrix
Alternatively, if you want to keep the simple matching scheme, use globalms to set different gap parameters:
>>> aln = pw.align.globalms("EPQYEEIPIYL", "EPQ?EEIPIYL", 1, -1, -2, -.5)
# identical to: aln = pw.align.globalms("EPQYEEIPIYL", "EPQ?EEIPIYL", match=1, mismatch=-1, open=-2, extend=-.5)
Finally, if you are doing many pairwise alignments, you should consider using our new PairwiseAligner which has a much better performance and other advantages (the latter because it is implemented as a class).
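As a rough sketch of what that could look like, the following reuses the BLOSUM62 matrix and the gap penalties from the globalds call above with Bio.Align.PairwiseAligner. The variable names are illustrative, and a reasonably recent Biopython release (one that ships Bio.Align.substitution_matrices) is assumed.

```python
from Bio.Align import PairwiseAligner, substitution_matrices

# Configure the aligner once and reuse it for many sequence pairs.
aligner = PairwiseAligner()
aligner.mode = "global"
aligner.substitution_matrix = substitution_matrices.load("BLOSUM62")
aligner.open_gap_score = -10     # same gap-open penalty as in the globalds call
aligner.extend_gap_score = -0.5  # same gap-extend penalty as in the globalds call

seq_a = "EPQYEEIPIYL"
seq_b = "EPQ*EEIPIYL"  # "*" instead of "?", because "*" is in the matrix

alignments = aligner.align(seq_a, seq_b)
best = alignments[0]
print(best.score)
print(best)
```

If only the score is needed, aligner.score(seq_a, seq_b) returns it without constructing the alignments.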