Bio.Align using Smith-Waterman local alignment causes memory leak
3.3 years ago

Hi!

I have a list of permutations of DNA sequences, and I compute an alignment score for each pair of sequences. I don't know why this process causes a memory leak when the permutation list is big. Here is an example of the score calculation:

for sequence1, sequence2 in sequence_permutation:
    score = self.__calculate_sequence_similarity(sequence1, sequence2)
    alignments[sequence1].append(sequence2)

save_aligments(alignments)

def __calculate_score_alignment(self, sequence1, sequence2): 
    from Bio.Align import substitution_matrices
    from Bio import Align
    from Bio.SubsMat import MatrixInfo

    aligner = Align.PairwiseAligner()
    aligner.mode = 'local'
    aligner.substitution_matrix = substitution_matrices.load('BLOSUM62')
    return aligner.score(sequence1, sequence2)


def __calculate_sequence_similarity(self, sequence1: str, sequence2: str) -> float:         
    if not sequence1 and not sequence2:
        return -1

    score = self.__calculate_score_alignment(sequence1, sequence2)
    score1 = self.__calculate_score_alignment(sequence1, sequence1)
    score2 = self.__calculate_score_alignment(sequence2, sequence2)

    return score / (math.sqrt(score1) * math.sqrt(score2))
python dna sequence alignment smith waterman • 926 views

A memory leak is a software bug. If it originates in a library you're using rather than in your own code, you should report it to the library's authors. First make sure, though, that it really is a memory leak and not simply high memory usage caused by a large data set. Also note that many scripting languages, Python included, may not return all freed memory to the operating system until the script has exited. If your script builds a data structure that uses half the available RAM, most of that memory can stay associated with the script's process even after the data structure has been destroyed.
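
One way to tell a real leak from memory that is merely held by the process is to check whether the traced heap keeps growing when the same work is repeated. The following is a minimal sketch using the standard-library `tracemalloc` module; `measure_growth`, `leaky` and `clean` are hypothetical names made up for illustration, not part of the code in the question. Note that `tracemalloc` only sees allocations made through Python's memory allocators, so a leak inside a C extension that calls `malloc` directly would not show up here and would need to be checked by watching the process's resident memory instead.

```python
import gc
import tracemalloc


def measure_growth(work, iterations=5):
    """Return the traced Python heap size (bytes) after each call to work()."""
    tracemalloc.start()
    sizes = []
    try:
        for _ in range(iterations):
            work()
            gc.collect()  # drop garbage that is only waiting for the collector
            current, _peak = tracemalloc.get_traced_memory()
            sizes.append(current)
    finally:
        tracemalloc.stop()
    return sizes


# Hypothetical workloads, for illustration only.
retained = []


def leaky():
    retained.append(bytearray(100_000))  # keeps a reference alive on every call


def clean():
    bytearray(100_000)  # freed as soon as the call returns


leaky_sizes = measure_growth(leaky)
clean_sizes = measure_growth(clean)
print("leaky:", leaky_sizes)
print("clean:", clean_sizes)
```

In practice `work` would be one pass over a fixed batch of sequence pairs. Sizes that climb steadily across identical passes point to retained objects; sizes that level off suggest ordinary memory usage.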


The program's memory usage increases with each iteration, so it is not due to the dataset size. It may be some object that has been destroyed, as you wrote before. The Aligner object is created in each iteration, so I can't see an error in that code.
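
Since the aligner is created on every call, one experiment is to build it once and reuse it, which at least shows whether the growth is tied to constructing aligners. The sketch below is only an assumption about how that could look: `make_biopython_scorer` and `SimilarityCalculator` are hypothetical names, the empty-sequence check uses `or` rather than the `and` in the question, and nothing here is confirmed to remove the memory growth. The Biopython calls are the same ones used in the question.

```python
import math
from functools import lru_cache


def make_biopython_scorer():
    """Build one local aligner and return its score method (needs Biopython)."""
    from Bio import Align
    from Bio.Align import substitution_matrices

    aligner = Align.PairwiseAligner()
    aligner.mode = "local"
    aligner.substitution_matrix = substitution_matrices.load("BLOSUM62")
    return aligner.score


class SimilarityCalculator:
    """Hypothetical helper: holds a single scorer and caches self-scores."""

    def __init__(self, score_fn):
        self._score = score_fn
        # Each sequence is scored against itself once, however many pairs it is in.
        self._self_score = lru_cache(maxsize=None)(lambda seq: score_fn(seq, seq))

    def similarity(self, sequence1, sequence2):
        # Assumption: treat a pair with any empty sequence as "no score".
        if not sequence1 or not sequence2:
            return -1.0
        score = self._score(sequence1, sequence2)
        norm = math.sqrt(self._self_score(sequence1)) * math.sqrt(
            self._self_score(sequence2)
        )
        return score / norm


# Usage with Biopython installed (not executed here):
#     calc = SimilarityCalculator(make_biopython_scorer())
#     value = calc.similarity(sequence1, sequence2)
```

The self-score cache is unbounded, so it grows with the number of distinct sequences; passing a finite `maxsize` to `lru_cache` would cap it if that matters.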


This is something to report as an issue on the Biopython GitHub repository if you are confident it's a real problem with the library.
