Question: Trying To Get Around MemoryError: Out Of Memory Exception In Biopython Program
5.4 years ago by
hpodschwit0 wrote:

Hello all,

First, please forgive my minimal programming skills. I am trying to write a program that takes two paired, very large multi-sequence FASTA files (each has >77,000 sequences) and gets alignment scores for the sequences. My current implementation uses too much memory. To get around this, I split the multi-sequence FASTA files into a bunch of smaller files to feed in one by one. Still, after getting to around the 38th score, I get a "MemoryError: Out of memory" exception. I will attach my code, but I was wondering whether anyone has suggestions for fixing this issue, or simply for getting the alignment scores for the list. Please help. I hope I am being clear, and again, forgive any verbose or bad code.

I tried to copy and paste my code but it ended up being really messy so I have simply attached a link.

Anyway, any help is greatly appreciated, and I hope you are all having a great Tuesday. PLEASE PLEASE PLEASE HELP IF YOU ARE ABLE TO!!

alignment scoring biopython • 1.6k views

Perhaps you can use pseudocode to describe what you are doing. Are you doing pairwise alignments, for example?

written 5.4 years ago by Sean Davis 23k

I can't see your link, so it is hard to guess what you are doing, unless you are trying to load 77,000 sequences into RAM at the same time, which would be most unwise. Have you considered existing tools for this kind of thing, e.g. EMBOSS needleall (you can parse the EMBOSS alignment output with Biopython)?

written 5.4 years ago by Peter 5.6k
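
One way to avoid holding all 77,000 sequences in RAM at once is to stream both files in lockstep and score one pair at a time. The original code was never posted, so what follows is only a minimal sketch under stated assumptions: `read_fasta`, `identity_score`, and `score_pairs` are hypothetical names, the i-th record of the first file is assumed to pair with the i-th record of the second, and `identity_score` is a stand-in that counts matching positions rather than performing a real alignment. It uses only the standard library; Biopython's `SeqIO.parse` is also an iterator and could take the place of `read_fasta`.

```python
import io
from itertools import zip_longest


def read_fasta(handle):
    """Yield (header, sequence) one record at a time from an open FASTA handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def identity_score(a, b):
    """Placeholder score (assumption): count of matching positions, not an alignment."""
    return sum(x == y for x, y in zip(a, b))


def score_pairs(handle_a, handle_b, score=identity_score):
    """Lazily yield (id_a, id_b, score) for the i-th record of each file."""
    missing = object()
    pairs = zip_longest(read_fasta(handle_a), read_fasta(handle_b), fillvalue=missing)
    for rec_a, rec_b in pairs:
        if rec_a is missing or rec_b is missing:
            raise ValueError("the two FASTA files have different numbers of records")
        yield rec_a[0], rec_b[0], score(rec_a[1], rec_b[1])


# Tiny in-memory demo; real use would pass open file handles instead.
fa = io.StringIO(">a1\nACGT\nAC\n>a2\nGGGG\n")
fb = io.StringIO(">b1\nACGTTC\n>b2\nGGCC\n")
results = list(score_pairs(fa, fb))
```

The demo collects the results in a list only because the input is tiny. With large files, writing each score to an output file inside the loop keeps memory use bounded by the size of a single pair of sequences, and a real pairwise aligner can be passed in through the `score` argument.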