Question: Trying To Get Around MemoryError: Out Of Memory Exception In Biopython Program
2.7 years ago, hpodschwit wrote:

Hello all,

First, please forgive my minimal programming skills. I am trying to write a program that takes two paired, very large multi-sequence FASTA files (each has >77,000 sequences) and computes alignment scores for the sequences. My current implementation uses too much memory. To get around this, I split the multi-sequence FASTA files into a bunch of smaller files to feed in one by one. Still, after about the 38th score I get a "MemoryError: Out of memory" exception. I will attach my code, but I was wondering if anyone had suggestions for fixing this issue or simply for getting the alignment scores for the list. Please help. I hope I am being clear, and again, forgive any verbose/bad code.

I tried to copy and paste my code but it ended up being really messy so I have simply attached a link.

Anyways any help is greatly appreciated and I hope you are all having a great Tuesday and PLEASE PLEASE PLEASE HELP IF YOU ARE ABLE TO!!


Perhaps you can use pseudocode to describe what you are doing. Are you doing pairwise alignments, for example?

written 2.7 years ago by Sean Davis

I can't see your link, so it is hard to guess what you are doing, unless you are trying to load 77,000 sequences into RAM at the same time, which would be most unwise. Have you considered existing tools for this kind of thing, e.g. EMBOSS needleall (http://emboss.open-bio.org/wiki/Appdoc:Needleall)? You can parse the EMBOSS alignment output with Biopython.
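To illustrate the point about not holding every sequence in RAM at once, here is a minimal sketch. It rests on assumptions, since the original code is not visible: that the i-th record of one file is paired with the i-th record of the other, and that a global (Needleman-Wunsch) score with a linear gap penalty is what is wanted. The function names and the scoring values (match=1, mismatch=-1, gap=-2) are hypothetical. It uses only the standard library; Biopython's SeqIO.parse also returns an iterator, so it could take the place of read_fasta.

```python
from itertools import zip_longest


def read_fasta(path):
    """Yield (header, sequence) one record at a time, never the whole file."""
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            else:
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def global_score(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch score (linear gap penalty), keeping only two rows."""
    # Scoring values are illustrative assumptions, not taken from the post.
    prev = [j * gap for j in range(len(b) + 1)]
    for i, ca in enumerate(a, start=1):
        curr = [i * gap]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def paired_scores(path_a, path_b):
    """Yield (header_a, header_b, score) for the i-th record of each file."""
    for rec_a, rec_b in zip_longest(read_fasta(path_a), read_fasta(path_b)):
        if rec_a is None or rec_b is None:
            raise ValueError("the two FASTA files have different record counts")
        yield rec_a[0], rec_b[0], global_score(rec_a[1], rec_b[1])
```

Because paired_scores is a generator, each score can be written to an output file as soon as it is produced, so only one pair of sequences and two score rows are in memory at any time. The scoring loop is quadratic in sequence length and slow in pure Python, so for long sequences a compiled aligner such as needleall is likely the better choice.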

written 2.7 years ago by Peter