Trying to Get Around MemoryError: Out of Memory Exception in Biopython Program
9.4 years ago
hpodschwit • 0

Hello all,

First, many of you will have to forgive my minimal programming skills, but I am trying to write a program that will take two paired, very large multi-sequence FASTA files (each has >77,000 sequences) and get alignment scores for the sequences. My current implementation uses too much memory. To get around this, I split the multi-sequence FASTA files into a bunch of smaller files to feed in one by one. Still, after getting to around the 38th score, I get a MemoryError: Out of memory exception. I will attach my code, but I was wondering if anyone had any suggestions for fixing this issue or simply for getting the alignment scores for the list. Please help. I hope I am being clear, and again, forgive any verbose/bad code.

I tried to copy and paste my code but it ended up being really messy so I have simply attached a link.

Anyways any help is greatly appreciated and I hope you are all having a great Tuesday and PLEASE PLEASE PLEASE HELP IF YOU ARE ABLE TO!!

biopython alignment scoring • 3.1k views

Perhaps you can use pseudocode to describe what you are doing. Are you doing pairwise alignments, for example?


I can't see your link, so it is hard to guess what you are doing, unless you are trying to load all 77,000 sequences into RAM at the same time, which would be most unwise. Have you considered existing tools for this kind of thing, e.g. EMBOSS needleall (http://emboss.open-bio.org/wiki/Appdoc:Needleall)? You can parse the EMBOSS alignment output with Biopython.
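
To illustrate the point about not holding everything in RAM, here is a minimal sketch of the streaming approach. It is not the original poster's code (the link is not visible), and it assumes the two files are paired by record order. It uses only the standard library so it runs as-is; the names read_fasta, global_score and paired_scores are hypothetical helpers, and the match/mismatch/gap values are arbitrary placeholders. In Biopython, SeqIO.parse also returns an iterator, so the same loop shape should apply there.

```python
import io


def read_fasta(handle):
    """Yield (header, sequence) one record at a time from an open FASTA handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def global_score(a, b, match=1, mismatch=-1, gap=-1):
    """Global (Needleman-Wunsch) score with a linear gap penalty.

    Only two rows of the DP table are kept, so memory grows with the
    shorter sequence rather than with len(a) * len(b).
    Scoring values are illustrative assumptions.
    """
    if len(b) > len(a):
        a, b = b, a  # put the shorter sequence along the row
    prev = [j * gap for j in range(len(b) + 1)]
    for i, ca in enumerate(a, start=1):
        curr = [i * gap]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def paired_scores(handle_a, handle_b):
    """Yield (id_a, id_b, score) for records paired by position.

    Only one record from each file is in memory at any time.
    """
    for (id_a, seq_a), (id_b, seq_b) in zip(read_fasta(handle_a), read_fasta(handle_b)):
        yield id_a, id_b, global_score(seq_a, seq_b)


# Tiny in-memory stand-ins for the two large FASTA files.
fasta_a = io.StringIO(">a1\nGATTACA\n>a2\nACGT\n")
fasta_b = io.StringIO(">b1\nGATTACA\n>b2\nACCT\n")
results = list(paired_scores(fasta_a, fasta_b))
for row in results:
    print(*row, sep="\t")
```

With real files you would pass open file handles instead of the StringIO objects and write each score out as it is produced, rather than collecting the results in a list. A pure-Python scorer like this will be slow for 77,000 long sequences, so a compiled tool such as needleall is still likely the more practical route.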
