Biopython SeqIO.parse not functioning for all entries in a seq
3.9 years ago
mvhanson • 0

I'm working with some GenBank sequence files and have the following code:

for seq_record in SeqIO.parse("datafile_location, "genbank"):

It runs through most of the records in the file (which contains multiple sequences), but then I get the following error. Any thoughts on how to fix this?

Maybe delete the offending record? It gets to record 92126 of 93145 and then throws the error.

I have tried re-downloading the file, but that doesn't fix the problem.

File "C:\python38\lib\site-packages\Bio\GenBank\Scanner.py", line 516, in parse_records
    record = self.parse(handle, do_features)
File "C:\python38\lib\site-packages\Bio\GenBank\Scanner.py", line 499, in parse
    if self.feed(handle, consumer, do_features):
File "C:\python38\lib\site-packages\Bio\GenBank\Scanner.py", line 466, in feed
    self._feed_header_lines(consumer, self.parse_header())
File "C:\python38\lib\site-packages\Bio\GenBank\Scanner.py", line 1801, in _feed_header_lines
    previous_value_line = structured_comment_dict[
KeyError: 'Assembly-Data'

SeqIO biopython SeqIO.parse genbank files • 1.4k views

Try installing Biopython from the GitHub source. This issue appears to be fixed there; see the discussion here.
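
If upgrading is not an option, the "delete the offending record" idea from the question can be approximated without editing the file. Once SeqIO.parse raises, its iterator cannot be resumed, so a plain try/except around the loop body will not skip the bad record. The sketch below instead splits the file on the "//" line that terminates each GenBank record and parses every chunk on its own with SeqIO.read. The helper names iter_genbank_chunks and parse_skipping_bad_records are made up for this illustration, and catching KeyError and ValueError is an assumption based on the traceback above, not a list of everything the parser may raise.

```python
import io


def iter_genbank_chunks(handle):
    """Yield the raw text of each record, split on the '//' terminator line."""
    lines = []
    for line in handle:
        lines.append(line)
        if line.rstrip() == "//":
            yield "".join(lines)
            lines = []
    # Ignore trailing blank lines; yield anything else so it is not silently lost.
    if "".join(lines).strip():
        yield "".join(lines)


def parse_skipping_bad_records(path):
    """Yield (index, SeqRecord) for records that parse; report the ones that fail."""
    # Imported here so the splitter above can be used without Biopython installed.
    from Bio import SeqIO

    with open(path) as handle:
        for index, chunk in enumerate(iter_genbank_chunks(handle), start=1):
            try:
                # SeqIO.read expects exactly one record in the handle.
                record = SeqIO.read(io.StringIO(chunk), "genbank")
            except (KeyError, ValueError) as err:
                print(f"Skipping record {index}: {err!r}")
                continue
            yield index, record
```

It would be used in place of the original loop, for example: for index, seq_record in parse_skipping_bad_records("datafile_location"). Parsing record by record is slower than a single SeqIO.parse pass, but it reports the number of each record that fails, which makes it easier to inspect the problematic entries.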


Can you post your full code here? There is a " missing in the for seq_record in SeqIO.parse("datafile_location, "genbank"): loop, but I guess that's a typo in the post rather than in the code?
