Biopython SeqIO.parse not functioning for all entries in a seq
2.9 years ago
mvhanson • 0

I'm working with some GenBank sequence files and have the following code:

for seq_record in SeqIO.parse("datafile_location, "genbank"):

It runs through most of the records in the file (which contains multiple sequences), but then I get the following error. Any thoughts on how to fix this?

Maybe delete the offending seq? It gets to record 92126 of 93145 and then throws the error.

File "C:\python38\lib\site-packages\Bio\GenBank\Scanner.py", line 516, in parse_records
    record = self.parse(handle, do_features)
File "C:\python38\lib\site-packages\Bio\GenBank\Scanner.py", line 499, in parse
    if self.feed(handle, consumer, do_features):
File "C:\python38\lib\site-packages\Bio\GenBank\Scanner.py", line 1801, in feed_header_lines
    previous_value_line = structured_comment_dict[
KeyError: 'Assembly-Data'

SeqIO biopython SeqIO.parse genbank files • 1.1k views

Try installing Biopython from the GitHub source. The issue seems to be fixed there; see the discussion here.
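If upgrading is not an option, the "delete the offending seq" idea from the question can be approximated by parsing one record at a time and skipping any that fail. Wrapping the `SeqIO.parse` loop in try/except will not help on its own, because once the parser's generator raises it cannot be resumed. Below is a rough sketch, not a tested fix for this particular file: the helper names `iter_genbank_chunks` and `parse_tolerantly` are made up for illustration, and it assumes every record ends with the usual `//` terminator line.

```python
import io


def iter_genbank_chunks(handle):
    """Yield the raw text of each record, splitting on the '//' terminator line."""
    lines = []
    for line in handle:
        lines.append(line)
        if line.rstrip() == "//":
            yield "".join(lines)
            lines = []
    # Trailing text without a terminator (e.g. a truncated final record).
    leftover = "".join(lines)
    if leftover.strip():
        yield leftover


def parse_tolerantly(handle, parse_one):
    """Parse each record separately; return (records, failures).

    parse_one is any callable that takes a file-like object holding
    exactly one record. failures holds (record_number, exception) pairs.
    """
    records, failures = [], []
    for number, chunk in enumerate(iter_genbank_chunks(handle), start=1):
        try:
            records.append(parse_one(io.StringIO(chunk)))
        except Exception as exc:  # e.g. the KeyError shown in the traceback
            failures.append((number, exc))
    return records, failures


# Example with Biopython (assumes it is installed; the path is a placeholder):
# from Bio import SeqIO
# with open("datafile_location") as handle:
#     records, failures = parse_tolerantly(
#         handle, lambda h: SeqIO.read(h, "genbank")
#     )
# print(len(records), "parsed;", len(failures), "skipped")
```

The `failures` list tells you which record numbers were skipped, so you can inspect those entries by hand instead of silently losing them. Note that this holds all parsed records in memory; for a file with tens of thousands of records you may prefer to process each record inside the loop instead.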


Can you post your code here? There is a " missing in the for seq_record in SeqIO.parse("datafile_location, "genbank"): loop but I guess that's a typo in the post as opposed to the code?