Question: Problem with SeqIO.parse (SOLVED) - link to cross-post
15 months ago by
France/Angers/IRHS
felipelira30 wrote:

Has anybody had this problem before? Any suggestions about the cause?

The script creates the files containing the genome sequences, but the error below appears at the end of the process.

Line in my script that triggers it:

File "/home/flira/scripts/list_ncbi_download_genome_vs_02.py", line 97, in <module>
    SeqIO.write(SeqIO.parse(genbank_file, "genbank"), genome_file, "fasta")

Traceback that appears:

  File "/usr/lib/python2.7/dist-packages/Bio/SeqIO/__init__.py", line 481, in write
    count = writer_class(fp).write_file(sequences)
  File "/usr/lib/python2.7/dist-packages/Bio/SeqIO/Interfaces.py", line 209, in write_file
    count = self.write_records(records)
  File "/usr/lib/python2.7/dist-packages/Bio/SeqIO/Interfaces.py", line 193, in write_records
    for record in records:
  File "/usr/lib/python2.7/dist-packages/Bio/SeqIO/__init__.py", line 600, in parse
    for r in i:
  File "/usr/lib/python2.7/dist-packages/Bio/GenBank/Scanner.py", line 478, in parse_records
    record = self.parse(handle, do_features)
  File "/usr/lib/python2.7/dist-packages/Bio/GenBank/Scanner.py", line 462, in parse
    if self.feed(handle, consumer, do_features):
  File "/usr/lib/python2.7/dist-packages/Bio/GenBank/Scanner.py", line 434, in feed
    self._feed_feature_table(consumer, self.parse_features(skip=False))
  File "/usr/lib/python2.7/dist-packages/Bio/GenBank/Scanner.py", line 159, in parse_features
    raise ValueError("Premature end of line during features table")

Link to the same question on stackoverflow.com: https://stackoverflow.com/questions/47792217/seqio-parse-python-premature-end-of-line-during-features-table-solved-answer-i


Cross-posted on Stack Overflow.

We discourage simultaneously cross-posting identical questions on multiple sites.

This duplicates the effort of the answerers (they can't see that a question has already been answered elsewhere).

It also spreads out the answers, which makes it harder for other users to follow the thread.

written 15 months ago by Sej Modha (4.1k)

Sorry about that, but responses here tend to arrive more slowly than on Stack Overflow, so I posted there too. I have added a link to the other thread in each topic and edited both titles to mark them as solved.

written 15 months ago by felipelira3 (0)
15 months ago by
a.zielezinski8.6k
a.zielezinski8.6k wrote:

Philipp Bayer is right - remember to close all the files you open in the script.

This will do the trick:

from Bio import SeqIO

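genbank_files = ['GCF_000302915.1_Pav631_1.0_genomic.gbff']
for genbank_file in genbank_files:
    # The with statement closes both handles, even if parsing raises an error
    with open(genbank_file) as fh, open(genbank_file + '.fasta', 'w') as oh:
        for seq_record in SeqIO.parse(fh, 'genbank'):
            # Write each GenBank record out as a FASTA entry
            oh.write(seq_record.format('fasta'))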
15 months ago by
Philipp Bayer6.0k
Australia/Perth/UWA
Philipp Bayer6.0k wrote:

Normally this should work (and it does on my system). Are you writing to genbank_file earlier in the script? Perhaps you haven't closed that file handle yet, so the buffered writes haven't been flushed to disk by the time SeqIO.parse reads the file?
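
The buffering effect suggested here can be reproduced with the standard library alone. The sketch below is only an illustration, not the original poster's script: the file name demo.gbff, the record text and the helper names read_back and demo are all made up for the example. It writes a small file, reads it back through a second handle before and after closing the writer, and compares what each read sees. A parser reading the file before the close would likely hit a truncated record, which is consistent with the "Premature end of line" error in the question.

```python
import os
import tempfile


def read_back(path):
    """Return what is currently on disk, using a separate read handle."""
    with open(path) as handle:
        return handle.read()


def demo():
    # Hypothetical stand-in for a small downloaded GenBank record.
    data = "LOCUS       DEMO\nFEATURES             Location/Qualifiers\n//\n"
    tmp_dir = tempfile.mkdtemp()
    path = os.path.join(tmp_dir, "demo.gbff")

    out = open(path, "w")
    out.write(data)
    # The text is most likely still sitting in Python's write buffer,
    # so a second handle sees an empty or truncated file here.
    before_close = read_back(path)

    out.close()  # closing flushes the buffer to disk
    after_close = read_back(path)

    # Clean up the temporary file and directory.
    os.remove(path)
    os.rmdir(tmp_dir)
    return data, before_close, after_close


data, before_close, after_close = demo()
print(len(before_close), len(after_close), len(data))
```

How much of the data is visible before the close depends on the buffer size and the amount written, so the exact point of truncation in a real genome file will vary. Closing the handle (or using a with block) before parsing removes the uncertainty.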
